<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-us"><generator uri="https://gohugo.io/" version="0.157.0">Hugo</generator><title type="html">baby steps</title><link href="https://smallcultfollowing.com/babysteps/" rel="alternate" type="text/html" title="html"/><link href="https://smallcultfollowing.com/babysteps/index.xml" rel="alternate" type="application/rss+xml" title="rss"/><link href="https://smallcultfollowing.com/babysteps/atom.xml" rel="self" type="application/atom+xml" title="atom"/><updated>2026-02-27T10:46:31+00:00</updated><author><name>Niko Matsakis</name><email>rust@nikomatsakis.com</email></author><id>https://smallcultfollowing.com/babysteps/</id><entry><title type="html">How Dada enables internal references</title><link href="https://smallcultfollowing.com/babysteps/blog/2026/02/27/dada-internal-references/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2026/02/27/dada-internal-references/</id><published>2026-02-27T00:00:00+00:00</published><updated>2026-02-27T05:20:38-05:00</updated><content type="html"><![CDATA[<p>In my previous Dada blog post, I talked about how Dada enables composable sharing. Today I&rsquo;m going to start diving into Dada&rsquo;s <em>permission</em> system; permissions are Dada&rsquo;s equivalent to Rust&rsquo;s borrow checker.</p>
<h2 id="goal-richer-place-based-permissions">Goal: richer, place-based permissions</h2>
<p>Dada aims to exceed Rust&rsquo;s capabilities by using place-based permissions. Dada lets you write functions and types that capture both a <em>value</em> and <em>things borrowed from that value</em>.</p>
<p>As a fun example, imagine you are writing some Rust code to process a comma-separated list, just looking for entries whose length is greater than 5:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">list</span>: <span class="nb">String</span> <span class="o">=</span><span class="w"> </span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;...something big, with commas...&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">items</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="kt">str</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">list</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="s">&#34;,&#34;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">trim</span><span class="p">())</span><span class="w"> </span><span class="c1">// strip whitespace
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">filter</span><span class="p">(</span><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">5</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>One of the cool things about Rust is how this code looks a lot like some high-level language like Python or JavaScript, but in those languages the <code>split</code> call is going to be doing a lot of work, since it will have to allocate tons of small strings, copying out the data. But in Rust the <code>&amp;str</code> values are just pointers into the original string and so <code>split</code> is very cheap. I love this.</p>
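<p>To see concretely that the <code>&amp;str</code> items are views into the original buffer, here is a small check. It is only a sketch: the helper names <code>points_into</code> and <code>long_items</code> are made up for illustration, and <code>long_items</code> just wraps the pipeline above in a function.</p>

```rust
/// Hypothetical helper: true if the bytes of `item` lie inside `list`.
fn points_into(list: &str, item: &str) -> bool {
    let start = list.as_ptr() as usize;
    let end = start + list.len();
    let item_start = item.as_ptr() as usize;
    start <= item_start && item_start + item.len() <= end
}

/// The same split/trim/filter pipeline as above, wrapped in a function.
fn long_items(list: &str) -> Vec<&str> {
    list.split(",")
        .map(|s| s.trim()) // strip whitespace
        .filter(|s| s.len() > 5)
        .collect()
}
```

<p>Every item returned by <code>long_items(&amp;list)</code> satisfies <code>points_into(&amp;list, item)</code>, whereas a freshly allocated <code>String</code> with the same text does not.</p>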
<p>On the other hand, suppose you want to package up some of those values, along with the backing string, and send them to another thread to be processed. You might think you can just make a struct like so&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Message</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">list</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">items</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="kt">str</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         ----
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// goal is to hold a reference
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// to strings from list
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;and then create the list and items and store them into it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">list</span>: <span class="nb">String</span> <span class="o">=</span><span class="w"> </span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;...something big, with commas...&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">items</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="kt">str</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="cm">/* as before */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">message</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Message</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">list</span><span class="p">,</span><span class="w"> </span><span class="n">items</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                      ----
</span></span></span><span class="line"><span class="cl"><span class="c1">//                        |
</span></span></span><span class="line"><span class="cl"><span class="c1">// This *moves* `list` into the struct.
</span></span></span><span class="line"><span class="cl"><span class="c1">// That in turn invalidates `items`, which 
</span></span></span><span class="line"><span class="cl"><span class="c1">// is borrowed from `list`, so there is no
</span></span></span><span class="line"><span class="cl"><span class="c1">// way to construct `Message`.
</span></span></span></code></pre></div><p>But as experienced Rustaceans know, this will not work. While borrowed data like an <code>&amp;str</code> is live, the value it was borrowed from cannot be moved. If you want to handle a case like this, you need to give up on <code>&amp;str</code> and send indices, owned strings, or some other substitute instead. Argh!</p>
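<p>For reference, the &ldquo;indices&rdquo; workaround might look roughly like the sketch below. This is one possible shape under my own assumptions, not a canonical pattern: the constructor <code>Message::new</code> and the accessor <code>items()</code> are hypothetical names, and the struct stores byte ranges into <code>list</code> in place of <code>&amp;str</code> borrows.</p>

```rust
use std::ops::Range;

/// Hypothetical workaround: own the backing string and store byte ranges
/// into it instead of `&str` borrows, so the struct has no lifetime.
struct Message {
    list: String,
    items: Vec<Range<usize>>,
}

impl Message {
    fn new(list: String) -> Message {
        let mut items = Vec::new();
        let mut start = 0; // byte offset of the current entry within `list`
        for entry in list.split(',') {
            let trimmed = entry.trim();
            if trimmed.len() > 5 {
                // Number of leading whitespace bytes removed by `trim`.
                let lead = entry.len() - entry.trim_start().len();
                items.push(start + lead..start + lead + trimmed.len());
            }
            start += entry.len() + 1; // skip past the entry and its comma
        }
        Message { list, items }
    }

    /// Re-borrow the entries on demand from the owned string.
    fn items(&self) -> impl Iterator<Item = &str> + '_ {
        self.items.iter().map(move |r| &self.list[r.clone()])
    }
}
```

<p>A <code>Message</code> built this way can be moved into <code>std::thread::spawn</code>, but every consumer has to go through <code>items()</code> to get the strings back, which is exactly the kind of ceremony I would like to avoid.</p>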
<h2 id="dadas-permissions-use-places-not-lifetimes">Dada&rsquo;s permissions use <em>places</em>, not <em>lifetimes</em></h2>
<p>Dada does things a bit differently. The first thing is that, when you create a reference, the resulting type names the <em>place that the data was borrowed from</em>, not the <em>lifetime of the reference</em>. So the type annotation for <code>items</code> would say <code>ref[list] String</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> (at least, if you wanted to write out the full details rather than leaving it to the type inferencer):</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">let list: given String = &#34;...something big, with commas...&#34;
let items: given Vec[ref[list] String] = list
    .split(&#34;,&#34;)
    .map(_.trim()) // strip whitespace
    .filter(_.len() &gt; 5)
    //      ------- I *think* this is the syntax I want for closures?
    //              I forget what I had in mind, it&#39;s not implemented.
    .collect()</code></pre>
<p>I&rsquo;ve blogged before about <a href="https://smallcultfollowing.com/babysteps/blog/2024/03/04/borrow-checking-without-lifetimes/">how I would like to redefine lifetimes in Rust to be places</a> as I feel that a type like <code>ref[list] String</code> is much easier to teach and explain: instead of having to explain that a lifetime references some part of the code, or what have you, you can say that &ldquo;this is a <code>String</code> that references the variable <code>list</code>&rdquo;.</p>
<p>But what&rsquo;s also cool is that named places open the door to more flexible borrows. In Dada, if you wanted to package up the list and the items, you could build a <code>Message</code> type like so:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">class Message(
    list: String
    items: Vec[ref[self.list] String]
    //             ---------
    //   Borrowed from another field!
)

// As before:
let list: String = &#34;...something big, with commas...&#34;
let items: Vec[ref[list] String] = list
    .split(&#34;,&#34;)
    .map(_.trim()) // strip whitespace
    .filter(_.len() &gt; 5)
    .collect()

// Create the message, this is the fun part!
let message = Message(list.give, items.give)</code></pre>
<p>Note that last line &ndash; <code>Message(list.give, items.give)</code>. We can create a new class and move <code>list</code> into it <em>along with</em> <code>items</code>, which borrows from <code>list</code>. Neat, right?</p>
<p>OK, so let&rsquo;s back up and talk about how this all works.</p>
<h2 id="references-in-dada-are-the-default">References in Dada are the default</h2>
<p>Let&rsquo;s start with syntax. Before we tackle the <code>Message</code> example, I want to go back to the <code>Character</code> example from previous posts, because it&rsquo;s a bit easier for explanatory purposes. Here is some Rust code that declares a struct <code>Character</code>, creates an owned copy of it, and then gets a few references into it.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Character</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">name</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">class</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">hp</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ch</span>: <span class="nc">Character</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Character</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">name</span>: <span class="nc">format</span><span class="o">!</span><span class="p">(</span><span class="s">&#34;Ferris&#34;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">class</span>: <span class="nc">format</span><span class="o">!</span><span class="p">(</span><span class="s">&#34;Rustacean&#34;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">hp</span>: <span class="mi">22</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="nc">Character</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">ch</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="nb">String</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">p</span><span class="p">.</span><span class="n">name</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>The Dada equivalent to this code is as follows:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">class Character(
    name: String,
    klass: String,
    hp: u32,
)

let ch: Character = Character(&#34;Tzara&#34;, &#34;Dadaist&#34;, 22)
let p: ref[ch] Character = ch
let q: ref[p] String = p.name</code></pre>
<p>The first thing to note is that, in Dada, the <strong>default</strong> when you name a variable or a place is to create a reference. So <code>let p = ch</code> doesn&rsquo;t move <code>ch</code>, as it would in Rust, it creates a reference to the <code>Character</code> stored in <code>ch</code>. You could also explicitly write <code>let p = ch.ref</code>, but that is not preferred. Similarly, <code>let q = p.name</code> creates a reference to the value in the field <code>name</code>. (If you wanted to <em>move</em> the character, you would write <code>let ch2 = ch.give</code>, not <code>let ch2 = ch</code> as in Rust.)</p>
<p>Notice that I said <code>let p = ch</code> &ldquo;creates a reference to the <code>Character</code> stored in <code>ch</code>&rdquo;. In particular, I did <em>not</em> say &ldquo;creates a reference to <code>ch</code>&rdquo;. That&rsquo;s a subtle choice of wording, but it has big implications.</p>
<h2 id="references-in-dada-are-not-pointers">References in Dada are not pointers</h2>
<p>The reason I wrote that <code>let p = ch</code> &ldquo;creates a reference to the <code>Character</code> stored in <code>ch</code>&rdquo; and not &ldquo;creates a reference to <code>ch</code>&rdquo; is because, in Dada, <em>references are not pointers</em>. Rather, they are shallow copies of the value, very much like how we saw in the previous post that a <code>shared Character</code> <em>acts</em> like an <code>Arc&lt;Character&gt;</code> but is represented as a shallow copy.</p>
<p>So where in Rust the following code&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ch</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Character</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">ch</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">q</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">ch</span><span class="p">.</span><span class="n">name</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;looks like this in memory&hellip;</p>
<pre tabindex="0"><code>        # Rust memory representation

            Stack                       Heap
            ─────                       ────

┌───► ch: Character {
│ ┌───► name: String {
│ │         buffer: ───────────► &#34;Ferris&#34;
│ │         length: 6
│ │         capacity: 12
│ │     },
│ │     ...
│ │   }
│ │   
└──── p
  │
  └── q
</code></pre><p>in Dada, code like this</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">let ch = Character(...)
let p = ch
let q = ch.name</code></pre>
<p>would look like so</p>
<pre tabindex="0"><code># Dada memory representation

Stack                       Heap
─────                       ────

ch: Character {
    name: String {
            buffer: ───────┬───► &#34;Ferris&#34;
            length: 6      │
            capacity: 12   │
    },                     │
    ...                    │
}                          │
                           │
p: Character {             │
    name: String {         │
            buffer: ───────┤
            length: 6      │
            capacity: 12   │
    },                     │
    ...                    │
}                          │
                           │
q: String {                │
    buffer: ───────────────┘
    length: 6
    capacity: 12
}
</code></pre><p>Clearly, the Dada representation takes up more memory on the stack. But note that it <em>doesn&rsquo;t</em> duplicate the memory in the heap, which tends to be where the vast majority of the data is found.</p>
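<p>If it helps to think in Rust terms, the &ldquo;reference as shallow copy&rdquo; idea can be approximated with unsafe code. To be clear, this is an illustration of the memory picture and not how Dada is implemented: <code>ShallowRef</code> is a hypothetical type, and the safety condition that Rust leaves to the caller is the part Dada&rsquo;s permissions are meant to check.</p>

```rust
use std::mem::ManuallyDrop;

/// Hypothetical stand-in for a Dada `ref String`: a bitwise copy of the
/// `String` header (buffer, length, capacity) that never frees the buffer.
struct ShallowRef(ManuallyDrop<String>);

impl ShallowRef {
    /// # Safety
    /// The caller must keep the owning `String` alive and unmodified for as
    /// long as the copy is used. Rust cannot check that here.
    unsafe fn new(owner: &String) -> ShallowRef {
        // Copy the three header words; the heap buffer is shared, not cloned.
        ShallowRef(ManuallyDrop::new(unsafe { std::ptr::read(owner) }))
    }

    fn as_str(&self) -> &str {
        self.0.as_str()
    }
}
```

<p>Because the copy carries its own buffer pointer, moving the owning <code>String</code> to another variable leaves the copy pointing at the same heap data.</p>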
<h2 id="dada-talks-about-values-not-references">Dada talks about <em>values</em> not <em>references</em></h2>
<p>This gets at something important. Rust, like C, makes pointers first-class. So given <code>x: &amp;String</code>, <code>x</code> refers to <em>the pointer</em> and <code>*x</code> refers to its referent, the <code>String</code>.</p>
<p>Dada, like Java, goes another way. <code>x: ref String</code> <em>is</em> a <code>String</code> value &ndash; including in memory representation! The difference between a <code>given String</code>, <code>shared String</code>, and <code>ref String</code> is not in their memory layout, which is the same for all three, but in whether they <strong>own their contents</strong>.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>So in Dada, there is no <code>*x</code> operation to go from &ldquo;pointer&rdquo; to &ldquo;referent&rdquo;. That doesn&rsquo;t make sense. Your variable always contains a string, but the permissions you have to use that string will change.</p>
<p>In fact, the goal is that people <em>don&rsquo;t</em> have to learn the memory representation as they learn Dada. You are supposed to be able to think of Dada variables as if they were all objects on the heap, just like in Java or Python, even though in fact they are stored on the stack.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<h2 id="rust-does-not-permit-moves-of-borrowed-data">Rust does not permit moves of borrowed data</h2>
<p>In Rust, you cannot move values while they are borrowed. So if you have code like this that moves <code>ch</code> into <code>ch1</code>&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ch</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Character</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">ch</span><span class="p">.</span><span class="n">name</span><span class="p">;</span><span class="w"> </span><span class="c1">// create reference
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ch1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ch</span><span class="p">;</span><span class="w">        </span><span class="c1">// moves `ch`
</span></span></span></code></pre></div><p>&hellip;then this code only compiles if <code>name</code> is not used again:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ch</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Character</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">ch</span><span class="p">.</span><span class="n">name</span><span class="p">;</span><span class="w"> </span><span class="c1">// create reference
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ch1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ch</span><span class="p">;</span><span class="w">        </span><span class="c1">// ERROR: cannot move while borrowed
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">name1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">name</span><span class="p">;</span><span class="w">    </span><span class="c1">// use reference again
</span></span></span></code></pre></div><h2 id="but-dada-can">&hellip;but Dada can</h2>
<p>There are two reasons that Rust forbids moves of borrowed data:</p>
<ul>
<li>References are pointers, so those pointers may become invalidated. In the example above, <code>name</code> points to the stack slot for <code>ch</code>, so if <code>ch</code> were to be moved into <code>ch1</code>, that makes the reference invalid.</li>
<li>The type system would lose track of things. Internally, the Rust borrow checker has a kind of &ldquo;indirection&rdquo;. It knows that <code>ch</code> is borrowed for some span of the code (a &ldquo;lifetime&rdquo;), and it knows that the lifetime in the type of <code>name</code> is related to that lifetime, but it doesn&rsquo;t really know that <code>name</code> is borrowed from <code>ch</code> in particular.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></li>
</ul>
<p>Neither of these applies to Dada:</p>
<ul>
<li>Because references are not pointers into the stack, but rather shallow copies, moving the borrowed value doesn&rsquo;t invalidate their contents. They remain valid.</li>
<li>Because Dada&rsquo;s types reference actual variable names, we can modify them to reflect moves.</li>
</ul>
<h2 id="dada-tracks-moves-in-its-types">Dada tracks moves in its types</h2>
<p>OK, let&rsquo;s revisit that Rust example that was giving us an error. When we convert it to Dada, we find that it type checks just fine:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">class Character(...) // as before
let ch: given Character = Character(...)
let name: ref[ch.name] String = ch.name
//            -- originally it was borrowed from `ch`
let ch1 = ch.give
//        ------- but `ch` was moved to `ch1`
let name1: ref[ch1.name] String = name
//             --- now it is borrowed from `ch1`</code></pre>
<p>Woah, neat! We can see that when we move from <code>ch</code> into <code>ch1</code>, the compiler updates the types of the variables around it. So actually the type of <code>name</code> changes to <code>ref[ch1.name] String</code>. And then when we move from <code>name</code> to <code>name1</code>, that&rsquo;s totally valid.</p>
<p>In PL land, updating the type of a variable from one thing to another is called a &ldquo;strong update&rdquo;. Obviously things can get a bit complicated when control-flow is involved, e.g., in a situation like this:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">let ch = Character(...)
let ch1 = Character(...)
let name = ch.name
if some_condition_is_true() {
    // On this path, the type of `name` changes
    // to `ref[ch1.name] String`, and so `ch`
    // is no longer considered borrowed.
    ch1 = ch.give
    ch = Character(...) // not borrowed, we can mutate
} else {
    // On this path, the type of `name`
    // remains unchanged, and `ch` is borrowed.
}
// Here, the types are merged, so the
// type of `name` is `ref[ch.name, ch1.name] String`.
// Therefore, `ch` is considered borrowed here.</code></pre>
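<p>Here is a toy Rust model of that bookkeeping, just to make the rename-then-merge step concrete. It is a sketch under heavy assumptions and is not taken from the Dada type checker: places are plain dotted strings, and <code>Places</code>, <code>rename_place</code>, <code>after_move</code>, and <code>merge</code> are all names I made up for this example.</p>

```rust
use std::collections::BTreeSet;

/// Toy stand-in for the places inside `ref[...]`, e.g. {"ch.name"}.
type Places = BTreeSet<String>;

/// Rewrite `place` so that a leading variable `from` becomes `to`.
fn rename_place(place: &str, from: &str, to: &str) -> String {
    match place.strip_prefix(from) {
        // Match whole path segments only, so `ch` does not match `ch1`.
        Some(rest) if rest.is_empty() || rest.starts_with('.') => format!("{to}{rest}"),
        _ => place.to_string(),
    }
}

/// Strong update: after `to = from.give`, borrows of `from` follow the value.
fn after_move(places: &Places, from: &str, to: &str) -> Places {
    places.iter().map(|p| rename_place(p, from, to)).collect()
}

/// Join point: the merged type may borrow from either branch's places.
fn merge(a: &Places, b: &Places) -> Places {
    a.union(b).cloned().collect()
}
```

<p>Starting from <code>{"ch.name"}</code>, the <code>if</code> branch yields <code>after_move(&amp;name, "ch", "ch1")</code>, the <code>else</code> branch leaves the set alone, and <code>merge</code> of the two gives <code>{"ch.name", "ch1.name"}</code>, matching the comment at the end of the Dada snippet.</p>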
<h2 id="renaming-lets-us-call-functions-with-borrowed-values">Renaming lets us call functions with borrowed values</h2>
<p>OK, let&rsquo;s take the next step. Let&rsquo;s define a Dada function that takes an owned value and another value borrowed from it, like the name, and then call it:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">fn character_and_name(
    ch1: given Character,
    name1: ref[ch1.name] String,
) {
    // ... does something ...
}</code></pre>
<p>We could call this function like so, as you might expect:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">let ch = Character(...)
let name = ch.name
character_and_name(ch.give, name)</code></pre>
<p>So&hellip;how does this work? Internally, the type checker type-checks a function call by creating a simpler snippet of code, essentially, and then type-checking <em>that</em>. It&rsquo;s like desugaring but only at type-check time. In this simpler snippet, there is a series of <code>let</code> statements that create a temporary variable for each argument. These temporaries always have an explicit type taken from the function signature, and they are initialized with the values of each argument:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">// type checker &#34;desugars&#34; `character_and_name(ch.give, name)`
// into more primitive operations:
let tmp1: given Character = ch.give
    //    ---------------   -------
    //            |         taken from the call
    //    taken from fn sig
let tmp2: ref[tmp1.name] String = name
    //    ---------------------   ----
    //            |         taken from the call
    //    taken from fn sig,
    //    but rewritten to use the new
    //    temporaries</code></pre>
<p>If this type checks, then the type checker knows you have supplied values of the required types, and so this is a valid call. Of course there are a few more steps, but that&rsquo;s the basic idea.</p>
<p>Notice what happens if you supply data borrowed from the wrong place:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">let ch = Character(...)
let ch1 = Character(...)
character_and_name(ch.give, ch1.name)
//                          --- wrong place!</code></pre>
<p>This will fail to type check because you get:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">let tmp1: given Character = ch.give
let tmp2: ref[tmp1.name] String = ch1.name
    //                            --------
    //       has type `ref[ch1.name] String`,
    //       not `ref[tmp1.name] String`</code></pre>
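<p>The two cases above can be mimicked with a few lines of Rust. Again this is only a toy under my own assumptions, not the real checker: places are dotted strings, the temporary is always called <code>tmp1</code>, the signature is assumed to name a place like <code>ch1.name</code> (as in the desugared snippets), and the function names are hypothetical.</p>

```rust
/// Rewrite `place` so that a leading variable `from` becomes `to`.
fn rename_place(place: &str, from: &str, to: &str) -> String {
    match place.strip_prefix(from) {
        // Match whole path segments only, so `ch` does not match `ch1`.
        Some(rest) if rest.is_empty() || rest.starts_with('.') => format!("{to}{rest}"),
        _ => place.to_string(),
    }
}

/// Toy check for a call `f(owner.give, borrowed)` against a signature
/// `fn f(param: given T, other: ref[sig_place] U)`.
fn borrowed_arg_type_checks(
    param: &str,     // parameter name in the signature, e.g. "ch1"
    sig_place: &str, // place named in the signature, e.g. "ch1.name"
    owner: &str,     // caller variable given as the first argument, e.g. "ch"
    arg_place: &str, // place the second argument is borrowed from
) -> bool {
    let tmp = "tmp1";
    // `let tmp1: given T = owner.give` moves `owner`, so borrows follow it.
    let actual = rename_place(arg_place, owner, tmp);
    // The type from the signature is rewritten to mention the temporary.
    let expected = rename_place(sig_place, param, tmp);
    actual == expected
}
```

<p>With these definitions, <code>borrowed_arg_type_checks("ch1", "ch1.name", "ch", "ch.name")</code> accepts the first call, while passing the caller&rsquo;s <code>"ch1.name"</code> as the last argument is rejected because nothing renames it to <code>tmp1.name</code>.</p>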
<h2 id="class-constructors-are-just-special-functions">Class constructors are &ldquo;just&rdquo; special functions</h2>
<p>So now, if we go all the way back to our original example, we can see how the <code>Message</code> example worked:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share,ref" data-dada-types="String,Point,Vec,Map,Character,u32">class Message(
    list: String
    items: Vec[ref[self.list] String]
)</code></pre>
<p>Basically, when you construct a <code>Message(list.give, items.give)</code>, that&rsquo;s &ldquo;just another function call&rdquo; from the type system&rsquo;s perspective, except that <code>self</code> in the signature is handled carefully.</p>
<h2 id="this-is-modeled-not-implemented">This is modeled, not implemented</h2>
<p>I should be clear, this system is modeled in the <a href="https://github.com/dada-lang/dada-model/">dada-model</a> repository, which implements a kind of &ldquo;mini Dada&rdquo; that captures what I believe to be the most interesting bits. I&rsquo;m working on fleshing out that model a bit more, but it&rsquo;s got most of what I showed you here.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> For example, <a href="https://github.com/dada-lang/dada-model/blob/b6833b57af8f0b293755410760c240b75fbf4998/src/type_system/tests/new_with_self_references.rs#L61-L99">here is a test</a> that you get an error when you give a reference to the wrong value.</p>
<p>The &ldquo;real implementation&rdquo; is lagging quite a bit, and doesn&rsquo;t really handle the interesting bits yet. Scaling it up from model to real implementation involves solving type inference and some other thorny challenges, and I haven&rsquo;t gotten there yet &ndash; though I have some pretty interesting experiments going on there too, in terms of the compiler architecture.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></p>
<h2 id="this-could-apply-to-rust">This could apply to Rust</h2>
<p>I believe we could apply most of this system to Rust. Obviously we&rsquo;d have to rework the borrow checker to be based on places, but that&rsquo;s the straightforward part. The harder bit is the fact that <code>&amp;T</code> is a pointer in Rust, and that we cannot readily change. However, for many use cases of self-references, this isn&rsquo;t as important as it sounds. Often, the data you wish to reference is living in the heap, and so the pointer isn&rsquo;t actually invalidated when the original value is moved.</p>
<p>Consider our opening example. You might imagine Rust allowing something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Message</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">list</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">items</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="p">{</span><span class="bp">self</span><span class="p">.</span><span class="n">list</span><span class="p">}</span><span class="w"> </span><span class="kt">str</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In this case, the <code>str</code> data is heap-allocated, so moving the string doesn&rsquo;t actually invalidate the <code>&amp;str</code> value (it <em>would</em> invalidate an <code>&amp;String</code> value, interestingly).</p>
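<p>To make the heap-allocation point concrete, here is a minimal sketch in today&rsquo;s Rust (not the hypothetical syntax above). The input string is made up for illustration. It records the address of the string&rsquo;s heap buffer, moves the <code>String</code> into a <code>Box</code>, and checks that the buffer is still at the same address:</p>
<pre><code class="language-rust" data-lang="rust">fn main() {
    // Hypothetical input; any non-empty string works.
    let list = String::from("alpha,bravo,charlie");

    // Address of the heap buffer that holds the `str` data.
    let before = list.as_ptr();

    // Move the `String` into a box. Its (pointer, length, capacity) triple
    // is copied to a new location, but the heap buffer is not touched.
    let moved = Box::new(list);
    let after = moved.as_ptr();

    // The `str` data did not move, even though the `String` value did.
    assert_eq!(before, after);
    println!("str data stayed at {:p}", after);
}
</code></pre>
<p>The address of the <code>String</code> value itself does change under the move, which is why a reference to the <code>String</code> (as opposed to its <code>str</code> contents) would be invalidated.</p>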
<p>In Rust today, the compiler doesn&rsquo;t know all the details of what&rsquo;s going on. <code>String</code> has a <code>Deref</code> impl and so it&rsquo;s quite opaque whether <code>str</code> is heap-allocated or not. But we are working on various changes to this system in the <a href="https://rust-lang.github.io/rust-project-goals/2026/roadmap-beyond-the-ampersand.html">Beyond the <code>&amp;</code></a> goal, most notably the <a href="https://rust-lang.github.io/rust-project-goals/2026/field-projections.html">Field Projections</a> work. There is likely some opportunity to address this in that context, though to be honest I&rsquo;m behind in catching up on the details.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I&rsquo;ll note in passing that Dada unifies <code>str</code> and <code>String</code> into one type as well. I&rsquo;ll talk in detail about how that works in a future blog post.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>This is <em>kind</em> of like C++ references (e.g., <code>String&amp;</code>), which also act &ldquo;as if&rdquo; they were a value (i.e., you write <code>s.foo()</code>, not <code>s-&gt;foo()</code>), but a C++ reference is truly a pointer, unlike a Dada ref.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>This goal was in part inspired by a conversation I had early on within Amazon, where a (quite experienced) developer told me, &ldquo;It took me months to understand what variables are in Rust&rdquo;.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I explained this some years back in a <a href="https://www.youtube.com/watch?v=_agDeiWek8w">talk on Polonius at Rust Belt Rust</a>, if you&rsquo;d like more detail.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>No closures or iterator chains!&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>As a teaser, I&rsquo;m building it in async Rust, where each inference variable is a &ldquo;future&rdquo; and I use &ldquo;await&rdquo; to find out when other parts of the code might have added constraints.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dada" term="dada" label="Dada"/></entry><entry><title type="html">What it means that Ubuntu is using Rust</title><link href="https://smallcultfollowing.com/babysteps/blog/2026/02/23/ubuntu-rustnation/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2026/02/23/ubuntu-rustnation/</id><published>2026-02-23T00:00:00+00:00</published><updated>2026-02-23T08:14:05-05:00</updated><content type="html"><![CDATA[<p>Righty-ho, I&rsquo;m back from Rust Nation, and busily horrifying my teenage daughter with my (admittedly atrocious) attempts at doing an English accent<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. It was a great trip with a lot of good conversations and some interesting observations. I am going to try to blog about some of them, starting with some thoughts spurred by Jon Seager&rsquo;s closing keynote, &ldquo;Rust Adoption At Scale with Ubuntu&rdquo;.</p>
<h2 id="there-are-many-chasms-out-there">There are many chasms out there</h2>
<p>For some time now I&rsquo;ve been debating with myself, has Rust <a href="https://en.wikipedia.org/wiki/Crossing_the_Chasm">&ldquo;crossed the chasm&rdquo;</a>? If you&rsquo;re not familiar with that term, it comes from a book that gives a kind of &ldquo;pop-sci&rdquo; introduction to the <a href="https://en.wikipedia.org/wiki/Technology_adoption_life_cycle">Technology Adoption Life Cycle</a>.</p>
<p>The answer, of course, is <em>it depends on who you ask</em>. Within Amazon, where I have the closest view, the answer is that we are &ldquo;most of the way across&rdquo;: Rust is squarely established as the right way to build at-scale data planes or resource-aware agents and it is increasingly seen as the right choice for low-level code in devices and robotics as well &ndash; but there remains a lingering perception that Rust is useful for &ldquo;those fancy pants developers at S3&rdquo; (or wherever) but a bit overkill for more average development<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>.</p>
<p>On the other hand, within the realm of Safety Critical Software, as Pete LeVasseur wrote in a <a href="https://blog.rust-lang.org/2026/01/14/what-does-it-take-to-ship-rust-in-safety-critical/">recent rust-lang blog post</a>, Rust is still scrabbling for a foothold. There are a number of successful products but most of the industry is in a &ldquo;wait and see&rdquo; mode, letting the early adopters pave the path.</p>
<h2 id="crossing-the-chasm-means-finding-reference-customers">&ldquo;Crossing the chasm&rdquo; means finding &ldquo;reference customers&rdquo;</h2>
<p>The big idea that I at least took away from reading <a href="https://en.wikipedia.org/wiki/Crossing_the_Chasm">Crossing the Chasm</a> and other references on the <a href="https://en.wikipedia.org/wiki/Technology_adoption_life_cycle">technology adoption life cycle</a> is the need for &ldquo;reference customers&rdquo;. When you first start out with something new, you are looking for pioneers and early adopters that are drawn to new things:</p>
<blockquote>
<p>What an early adopter is buying [..] is some kind of <em>change agent</em>. By being the first to implement this change in the industry, the early adopters expect to get a jump on the competition. &ndash; from <em>Crossing the Chasm</em></p>
</blockquote>
<p>But as your technology matures, you have to convince people with a lower and lower tolerance for risk:</p>
<blockquote>
<p>The early majority want to buy a <em>productivity improvement</em> for existing operations. They are looking to minimize discontinuity with the old ways. They want evolution, not revolution. &ndash; from <em>Crossing the Chasm</em></p>
</blockquote>
<p>So what is <em>most convincing</em> to people to try something new? The answer is seeing that others like them have succeeded.</p>
<p>You can see this at play in both the Amazon example and the Safety Critical Software example. Clearly seeing Rust used for network services doesn&rsquo;t mean it&rsquo;s ready to be used in your car&rsquo;s steering column<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. And even within network services, seeing a group like S3 succeed with Rust may convince other groups building at-scale services to try Rust, but doesn&rsquo;t necessarily persuade a team to use Rust for their next CRUD service. And frankly, it shouldn&rsquo;t! They are likely to hit obstacles.</p>
<h2 id="ubuntu-is-helping-rust-cross-the-user-land-linux-chasm">Ubuntu is helping Rust &ldquo;cross the (user-land linux) chasm&rdquo;</h2>
<p>All of this was on my mind as I watched the keynote by Jon Seager, the VP of Engineering at Canonical, which is the company behind Ubuntu. Similar to Lars Bergstrom&rsquo;s <a href="https://www.youtube.com/watch?v=QrrH2lcl9ew">epic keynote from years past</a> on Rust adoption within Google, Jon laid out a pitch for why Canonical is adopting Rust that was at once <strong>visionary</strong> and yet <strong>deeply practical</strong>.</p>
<p>&ldquo;Visionary and yet deeply practical&rdquo; is pretty much the textbook description of what we need to cross from <em>early adopters</em> to <em>early majority</em>. We need folks who care first and foremost about delivering the right results, but are open to new ideas that might help them do that better; folks who can stand on both sides of the chasm at once.</p>
<p>Jon described how Canonical focuses their own development on a small set of languages: Python, C/C++, and Go, and how they had recently brought in Rust and were using it as the language of choice for new <a href="https://smallcultfollowing.com/babysteps/blog/2025/03/10/rust-2025-intro/">foundational efforts</a>, replacing C, C++, and (some uses of) Python.</p>
<h2 id="ubuntu-is-building-the-bridge-across-the-chasm">Ubuntu is building the bridge across the chasm</h2>
<p>Jon talked about how he sees it as part of Ubuntu&rsquo;s job to &ldquo;pay it forward&rdquo; by supporting the construction of memory-safe foundational utilities. Jon meant support both in terms of finances &ndash; Canonical is sponsoring the <a href="https://trifectatech.org/">Trifecta Tech Foundation</a> to develop <a href="https://github.com/trifectatechfoundation/sudo-rs">sudo-rs</a> and <a href="https://github.com/pendulum-project/ntpd-rs">ntpd-rs</a> and sponsoring the <a href="https://github.com/uutils/">uutils org&rsquo;s</a> work on <a href="https://uutils.github.io/coreutils/">coreutils</a> &ndash; and in terms of reputation. Ubuntu can take on the risk of doing something new, prove that it works, and then let others benefit.</p>
<p>Remember how the Crossing the Chasm book described early majority people? They are &ldquo;looking to minimize discontinuity with the old ways&rdquo;. And what better way to do that than to have drop-in utilities that fit within their existing workflows.</p>
<h2 id="the-challenge-for-rust-listening-to-these-new-adopters">The challenge for Rust: listening to these new adopters</h2>
<p>With new adoption comes new perspectives. On Thursday night I was at dinner<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> organized by Ernest Kissiedu<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>. Jon Seager was there along with some other Rust adopters from various industries, as were a few others from the Rust Foundation and the open-source project.</p>
<p>Ernest asked them to give us their unvarnished takes on Rust. Jon made the provocative comment that we needed to revisit our policy around having a small standard library. He&rsquo;s not the first to say something like that; it&rsquo;s something we&rsquo;ve been hearing for years and years &ndash; and I think he&rsquo;s right! Though I don&rsquo;t think the answer is just to ship a big standard library. In fact, it&rsquo;s kind of a perfect lead-in to (what I hope will be) my next blog post, which is about a project I call &ldquo;battery packs&rdquo;<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>.</p>
<h2 id="to-grow-you-have-to-change">To grow, you have to change</h2>
<p>The broader point though is that shifting from targeting &ldquo;pioneers&rdquo; and &ldquo;early adopters&rdquo; to targeting &ldquo;early majority&rdquo; sometimes involves some uncomfortable changes:</p>
<blockquote>
<p>Transition between any two adoption segments is normally excruciatingly awkward because you must adopt new strategies just at the time you have become most comfortable with the old ones. [..] The situation can be further complicated if the high-tech company, fresh from its marketing success with visionaries, neglects to change its sales pitch. [..] <strong>The company may be saying &ldquo;state-of-the-art&rdquo; when the pragmatist wants to hear &ldquo;industry standard&rdquo;.</strong> &ndash; Crossing the Chasm (emphasis mine)</p>
</blockquote>
<p>Not everybody will remember it, but in 2016 there was a proposal called <a href="https://internals.rust-lang.org/t/proposal-the-rust-platform/3745">the Rust Platform</a>. The idea was to bring in some crates and bless them as a kind of &ldquo;extended standard library&rdquo;. People <em>hated</em> it. After all, they said, why not just add dependencies to your <code>Cargo.toml</code>? It&rsquo;s easy enough. And to be honest, they were right &ndash; at least at the time.</p>
<p>I think the Rust Platform is a good example of something that was a poor fit for early adopters, who want the newest thing and don&rsquo;t mind finding the best crates, but which could be a <em>great</em> fit for the Early Majority.<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></p>
<p>Anyway, I&rsquo;m not here to argue for one thing or another in this post, but more for the concept that we have to be open to adapting our learned wisdom to new circumstances. In the past, we were trying to bootstrap Rust into the industry&rsquo;s consciousness &ndash; and we have succeeded.</p>
<p>The task before us now is different: <strong>we need to make Rust the best option not just in terms of &ldquo;what it <em>could be</em>&rdquo; but in terms of &ldquo;what it <em>actually is</em>&rdquo;</strong> &ndash; and sometimes those are in tension.</p>
<h2 id="another-challenge-for-rust-turning-adoption-into-investment">Another challenge for Rust: turning adoption into investment</h2>
<p>Later in the dinner, the talk turned, as it often does, to money. Growing Rust adoption also comes with growing needs placed on the Rust project and its ecosystem. How can we connect the dots? This has been a big item on my mind, and I realize in writing this paragraph how many blog posts I have yet to write on the topic, but let me lay out a few interesting points that came up over this dinner and at other recent points.</p>
<h2 id="investment-can-mean-contribution-particularly-for-open-source-orgs">Investment can mean contribution, particularly for open-source orgs</h2>
<p>First, there are more ways to offer support than $$. For Canonical specifically, as they are an open-source organization through-and-through, what I would most want is to build stronger relationships between our organizations. With the Rust for Linux developers, early on Rust maintainers were prioritizing and fixing bugs on behalf of RfL devs, but more and more, RfL devs are fixing things themselves, with Rust maintainers serving as mentors. This is awesome!</p>
<h2 id="money-often-comes-before-a-company-has-adopted-rust-not-after">Money often comes <em>before</em> a company has adopted Rust, not after</h2>
<p>Second, there&rsquo;s an interesting trend about $$ that I&rsquo;ve seen crop up in a few places. We often think of companies investing in the open-source dependencies that they rely upon. But there&rsquo;s an entirely different source of funding, and one that might be even easier to tap, which is to look at companies that are <strong>considering</strong> Rust but haven&rsquo;t adopted it yet.</p>
<p>For those &ldquo;would be&rdquo; adopters, there are often <em>individuals</em> in the org who are trying to make the case for Rust adoption &ndash; these individuals are early adopters, people with a vision for how things could be, but they are trying to sell to their early majority company. And to do that, they often have a list of &ldquo;table stakes&rdquo; features that need to be supported; what&rsquo;s more, they often have access to some budget to make these things happen.</p>
<p>This came up when I was talking to Alexandru Radovici, the Foundation&rsquo;s Silver Member Director, who said that many safety critical companies have money they&rsquo;d like to spend to close various gaps in Rust, but they don&rsquo;t know how to spend it. Jon&rsquo;s investments in Trifecta Tech and the uutils org have the same character: he is looking to close the gaps that block Ubuntu from using Rust more.</p>
<h2 id="conclusions">Conclusions&hellip;?</h2>
<p>Well, first of all, you should watch Jon&rsquo;s talk. &ldquo;Brilliant&rdquo;, as the Brits have it.</p>
<p>But my other big thought is that this is a crucial time for Rust. We are clearly transitioning in a number of areas from visionaries and early adopters towards that pragmatic majority, and we need to be mindful that doing so may require us to change some of the way that we&rsquo;ve always done things. I liked this paragraph from <a href="https://en.wikipedia.org/wiki/Crossing_the_Chasm">Crossing the Chasm</a>:</p>
<blockquote>
<p>To market successfully to pragmatists, one does not have to be one &ndash; just understand their values and work to serve them. To look more closely into these values, if the goal of visionaries is to take a quantum leap forward, the goal of pragmatists is to make a percentage improvement&ndash;incremental, measurable, predictable progress. [..] To market to pragmatists, you must be patient. You need to be conversant with the issues that dominate their particular business. You need to show up at the industry-specific conferences and trade shows they attend.</p>
</blockquote>
<p>Re-reading <a href="https://en.wikipedia.org/wiki/Crossing_the_Chasm">Crossing the Chasm</a> as part of writing this blog post has really helped me square where Rust is &ndash; for the most part, I think we are still crossing the chasm, but we are well on our way. I think what we see is a consistent trend now where we have Rust <em>champions</em> who fit the &ldquo;visionary&rdquo; profile of early adopters successfully advocating for Rust within companies that fit the pragmatist, early majority profile.</p>
<h3 id="open-source-can-be-a-great-enabler-to-cross-the-chasm">Open source can be a great enabler to cross the chasm&hellip;</h3>
<p>It strikes me that open-source is just an amazing platform for doing this kind of marketing. Unlike a company, we don&rsquo;t have to do everything ourselves. We have to leverage the fact that <em>open source helps those who help themselves</em> &ndash; find those visionary folks in industries that could really benefit from Rust, bring them into the Rust orbit, and then (most important!) <strong>support and empower them</strong> to adapt Rust to their needs.</p>
<h3 id="but-only-if-we-dont-get-too-middle-school-about-it">&hellip;but only if we don&rsquo;t get too &ldquo;middle school&rdquo; about it</h3>
<p>This last part may sound obvious, but it&rsquo;s harder than it sounds. When you&rsquo;re embedded in open source, it seems like a friendly place where everyone is welcome. But the reality is that it can be a place full of cliques and &ldquo;oral traditions&rdquo; that &ldquo;everybody knows&rdquo;<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup>. People coming with an idea can get shut down for using the wrong word. They can readily mistake the, um, &ldquo;impassioned&rdquo; comments from a random contributor (or perhaps just a troll&hellip;) for the official word from project leadership. It only takes one rude response to turn somebody away.</p>
<h3 id="what-rust-needs-most-is-empathy">What Rust needs most is empathy</h3>
<p>So what will ultimately help Rust the most to succeed? <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/27/empathy-in-open-source/">Empathy in Open Source</a>. Let&rsquo;s get out there, find out where Rust can help people, and make it happen. Exciting times!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I am famously bad at accents. My best attempt at posh British sounds more like Apu from the Simpsons. I really wish I could pull off a convincing Greek accent, but sadly no.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Another of my pearls of wisdom is &ldquo;there is nothing more permanent than temporary code&rdquo;. I used to say that back at the startup I worked at after college, but years of experience have only proven it more and more true.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Russel Cohen and Jess Izen gave a <a href="https://www.youtube.com/watch?v=VthhIdqwdHc">great talk at last year&rsquo;s RustConf</a> about what our team is doing to help teams decide if Rust is viable for them. But since then another thing having a big impact is AI, which is bringing previously unthinkable projects, like rewriting older systems, within reach.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I have no idea if there is code in a car&rsquo;s steering column, for the record. I assume so by now? For power steering or some shit?&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Or am I supposed to call it &ldquo;tea&rdquo;? Or maybe &ldquo;supper&rdquo;? I can&rsquo;t get a handle on British mealtimes.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Ernest is such a joy to be around. He&rsquo;s quiet, but he&rsquo;s got a lot of insights if you can convince him to share them. If you get the chance to meet him, take it! If you live in London, go to the London Rust meetup! Find Ernest and introduce yourself. Tell him Niko sent you and that you are supposed to say how great he is and how you want to learn from the wisdom he&rsquo;s accrued over the years. Then watch him blush. What a doll.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>If you can&rsquo;t wait, you can read some <a href="https://rust-lang.zulipchat.com/#narrow/channel/220302-wg-cli/topic/Hello.20everyone/near/570148087">Zulip discussion</a> here.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>The <a href="https://rust-lang.zulipchat.com/#narrow/channel/220302-wg-cli/topic/Hello.20everyone/near/570148087">Battery Packs proposal</a> I want to talk about is similar in some ways to the Rust Platform, but decentralized and generally better in my opinion&ndash; but I get ahead of myself!&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p><a href="https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines">Betteridge&rsquo;s Law of Headlines</a> has it that &ldquo;Any headline that ends in a question mark can be answered by the word <em>no</em>&rdquo;. Well, Niko&rsquo;s law of open-source<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> is that &ldquo;nobody actually knows anything that &rsquo;everybody&rsquo; knows&rdquo;.&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content></entry><entry><title type="html">Sharing in Dada</title><link href="https://smallcultfollowing.com/babysteps/blog/2026/02/14/sharing-in-dada/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2026/02/14/sharing-in-dada/</id><published>2026-02-14T00:00:00+00:00</published><updated>2026-02-14T06:49:35-05:00</updated><content type="html"><![CDATA[<p>OK, let&rsquo;s talk about <em>sharing</em>. This is the first of the Dada blog posts where things start to diverge from Rust in a deep way and I think the first where we start to see some real advantages to the Dada way of doing things (and some of the tradeoffs I made to achieve those advantages).</p>
<h2 id="we-are-shooting-for-a-gc-like-experience-without-gc">We are shooting for a GC-like experience without GC</h2>
<p>Let&rsquo;s start with the goal: earlier, I said that Dada was like &ldquo;Rust where you never have to type <code>as_ref</code>&rdquo;. But what I really meant is that I want a <em>GC-like</em> experience&ndash;without the GC.</p>
<h2 id="we-are-shooting-for-a-composable-experience">We are shooting for a &ldquo;composable&rdquo; experience</h2>
<p>I also often use the word &ldquo;composable&rdquo; to describe the Dada experience I am shooting for. <em>Composable</em> means that you can take different things and put them together to achieve something new.</p>
<p>Obviously Rust has many composable patterns &ndash; the <code>Iterator</code> APIs, for example. But what I have found is that Rust code is often very brittle: there are many choices when it comes to how you declare your data structures and the choices you make will inform how those data structures can be consumed.</p>
<h2 id="running-example-character">Running example: <code>Character</code></h2>
<h3 id="defining-the-character-type">Defining the <code>Character</code> type</h3>
<p>Let&rsquo;s create a type that we can use as a running example throughout the post: <code>Character</code>. In Rust, we might define a <code>Character</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[derive(Default)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Character</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">name</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">class</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">hp</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="creating-and-arcing-the-character">Creating and Arc&rsquo;ing the <code>Character</code></h3>
<p>Now, suppose that, for whatever reason, we are going to build up a character programmatically:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">ch</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Character</span>::<span class="n">default</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">ch</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">push_str</span><span class="p">(</span><span class="s">&#34;Ferris&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">ch</span><span class="p">.</span><span class="n">class</span><span class="p">.</span><span class="n">push_str</span><span class="p">(</span><span class="s">&#34;Rustacean&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">ch</span><span class="p">.</span><span class="n">hp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">44</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>So far, so good. Now suppose I want to share that same <code>Character</code> struct so it can be referenced from a lot of places without deep copying. To do that, I am going to put it in an <code>Arc</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">ch</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Character</span>::<span class="n">default</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">ch</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">push_str</span><span class="p">(</span><span class="s">&#34;Ferris&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ch1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">ch</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ch2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ch1</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>OK, cool! Now I have a <code>Character</code> that is readily sharable. That&rsquo;s great.</p>
<h3 id="rust-is-composable-here-which-is-cool-we-like-that">Rust is composable here, which is cool, we like that</h3>
<p>Side note but this is an example of where Rust <em>is</em> composable: we defined <code>Character</code> once in a fully-owned way and we were able to use it mutably (to build it up imperatively over time) and then able to &ldquo;freeze&rdquo; it and get a read-only, shared copy of <code>Character</code>. This gives us the advantages of an imperative programming language (easy data construction and manipulation) and the advantages of a functional language (immutability prevents bugs when things are referenced from many disjoint places). Nice!</p>
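<p>Here is a self-contained sketch of that build-then-freeze pattern, reusing the <code>Character</code> struct from above. The assertions are my own additions for illustration; comparing the address of the <code>name</code> buffer through both handles is just one way to observe that no deep copy took place:</p>
<pre><code class="language-rust" data-lang="rust">use std::sync::Arc;

#[derive(Default)]
struct Character {
    name: String,
    class: String,
    hp: u32,
}

fn main() {
    // Phase 1: build the value up imperatively while we own it.
    let mut ch = Character::default();
    ch.name.push_str("Ferris");
    ch.class.push_str("Rustacean");
    ch.hp = 44;

    // Phase 2: "freeze" it by moving it into an `Arc`.
    let ch1 = Arc::new(ch);
    let ch2 = ch1.clone(); // bumps a reference count; no deep copy

    // Both handles see the same data...
    assert_eq!(ch2.name, "Ferris");
    assert_eq!(ch2.class, "Rustacean");
    assert_eq!(ch2.hp, 44);
    // ...because they point at the same `Character`, so the name buffer is shared.
    assert_eq!(ch1.name.as_ptr(), ch2.name.as_ptr());

    // Writing through either handle does not compile:
    // ch1.hp = 45;
}
</code></pre>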
<h3 id="creating-and-arcing-the-character-1">Creating and Arc&rsquo;ing the <code>Character</code></h3>
<p><em>Now</em>, suppose that I have some other code, written independently, that <em>just</em> needs to store the character&rsquo;s <em>name</em>. That code winds up copying the name into a lot of different places. So, just like we used <code>Arc</code> to let us cheaply reference a single character from multiple places, it uses <code>Arc</code> so it can cheaply reference the character&rsquo;s <em>name</em> from multiple places:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">CharacterSheetWidget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Use `Arc&lt;String&gt;` and not `String` because
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// we wind up copying this into many different
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// places and we don&#39;t want to deep clone
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// the string each time.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">name</span>: <span class="nc">Arc</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ... assume more fields here ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>OK. Now comes the rub. I want to create a character-sheet widget from our shared character:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">create_character_sheet_widget</span><span class="p">(</span><span class="n">ch</span>: <span class="nc">Arc</span><span class="o">&lt;</span><span class="n">Character</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">CharacterSheetWidget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">CharacterSheetWidget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// FIXME: Huh, how do I bridge this gap?
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// I guess I have to do this.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">name</span>: <span class="nc">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">ch</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">clone</span><span class="p">()),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// ... assume more fields here ...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Shoot, that&rsquo;s frustrating! What I would <em>like</em> to do is to write <code>name: ch.name.clone()</code> or something similar (actually I&rsquo;d probably <em>like</em> to just write <code>ch.name</code>, but anyhow) and get back an <code>Arc&lt;String&gt;</code>. But I can&rsquo;t do that. Instead, I have to deeply clone the string <em>and</em> allocate a <em>new</em> <code>Arc</code>. Of course any subsequent clones will be cheap. But it&rsquo;s not great.</p>
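<p>To make the cost concrete, here is a small self-contained Rust sketch of the workaround. The field layout of <code>Character</code> is assumed from the examples in this post, and <code>shares_name_buffer</code> is a hypothetical helper that exists only to check whether the widget ended up with its own copy of the string data.</p>

```rust
use std::sync::Arc;

// Assumed shape of the `Character` struct used throughout this post.
struct Character {
    name: String,
    klass: String,
    hp: u32,
}

struct CharacterSheetWidget {
    name: Arc<String>,
}

fn create_character_sheet_widget(ch: Arc<Character>) -> CharacterSheetWidget {
    CharacterSheetWidget {
        // Deep-copies the string bytes and allocates a fresh `Arc`.
        name: Arc::new(ch.name.clone()),
    }
}

// Hypothetical helper: true if the widget's name points at the same
// heap buffer as the character's name (i.e., no deep copy happened).
fn shares_name_buffer(ch: &Character, widget: &CharacterSheetWidget) -> bool {
    ch.name.as_ptr() == widget.name.as_ptr()
}
```

<p>For a non-empty name, <code>shares_name_buffer</code> returns <code>false</code>: the bytes were copied into a second allocation, which is exactly the cost we were hoping to avoid.</p>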
<h3 id="rust-often-gives-rise-to-these-kind-of-impedance-mismatches">Rust often gives rise to these kinds of &ldquo;impedance mismatches&rdquo;</h3>
<p>I often find patterns like this arise in Rust: there&rsquo;s a bit of an &ldquo;impedance mismatch&rdquo; between one piece of code and another. The <em>solution</em> varies, but it&rsquo;s generally something like</p>
<ul>
<li><em>clone some data</em> &ndash; it&rsquo;s not so big anyway, screw it (that&rsquo;s what happened here).</li>
<li><em>refactor one piece of code</em> &ndash; e.g., modify the <code>Character</code> struct to store an <code>Arc&lt;String&gt;</code>. Of course, that has ripple effects, e.g., we can no longer write <code>ch.name.push_str(...)</code>, but have to use <code>Arc::get_mut</code> or something.</li>
<li><em>invoke some annoying helper</em> &ndash; e.g., write <code>opt.as_ref()</code> to convert from a <code>&amp;Option&lt;String&gt;</code> to an <code>Option&lt;&amp;String&gt;</code> or write a <code>&amp;**r</code> to convert from a <code>&amp;Arc&lt;String&gt;</code> to a <code>&amp;String</code> (which then coerces to <code>&amp;str</code>).</li>
</ul>
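<p>For reference, here is roughly what those helper incantations look like in practice. This is only an illustrative sketch; the function names <code>name_len</code>, <code>as_str</code>, and <code>try_push</code> are made up for the example.</p>

```rust
use std::sync::Arc;

// `&Option<String>` -> `Option<&String>` via `as_ref`.
fn name_len(opt: &Option<String>) -> usize {
    let borrowed: Option<&String> = opt.as_ref();
    borrowed.map_or(0, |s| s.len())
}

// `&Arc<String>` -> `&str`: one `*` strips the reference, the next strips
// the `Arc`, and the `&` re-borrows the `String`, which coerces to `&str`.
fn as_str(r: &Arc<String>) -> &str {
    &**r
}

// Once a field is an `Arc<String>`, in-place mutation goes through
// `Arc::get_mut`, which only succeeds while the `Arc` is not shared.
fn try_push(name: &mut Arc<String>, suffix: &str) -> bool {
    match Arc::get_mut(name) {
        Some(s) => {
            s.push_str(suffix);
            true
        }
        None => false,
    }
}
```

<p>Note that <code>try_push</code> starts returning <code>false</code> as soon as a second <code>Arc</code> to the same string exists, which is the ripple effect mentioned above.</p>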
<p>The goal with Dada is that we don&rsquo;t have that kind of thing.</p>
<h2 id="sharing-is-how-dada-copies">Sharing is how Dada copies</h2>
<p>So let&rsquo;s walk through how that same <code>Character</code> example would play out in Dada. We&rsquo;ll start by defining the <code>Character</code> class:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">class Character(
    name: String,
    klass: String,  # Oh dang, the perils of a class keyword!
    hp: u32,
)</code></pre>
<p>Just as in Rust, we can create the character and then modify it afterwards:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">class Character(name: String, klass: String, hp: u32)

let ch: given Character = Character(&#34;&#34;, &#34;&#34;, 22)
      # ----- remember, the &#34;given&#34; permission
      #       means that `ch` is fully owned
ch.name!.push(&#34;Tzara&#34;)
ch.klass!.push(&#34;Dadaist&#34;)
   #    - and the `!` signals mutation</code></pre>
<h2 id="the-share-operator-creates-a-shared-object">The <code>.share</code> operator creates a <code>shared</code> object</h2>
<p>Cool. Now, I want to share the character so it can be referenced from many places. In Rust, we created an <code>Arc</code>, but in Dada, sharing is &ldquo;built-in&rdquo;. We use the <code>.share</code> operator, which will convert the <code>given Character</code> (i.e., fully owned character) into a <code>shared Character</code>:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">class Character(name: String, klass: String, hp: u32)

let ch = Character(&#34;&#34;, &#34;&#34;, 22)
ch.name!.push(&#34;Tzara&#34;)
ch.klass!.push(&#34;Dadaist&#34;)

let ch1: shared Character = ch.share
      #  ------                -----
      # The `share` operator consumes `ch`
      # and returns the same object, but now
      # with *shared* permissions.</code></pre>
<h2 id="shared-objects-can-be-copied-freely"><code>shared</code> objects can be copied freely</h2>
<p>Now that we have a <code>shared</code> character, we can copy it around:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">class Character(name: String, klass: String, hp: u32)

# Create a shared character to start
let ch1 = Character(&#34;Tzara&#34;, &#34;Dadaist&#34;, 22).share
    #                                       -----

# Create another copy of the same shared character
let ch2 = ch1</code></pre>
<h2 id="sharing-propagates-from-owner-to-field">Sharing propagates from owner to field</h2>
<p>When you have a shared object and you access its field, what you get back is a <strong>shared (shallow) copy of the field</strong>:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">class Character(...)

# Create a `shared Character`
let ch: shared Character = Character(&#34;Tristan Tzara&#34;, &#34;Dadaist&#34;, 22).share
      # ------                                                       -----

# Extracting the `name` field gives a `shared String`
let name: shared String = ch.name
        # ------</code></pre>
<h2 id="propagation-using-a-vec">Propagation using a <code>Vec</code></h2>
<p>To drill home how cool and convenient this is, imagine that I have a <code>Vec[String]</code> that I share with <code>.share</code>:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">let v: shared Vec[String] = [&#34;Hello&#34;, &#34;Dada&#34;].share</code></pre>
<p>What I get back is a <code>shared Vec[String]</code>. And when I access the elements of that, I get back a <code>shared String</code>:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">let v = [&#34;Hello&#34;, &#34;Dada&#34;].share
let s: shared String = v[0]</code></pre>
<p>This is as if one could take an <code>Arc&lt;Vec&lt;String&gt;&gt;</code> in Rust and get out an <code>Arc&lt;String&gt;</code>.</p>
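<p>For contrast, here is a sketch of two common workarounds in Rust today. Both function names are hypothetical. The first pays for a deep clone and a new allocation on every access; the second requires deciding up front that each element lives in its own <code>Arc</code>.</p>

```rust
use std::sync::Arc;

// Workaround 1: deep-clone the element and wrap the copy in a new `Arc`.
fn element_by_cloning(v: &Arc<Vec<String>>, i: usize) -> Arc<String> {
    Arc::new(v[i].clone())
}

// Workaround 2: change the element type so every string is individually
// ref-counted; fetching an element is then just a ref-count increment.
fn element_by_sharing(v: &Arc<Vec<Arc<String>>>, i: usize) -> Arc<String> {
    Arc::clone(&v[i])
}
```

<p>Neither version lets an ordinary <code>Arc&lt;Vec&lt;String&gt;&gt;</code> hand out a cheap <code>Arc&lt;String&gt;</code>, which is the gap that sharing propagation is meant to close.</p>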
<h2 id="how-sharing-is-implemented">How sharing is implemented</h2>
<p>So how is sharing implemented? The answer lies in a not-entirely-obvious memory layout. To see how it works, let&rsquo;s walk through how a <code>Character</code> would be laid out in memory:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32"># Character type we saw earlier.
class Character(name: String, klass: String, hp: u32)

# String type would be something like this.
class String {
    buffer: Pointer[char]
    initialized: usize
    capacity: usize
}</code></pre>
<p>Here <code>Pointer</code> is a built-in type that is the basis for Dada&rsquo;s unsafe code system.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<h3 id="layout-of-a-given-character-in-memory">Layout of a <code>given Character</code> in memory</h3>
<p>Now imagine we have a <code>Character</code> like this:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">let ch = Character(&#34;Duchamp&#34;, &#34;Dadaist&#34;, 22)</code></pre>
<p>The character <code>ch</code> would be laid out in memory something like this (focusing just on the <code>name</code> field):</p>
<pre tabindex="0"><code>[Stack frame]              [Heap]         
ch: Character {                           
    _flag: 1                              
    name: String {                        
        _flag: 1         { _ref_count: 1  
        buffer: ──────────►&#39;D&#39;            
        initialized: 7     ...            
        capacity: 8        &#39;p&#39; }          
    }                                     
    klass: ...                            
    hp: 22                                
}                                         
</code></pre><p>Let&rsquo;s talk this through. First, every object is laid out flat in memory, just like you would see in Rust. So the fields of <code>ch</code> are stored on the stack, and the <code>name</code> field is laid out flat within that.</p>
<p>Each object that owns other objects begins with a hidden field, <code>_flag</code>. This field indicates whether the object is shared or not (in the future we&rsquo;ll add more values to account for other permissions). If the field is 1, the object is not shared. If it is 2, then it is shared.</p>
<p>Heap-allocated objects (i.e., using <code>Pointer[]</code>) begin with a ref-count before the actual data (actually this is at the offset of -4). In this case we have a <code>Pointer[char]</code> so the actual data that follows are just simple characters.</p>
<h3 id="layout-of-a-shared-character-in-memory">Layout of a <code>shared Character</code> in memory</h3>
<p>If I were to instead create a <em>shared</em> character:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">let ch1 = Character(&#34;Duchamp&#34;, &#34;Dadaist&#34;, 22).share
          #                                   -----</code></pre>
<p>The memory layout would be the same, but the flag field on the character is now 2:</p>
<pre tabindex="0"><code>[Stack frame]              [Heap]         
ch1: Character {                          
    _flag: 2 👈 (This is 2 now!)                             
    name: String {                        
        _flag: 1         { _ref_count: 1  
        buffer: ──────────►&#39;D&#39;            
        initialized: 7     ...            
        capacity: 8        &#39;p&#39; }          
    }                                     
    klass: ...                            
    hp: 22                                
}                                         
</code></pre><h3 id="copying-a-shared-character">Copying a <code>shared Character</code></h3>
<p>Now imagine that we created two copies of the same shared character:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">let ch1 = Character(&#34;Duchamp&#34;, &#34;Dadaist&#34;, 22).share
let ch2 = ch1</code></pre>
<p>What happens is that we will copy all the fields of <code>ch1</code> and then, because <code>_flag</code> is 2, we will increment the ref-counts for the heap-allocated data within:</p>
<pre tabindex="0"><code>[Stack frame]              [Heap]            
ch1: Character {                             
    _flag: 2                                 
    name: String {                           
        _flag: 1         { _ref_count: 2     
        buffer: ────────┬─►&#39;D&#39;        👆     
        initialized: 7  │  ...      (This is 
        capacity: 8     │  &#39;p&#39; }     2 now!) 
    }                   │                    
    klass: ...         │                    
    hp: 22              │                    
}                       │                    
                        │                    
ch2: Character {        │                    
    _flag: 2            │                    
    name: String {      │                    
        _flag: 1        │                    
        buffer: ────────┘                    
        initialized: 7                       
        capacity: 8                          
    }                                        
    klass: ...                              
    hp: 22                                   
}                                            
</code></pre><h3 id="copying-out-the-name-field">Copying out the name field</h3>
<p>Now imagine we were to copy out the <em>name</em> field, instead of the entire character:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give,shared,share" data-dada-types="String,Point,Vec,Map,Character,u32">let ch1 = Character(&#34;Duchamp&#34;, &#34;Dadaist&#34;, 22).share
let name = ch1.name</code></pre>
<p>&hellip;what happens is that:</p>
<ol>
<li>traversing <code>ch1</code>, we observe that the <code>_flag</code> field is 2 and therefore <code>ch1</code> is shared</li>
<li>we copy out the <code>String</code> fields from <code>name</code>. Because the character is shared:
<ul>
<li>we modify the <code>_flag</code> field on the new string to 2</li>
<li>we increment the ref-count for any heap values</li>
</ul>
</li>
</ol>
<p>The result is that you get:</p>
<pre tabindex="0"><code>[Stack frame]              [Heap]       
ch1: Character {                        
    _flag: 2                            
    name: String {                      
        _flag: 1         { _ref_count: 2
        buffer: ────────┬─►&#39;D&#39;          
        initialized: 7  │  ...          
        capacity: 8     │  &#39;p&#39; }        
    }                   │               
    klass: ...         │               
    hp: 22              │               
}                       │               
                        │               
name: String {          │               
    _flag: 2            │               
    buffer: ────────────┘               
    initialized: 7                      
    capacity: 8                         
}                                       
</code></pre><h2 id="sharing-propagation-is-one-example-of-permission-propagation">&ldquo;Sharing propagation&rdquo; is one example of permission propagation</h2>
<p>This post showed how <code>shared</code> values in Dada work and showed how the <code>shared</code> permission <em>propagates</em> when you access a field. <em>Permissions</em> are how Dada manages object lifetimes. We&rsquo;ve seen two so far:</p>
<ul>
<li>the <code>given</code> permission indicates a uniquely owned value (<code>T</code>, in Rust-speak);</li>
<li>the <code>shared</code> permission indicates a copyable value (<code>Arc&lt;T&gt;</code> is the closest Rust equivalent).</li>
</ul>
<p>In future posts we&rsquo;ll see the <code>ref</code> and <code>mut</code> permissions, which roughly correspond to <code>&amp;</code> and <code>&amp;mut</code>, and talk about how the whole thing fits together.</p>
<h2 id="dada-is-more-than-a-pretty-face">Dada is more than a pretty face</h2>
<p>This is the first post where we started to see a bit more of Dada&rsquo;s character. Reading over the previous few posts, you could be forgiven for thinking Dada was just a cute syntax atop familiar Rust semantics. But as you can see from how <code>shared</code> works, Dada is quite a bit more than that.</p>
<p>I like to think of Dada as &ldquo;opinionated Rust&rdquo; in some sense. Unlike Rust, it imposes some standards on how things are done. For example, every object (at least every object with a heap-allocated field) has a <code>_flag</code> field. And every heap allocation has a ref-count.</p>
<p>These conventions come at some modest runtime cost. My rule is that basic operations are allowed to do &ldquo;shallow&rdquo; operations, e.g., toggling the <code>_flag</code> or adjusting the ref-counts on every field. But they cannot do &ldquo;deep&rdquo; operations that require traversing heap structures.</p>
<p>In exchange for adopting conventions and paying that cost, you get &ldquo;composability&rdquo;, by which I mean that permissions in Dada (like <code>shared</code>) flow much more naturally, and types that are semantically equivalent (i.e., you can do the same things with them) generally have the same layout in memory.</p>
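<p>To make the &ldquo;shallow operations only&rdquo; rule a bit more tangible, here is a toy model in Rust of the flag-and-ref-count scheme described in this post. It is a sketch under loud assumptions, not how Dada is implemented: <code>DadaString</code>, <code>GIVEN</code>, <code>SHARED</code>, and <code>copy_name</code> are invented names, <code>Rc&lt;[char]&gt;</code> stands in for the ref-counted <code>Pointer[char]</code> buffer, and the <code>capacity</code> field is left out for brevity.</p>

```rust
use std::rc::Rc;

// Toy stand-ins for the hidden `_flag` values (1 = not shared, 2 = shared).
const GIVEN: u8 = 1;
const SHARED: u8 = 2;

// Model of a Dada string: flat fields plus a ref-counted heap buffer.
struct DadaString {
    flag: u8,
    buffer: Rc<[char]>,
    initialized: usize,
}

// Model of a Dada character, with the string stored inline.
struct Character {
    flag: u8,
    name: DadaString,
    hp: u32,
}

impl DadaString {
    fn new(text: &str) -> DadaString {
        let chars: Vec<char> = text.chars().collect();
        let initialized = chars.len();
        DadaString { flag: GIVEN, buffer: chars.into(), initialized }
    }
}

impl Character {
    fn new(name: &str, hp: u32) -> Character {
        Character { flag: GIVEN, name: DadaString::new(name), hp }
    }

    // `.share`: consume the owned value and flip only the outer flag.
    fn share(mut self) -> Character {
        self.flag = SHARED;
        self
    }

    // Copying `name` out of a shared character: copy the flat fields,
    // mark the copy as shared, and bump the buffer's ref-count.
    // Returns `None` for an unshared owner, since this toy models
    // no other permissions.
    fn copy_name(&self) -> Option<DadaString> {
        if self.flag != SHARED {
            return None;
        }
        Some(DadaString {
            flag: SHARED,
            buffer: Rc::clone(&self.name.buffer),
            initialized: self.name.initialized,
        })
    }
}
```

<p>Nothing here walks the heap buffer: <code>share</code> writes one flag, and <code>copy_name</code> copies a few words and increments one counter, while the <code>_flag</code> stored inside the original character&rsquo;s <code>name</code> stays at 1, just as in the diagrams.</p>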
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Remember that I have not implemented all this, I am drawing on my memory and notes from my notebooks. I reserve the right to change any and everything as I go about implementing.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dada" term="dada" label="Dada"/></entry><entry><title type="html">Dada: moves and mutation</title><link href="https://smallcultfollowing.com/babysteps/blog/2026/02/10/dada-moves-and-mutation/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2026/02/10/dada-moves-and-mutation/</id><published>2026-02-10T00:00:00+00:00</published><updated>2026-02-10T19:29:44-05:00</updated><content type="html"><![CDATA[<p>Let&rsquo;s continue with working through Dada. In my <a href="https://smallcultfollowing.com/babysteps/blog/2026/02/09/hello-dada/">previous post</a>, I introduced some string manipulation. Let&rsquo;s start talking about permissions. This is where Dada will start to resemble Rust a bit more.</p>
<h2 id="class-struggle">Class struggle</h2>
<p><strong>Classes</strong> in Dada are one of the basic ways that we declare new types (there are also enums, we&rsquo;ll get to that later).</p>
<p>The most convenient way to declare a class is to put the fields in parentheses. This implicitly declares a constructor at the same time:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">class Point(x: u32, y: u32) {}</code></pre>
<p>This is in fact sugar for a more Rust-like form:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">class Point {
    x: u32
    y: u32
    fn new(x: u32, y: u32) -&gt; Point {
        Point { x, y }
    }
}</code></pre>
<p>And you can create an instance of a class by calling the constructor:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">let p = Point(22, 44) # sugar for Point.new(22, 44)</code></pre>
<h2 id="mutating-fields">Mutating fields</h2>
<p>I can mutate the fields of <code>p</code> as you would expect:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">p.x &#43;= 1
p.x = p.y</code></pre>
<h2 id="read-by-default">Read by default</h2>
<p>In Dada, the default when you declare a parameter is that you are getting read-only access:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">fn print_point(p: Point) {
    print(&#34;The point is {p.x}, {p.y}&#34;)
}

let p = Point(22, 44)
print_point(p)</code></pre>
<p>If you attempt to mutate the fields of a parameter, that would get you an error:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">fn print_point(p: Point) {
    p.x &#43;= 1 # &lt;-- ERROR!
}</code></pre>
<h2 id="use--to-mutate">Use <code>!</code> to mutate</h2>
<p>If you declare a parameter with <code>!</code>, then it becomes a mutable reference to a class instance from your caller:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">fn translate_point(point!: Point, x: u32, y: u32) {
    point.x &#43;= x
    point.y &#43;= y
}</code></pre>
<p>In Rust, this would be like <code>point: &amp;mut Point</code>. When you call <code>translate_point</code>, you also put a <code>!</code> to indicate that you are <em>passing</em> a mutable reference:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">let p = Point(22, 44)     # Create point
print_point(p)            # Prints 22, 44
translate_point(p!, 2, 2) # Mutate point
print_point(p)            # Prints 24, 46 </code></pre>
<p>As you can see, when <code>translate_point</code> modifies <code>p.x</code>, that changes <code>p</code> in place.</p>
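<p>For readers who think in Rust, here is a rough translation of the same program, assuming a plain <code>Point</code> struct. The names <code>describe_point</code> and <code>demo</code> are made up for this sketch (it returns the text rather than printing it), and the correspondence with Dada&rsquo;s permissions is only approximate.</p>

```rust
struct Point {
    x: u32,
    y: u32,
}

// Read-only access: Dada's default parameter mode is spelled `&Point` here.
fn describe_point(p: &Point) -> String {
    format!("The point is {}, {}", p.x, p.y)
}

// Mutable access: Dada's `point!: Point` is spelled `&mut Point` here.
fn translate_point(point: &mut Point, x: u32, y: u32) {
    point.x += x;
    point.y += y;
}

// Mirrors the Dada snippet: read, mutate in place, read again.
fn demo() -> (String, String) {
    let mut p = Point { x: 22, y: 44 };
    let before = describe_point(&p);
    translate_point(&mut p, 2, 2);
    let after = describe_point(&p);
    (before, after)
}
```

<p>The Rust caller writes <code>&amp;p</code> and <code>&amp;mut p</code> as prefixes where the Dada caller writes <code>p</code> and <code>p!</code>.</p>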
<h2 id="moves-are-explicit">Moves are explicit</h2>
<p>If you&rsquo;re familiar with Rust, that last example may be a bit surprising. In Rust, a call like <code>print_point(p)</code> would <em>move</em> <code>p</code>, giving ownership away, and trying to use <code>p</code> later would give an error. In Dada, the same call leaves <code>p</code> usable, because the default is to give a read-only reference, like <code>&amp;p</code> in Rust (this gives the right <em>intuition</em> but is also misleading; we&rsquo;ll see in a future post that <em>references</em> in Dada are different from Rust in one very important way).</p>
<p>If you have a function that needs ownership of its parameter, you declare that with <code>given</code>:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">fn take_point(p: given Point) {
    # ...
}</code></pre>
<p>And on the caller&rsquo;s side, you call such a function with <code>.give</code>:</p>
<pre><code class="language-dada" data-dada-keywords="let,fn,class,given,give" data-dada-types="String,Point,Vec,Map">let p = Point(22, 44)
take_point(p.give)
take_point(p.give) # &lt;-- Error! Can&#39;t give twice.</code></pre>
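<p>The Rust counterpart, sketched below with a hypothetical <code>Point</code> struct and <code>demo</code> function, has the same rule but no marker at the call site: the move is implied by the function&rsquo;s signature.</p>

```rust
struct Point {
    x: u32,
    y: u32,
}

// Taking `Point` by value is the Rust counterpart of `p: given Point`.
fn take_point(p: Point) -> u32 {
    p.x + p.y
}

fn demo() -> u32 {
    let p = Point { x: 22, y: 44 };
    let sum = take_point(p); // `p` is moved here; nothing at the call site says so
    // take_point(p);        // would not compile: use of moved value `p`
    sum
}
```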
<h2 id="comparing-with-rust">Comparing with Rust</h2>
<p>It&rsquo;s interesting to compare some Rust and Dada code side-by-side:</p>
<table>
  <thead>
      <tr>
          <th>Rust</th>
          <th>Dada</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>vec.len()</code></td>
          <td><code>vec.len()</code></td>
      </tr>
      <tr>
          <td><code>map.get(&amp;key)</code></td>
          <td><code>map.get(key)</code></td>
      </tr>
      <tr>
          <td><code>vec.push(element)</code></td>
          <td><code>vec!.push(element.give)</code></td>
      </tr>
      <tr>
          <td><code>vec.append(&amp;mut other)</code></td>
          <td><code>vec!.append(other!)</code></td>
      </tr>
      <tr>
          <td><code>message.send_to(&amp;channel)</code></td>
          <td><code>message.give.send_to(channel)</code></td>
      </tr>
  </tbody>
</table>
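<p>Here is a runnable Rust sketch of the first four rows of the table, with comments noting what each call does implicitly. The variable names and sample values are arbitrary, and <code>demo</code> is just a wrapper for the example.</p>

```rust
use std::collections::HashMap;

fn demo() -> (usize, Option<u32>, Vec<String>, Vec<String>) {
    let mut vec = vec!["a".to_string()];
    let mut other = vec!["b".to_string(), "c".to_string()];
    let element = "d".to_string();
    let key = "k".to_string();
    let mut map = HashMap::new();
    map.insert(key.clone(), 1u32);

    let len = vec.len();                // implicit shared borrow of `vec`
    let found = map.get(&key).copied(); // explicit `&` on the argument only
    vec.push(element);                  // implicit `&mut vec`, implicit move of `element`
    vec.append(&mut other);             // implicit `&mut vec`, explicit `&mut other`

    (len, found, vec, other)
}
```

<p>In the Dada column, each of those implicit steps shows up as a <code>!</code> or <code>.give</code> at the call site.</p>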
<h2 id="design-rationale-and-objectives">Design rationale and objectives</h2>
<h3 id="convenient-is-the-default">Convenient is the default</h3>
<p>The most convenient things are the shortest and most common. So we make reads the default.</p>
<h3 id="everything-is-explicit-but-unobtrusive">Everything is explicit but unobtrusive</h3>
<p>The <code>.</code> operator in Rust can do a wide variety of things depending on the method being called. It might mutate, move, create a temporary, etc. In Dada, these things are all visible at the call site &ndash; but they are unobtrusive.</p>
<p>This actually dates from Dada&rsquo;s &ldquo;gradual programming&rdquo; days &ndash; after all, if you don&rsquo;t have type annotations on the method, then you can&rsquo;t decide whether <code>foo.bar()</code> should take a shared or mutable borrow of <code>foo</code>. So we needed a notation where everything is visible at the call site and explicit.</p>
<h3 id="postfix-operators-play-more-nicely-with-others">Postfix operators play more nicely with others</h3>
<p>Dada tries hard to avoid prefix operators like <code>&amp;mut</code>, since they don&rsquo;t compose well with <code>.</code> notation.</p>]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dada" term="dada" label="Dada"/></entry><entry><title type="html">Hello, Dada!</title><link href="https://smallcultfollowing.com/babysteps/blog/2026/02/09/hello-dada/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2026/02/09/hello-dada/</id><published>2026-02-09T00:00:00+00:00</published><updated>2026-02-09T06:03:21-05:00</updated><content type="html"><![CDATA[<img src="https://smallcultfollowing.com/babysteps/
/assets/2026-fun-with-dada/dada-logo.svg" width="20%" style="float: right; margin-right: 1em; margin-bottom: 0.5em;" />
<p>Following on my <a href="https://smallcultfollowing.com/babysteps/blog/2026/02/08/fun-with-dada/">Fun with Dada</a> post, this post is going to start teaching Dada. I&rsquo;m going to keep each post short &ndash; basically just what I can write while having my morning coffee.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<h2 id="you-have-the-right-to-write-code">You have the right to write code</h2>
<p>Here is a very first Dada program:</p>
<pre><code class="language-dada" data-dada-keywords="let" data-dada-types="String">println(&#34;Hello, Dada!&#34;)</code></pre>
<p>I think all of you will be able to guess what it does. Still, there is something worth noting even in this simple program:</p>
<p><strong>&ldquo;You have the right to write code. If you don&rsquo;t write a <code>main</code> function explicitly, one will be provided for you.&rdquo;</strong> Early on I made the change to let users omit the <code>main</code> function and I was surprised by what a difference it made in how <em>light</em> the language felt. Easy change, easy win.</p>
<h2 id="convenient-is-the-default">Convenient is the default</h2>
<p>Here is another Dada program:</p>
<pre><code class="language-dada" data-dada-keywords="let" data-dada-types="String">let name = &#34;Dada&#34;
println(&#34;Hello, {name}!&#34;)</code></pre>
<p>Unsurprisingly, this program does the same thing as the last one.</p>
<p><strong>&ldquo;Convenient is the default.&rdquo;</strong> Strings support interpolation (i.e., <code>{name}</code>) by default. In fact, that&rsquo;s not all they support, you can also break them across lines very conveniently. This program does the same thing as the others we&rsquo;ve seen:</p>
<pre><code class="language-dada" data-dada-keywords="let" data-dada-types="String">let name = &#34;Dada&#34;
println(&#34;
    Hello, {name}!
&#34;)</code></pre>
<p>When you have a <code>&quot;</code> immediately followed by a newline, the leading and trailing newline are stripped, along with the &ldquo;whitespace prefix&rdquo; from the subsequent lines. Internal newlines are kept, so something like this:</p>
<pre><code class="language-dada" data-dada-keywords="let" data-dada-types="String">let name = &#34;Dada&#34;
println(&#34;
    Hello, {name}!
    
    How are you doing?
&#34;)</code></pre>
<p>would print</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">Hello, Dada!
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">How are you doing?
</span></span></code></pre></div><h2 id="just-one-familiar-string">Just one familiar <code>String</code></h2>
<p>Of course you could also annotate the type of the <code>name</code> variable explicitly:</p>
<pre><code class="language-dada" data-dada-keywords="let" data-dada-types="String">let name: String = &#34;Dada&#34;
println(&#34;Hello, {name}!&#34;)</code></pre>
<p>You will find that it is <code>String</code>. This in and of itself is not notable, unless you are accustomed to Rust, where the type would be <code>&amp;'static str</code>. This is of course a perennial stumbling block for new Rust users, but more than that, I find it to be a big <em>annoyance</em> &ndash; I hate that I have to write <code>&quot;Foo&quot;.to_string()</code> or <code>format!(&quot;Foo&quot;)</code> everywhere that I mix constant strings with strings that are constructed.</p>
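<p>A tiny Rust example of that annoyance, using a made-up <code>greeting</code> function: one arm builds a string and the other returns a constant, so the constant has to be converted by hand.</p>

```rust
fn greeting(name: Option<&str>) -> String {
    match name {
        // `format!` builds a new owned `String`.
        Some(n) => format!("Hello, {n}!"),
        // The literal is a `&'static str`, so it needs `.to_string()`
        // for both arms to have the same type.
        None => "Hello, stranger!".to_string(),
    }
}
```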
<p>Similar to most modern languages, strings in Dada are immutable. So you can create them and copy them around:</p>
<pre><code class="language-dada" data-dada-keywords="let" data-dada-types="String">let name: String = &#34;Dada&#34;
let greeting: String = &#34;Hello, {name}&#34;
let name2: String = name</code></pre>
<h2 id="next-up-mutation-permissions">Next up: mutation, permissions</h2>
<p>OK, we really just scratched the surface here! This is just the &ldquo;friendly veneer&rdquo; of Dada, which looks and feels like a million other languages. Next time I&rsquo;ll start getting into the permission system and mutation, where things get a bit more interesting.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>My habit is to wake around 5am and spend the first hour of the day doing &ldquo;fun side projects&rdquo;. But for the last N months I&rsquo;ve actually been doing Rust stuff, like <a href="https://symposium.dev/">symposium.dev</a> and <a href="https://rust-lang.github.io/rust-project-goals/2026/">preparing the 2026 Rust Project Goals</a>. Both of these are super engaging, but all Rust and no play makes Niko a dull boy. Also a grouchy boy.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dada" term="dada" label="Dada"/></entry><entry><title type="html">Fun With Dada</title><link href="https://smallcultfollowing.com/babysteps/blog/2026/02/08/fun-with-dada/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2026/02/08/fun-with-dada/</id><published>2026-02-08T00:00:00+00:00</published><updated>2026-02-08T21:20:47-05:00</updated><content type="html"><![CDATA[<img src="https://smallcultfollowing.com/babysteps/
/assets/2026-fun-with-dada/dada-logo.svg" width="20%" style="float: right; margin-right: 1em; margin-bottom: 0.5em;" />
<p>Waaaaaay back in 2021, I started experimenting with a new programming language I call <a href="https://dada-lang.org">&ldquo;Dada&rdquo;</a>. I&rsquo;ve been tinkering with it ever since and I just realized that (oh my gosh!) I&rsquo;ve never written even a single blog post about it! I figured I should fix that. This post will introduce some of the basic concepts of Dada as it is now.</p>
<p>Before you get any ideas, Dada isn&rsquo;t fit for use. In fact the compiler doesn&rsquo;t even really work because I keep changing the language before I get it all the way working. Honestly, Dada is more of a &ldquo;stress relief&rdquo; valve for me than anything else<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> &ndash; it&rsquo;s fun to tinker with a programming language where I don&rsquo;t have to worry about backwards compatibility, or RFCs, or anything else.</p>
<p>That said, Dada has been a very fertile source of ideas that I think could be applicable to Rust. And not just for language design: playing with the compiler is also what led to the <a href="https://smallcultfollowing.com/babysteps/blog/2022/08/18/come-contribute-to-salsa-2022/">new <code>salsa</code> design</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, which is now used by both rust-analyzer and <a href="https://github.com/astral-sh/ty">Astral&rsquo;s ty</a>. So I really want to get those ideas out there!</p>
<h2 id="i-took-a-break-but-im-back-baby">I took a break, but I&rsquo;m back baby!</h2>
<p>I stopped hacking on Dada about a year ago<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, but over the last few days I&rsquo;ve started working on it again. And I realized, hey, this is a perfect time to start blogging! After all, I have to rediscover what I was doing anyway, and writing about things is always the best way to work out the details.</p>
<h2 id="dada-started-as-a-gradual-programming-experiment-but-no-longer">Dada started as a gradual programming experiment, but no longer</h2>
<p>Dada has gone through many phases. Early on, the goal was to build a <em>gradually typed</em> programming language that I thought would be easier for people to learn.</p>
<p>The idea was that you could start writing without any types at all and just execute the program. There was an interactive playground that would let you step through and visualize the &ldquo;borrow checker&rdquo; state (what Dada calls permissions) as you go. My hope was that people would find that easier to learn than working with a type checker.</p>
<p>I got this working and it was actually pretty cool. <a href="https://www.youtube.com/watch?v=tdg03gEbyS8">I gave a talk about it at the Programming Language Mentoring Workshop in 2022</a>, though skimming that video it doesn&rsquo;t seem like I really demo&rsquo;d the permission modeling. Too bad.</p>
<p>At the same time, I found myself unconvinced that the gradually typed approach made sense. What I wanted was that when you executed the program without type annotations, you would get errors at the point where you violated a borrow. And that meant that the program had to track a lot of extra data, kind of like miri does, and it was really only practical as a teaching tool. I still would like to explore that, but it also felt like it was adding a lot of complexity to the language design for something that would only be of interest very early in a developer&rsquo;s journey<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>.</p>
<p>Therefore, I decided to start over, this time focusing just on the static type checking part of Dada.</p>
<h2 id="dada-is-like-a-streamlined-rust">Dada is like a streamlined Rust</h2>
<p>Dada today is like Rust but <em>streamlined</em>. The goal is that Dada has the same basic &ldquo;ownership-oriented&rdquo; <em>feel</em> of Rust, but with a lot fewer choices and nitty-gritty details you have to deal with.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
<p>Rust often has types that are semantically equivalent, but different in representation. Consider <code>&amp;Option&lt;String&gt;</code> vs <code>Option&lt;&amp;String&gt;</code>: both of them are equivalent in terms of what you can do with them, but of course Rust makes you carefully distinguish between them. In Dada, they are the same type. Dada also makes <code>&amp;Vec&lt;String&gt;</code>, <code>&amp;Vec&lt;&amp;String&gt;</code>, <code>&amp;[String]</code>, <code>&amp;[&amp;str]</code>, and many other variations all the same type too. And before you ask, it does it without heap allocating everything or using a garbage collector.</p>
<p>To put it pithily, Dada aims to be <strong>&ldquo;Rust where you never have to call <code>as_ref()</code>&rdquo;.</strong></p>
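<p>To make the Rust side of that comparison concrete, here is a small sketch in today&rsquo;s Rust. The helper names <code>len_outer</code> and <code>len_inner</code> are hypothetical, chosen only to show that the two types are distinct and that <code>as_ref()</code> is the bridge between them:</p>

```rust
/// Hypothetical helper taking a reference to an optional string.
fn len_outer(name: &Option<String>) -> usize {
    // `as_ref` converts `&Option<String>` into `Option<&String>`.
    len_inner(name.as_ref())
}

/// Hypothetical helper taking an optional reference to a string.
fn len_inner(name: Option<&String>) -> usize {
    name.map_or(0, |s| s.len())
}

fn main() {
    let name: Option<String> = Some(String::from("Dada"));
    assert_eq!(len_outer(&name), 4);
    assert_eq!(len_inner(name.as_ref()), 4);
    assert_eq!(len_outer(&None), 0);
}
```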
<h2 id="dada-has-a-fancier-borrow-checker">Dada has a fancier borrow checker</h2>
<p>Dada also has a fancier borrow checker, one which already demonstrates much of <a href="https://smallcultfollowing.com/babysteps/blog/2024/06/02/the-borrow-checker-within/">the borrow checker within</a>, although it doesn&rsquo;t have view types. Dada&rsquo;s borrow checker supports <a href="https://smallcultfollowing.com/babysteps/blog/2024/06/02/the-borrow-checker-within/#step-4-internal-references">internal borrows</a> (e.g., you can make a struct that has fields that borrow from other fields) and it supports <a href="https://smallcultfollowing.com/babysteps/blog/2024/03/04/borrow-checking-without-lifetimes/">borrow checking without lifetimes</a>. Much of this stuff can be brought to Rust, although I did tweak a few things in Dada that made some aspects easier.</p>
<h2 id="dada-targets-webassembly-natively">Dada targets WebAssembly natively</h2>
<p>Somewhere along the line in refocusing Dada, I decided to focus exclusively on building WebAssembly components. Initially I felt like targeting WebAssembly would be really convenient:</p>
<ul>
<li>WebAssembly is like a really simple and clean assembly language, so writing the compiler backend is easy.</li>
<li>WebAssembly components are explicitly designed to bridge between languages, so they solve the FFI problem for you.</li>
<li>With WASI, you even get a full-featured standard library that includes high-level things like &ldquo;fetch a web page&rdquo;. So you can build useful things right off the bat.</li>
</ul>
<h2 id="webassembly-and-on-demand-compilation--compile-time-reflection-almost-for-free">WebAssembly and on-demand compilation = compile-time reflection almost for free</h2>
<p>But I came to realize that targeting WebAssembly has another advantage: <strong>it makes compile-time reflection almost trivial</strong>. The Dada compiler is structured in a purely on-demand fashion. This means we can compile one function all the way to WebAssembly bytecode and leave the rest of the crate untouched.</p>
<p>And once we have the WebAssembly bytecode, we can run that from inside the compiler! With wasmtime, we have a high quality JIT that runs very fast. The code is even sandboxed!</p>
<p>So we can have a function that we compile and run during compilation and use to produce other code that will be used by other parts of the compilation step. In other words, we get something like miri or Zig&rsquo;s comptime for free, essentially. Woah.</p>
<h2 id="wish-you-could-try-it-me-too">Wish you could try it? Me too!</h2>
<p>Man, writing this blog post made ME excited to play with Dada. Too bad it doesn&rsquo;t actually work. Ha! But I plan to keep plugging away on the compiler and get it to the point of a live demo as soon as I can. Hard to say exactly how long that will take.</p>
<p>In the meantime, to help me rediscover how things work, I&rsquo;m going to try to write up a series of blog posts about the type system, borrow checker, and the compiler architecture, all of which I think are pretty interesting.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Yes, I relax by designing new programming languages. Doesn&rsquo;t everyone?&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Designing a new version of <a href="https://salsa-rs.github.io/salsa/"><code>salsa</code></a> so that I could write the Dada compiler in the way I wanted really was an epic yak shave, now that I think about it.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>I lost motivation as I got <a href="https://smallcultfollowing.com/babysteps/blog/2025/02/10/love-the-llm/">interested in LLMs</a>. To be frank, I felt like I had to learn enough about them to understand if designing a programming language was &ldquo;fighting the last war&rdquo;. Having messed a bunch with LLMs, I definitely feel that they <a href="https://smallcultfollowing.com/babysteps/blog/2025/07/31/rs-py-ts-trifecta/">make the choice of programming language less relevant</a>. But I also think they really benefit from higher-level abstractions, even more than humans do, and so I like to think that Dada could still be useful. Besides, it&rsquo;s fun.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>And, with LLMs, that period of learning is shorter than ever.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Of course this also makes Dada less flexible. I doubt a project like Rust for Linux would work with Dada.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dada" term="dada" label="Dada"/></entry><entry><title type="html">Move Expressions</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/11/21/move-expressions/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/11/21/move-expressions/</id><published>2025-11-21T00:00:00+00:00</published><updated>2025-11-21T05:45:10-05:00</updated><content type="html"><![CDATA[<p>This post explores another proposal in the space of ergonomic ref-counting that I am calling <strong>move expressions</strong>. To my mind, these are an alternative to <a href="https://smallcultfollowing.com/babysteps/
/blog/2025/10/22/explicit-capture-clauses.html">explicit capture clauses</a>, one that addresses many (but not <em>all</em>) of the goals from that design with improved ergonomics and readability.</p>
<h2 id="tldr">TL;DR</h2>
<p>The idea itself is simple: within a closure (or future), we add the option to write <code>move($expr)</code>. This is a value expression (&ldquo;rvalue&rdquo;) that desugars into a temporary value that is moved into the closure. So</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="o">||</span><span class="w"> </span><span class="n">something</span><span class="p">(</span><span class="o">&amp;</span><span class="k">move</span><span class="p">(</span><span class="cp">$expr</span><span class="p">))</span><span class="w">
</span></span></span></code></pre></div><p>is roughly equivalent to something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="p">{</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">tmp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="cp">$expr</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">||</span><span class="w"> </span><span class="n">something</span><span class="p">(</span><span class="o">&amp;</span><span class="p">{</span><span class="n">tmp</span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="how-it-would-look-in-practice">How it would look in practice</h2>
<p>Let&rsquo;s go back to one of our running examples, the &ldquo;Cloudflare example&rdquo;, which originated in <a href="https://dioxus.notion.site/Dioxus-Labs-High-level-Rust-5fe1f1c9c8334815ad488410d948f05e">this excellent blog post by the Dioxus folks</a>. As a reminder, this is how the code looks <em>today</em> &ndash; note the <code>let _some_value = ...</code> lines for dealing with captures:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// task:  listen for dns connections
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_some_a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_some_b</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">some_b</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_some_c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">some_c</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something_else_with</span><span class="p">(</span><span class="n">_some_a</span><span class="p">,</span><span class="w"> </span><span class="n">_some_b</span><span class="p">,</span><span class="w"> </span><span class="n">_some_c</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>Under this proposal it would look something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something_else_with</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">move</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">.</span><span class="n">clone</span><span class="p">()),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">move</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">some_b</span><span class="p">.</span><span class="n">clone</span><span class="p">()),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">move</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">some_c</span><span class="p">.</span><span class="n">clone</span><span class="p">()),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>There are times when you would want multiple clones. For example, if you want to move something into a <code>FnMut</code> closure that will then give away a copy on each call, it might look like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">data_source_iter</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">inspect</span><span class="p">(</span><span class="o">|</span><span class="n">item</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">inspect_item</span><span class="p">(</span><span class="n">item</span><span class="p">,</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">tx</span><span class="p">.</span><span class="n">clone</span><span class="p">()).</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                      ----------  -------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                           |         |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                   move a clone      |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                   into the closure  |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                                     |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                             clone the clone
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                             on each iteration
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// some code that uses `tx` later...
</span></span></span></code></pre></div><h2 id="credit-for-this-idea">Credit for this idea</h2>
<p>This idea is not mine. It&rsquo;s been floated a number of times. The first time I remember hearing it was at the RustConf Unconf, but I feel like it&rsquo;s come up before that. Most recently it was <a href="https://rust-lang.zulipchat.com/#narrow/channel/410673-t-lang.2Fmeetings/topic/Design.20meeting.202025-08-27.3A.20Ergonomic.20RC/near/555236763">proposed by Zachary Harrold on Zulip</a>, who has also created a prototype called <a href="https://crates.io/crates/soupa">soupa</a>. Zachary&rsquo;s proposal, like earlier proposals I&rsquo;ve heard, used the <code>super</code> keyword. Later on <a href="https://rust-lang.zulipchat.com/#narrow/channel/410673-t-lang.2Fmeetings/topic/Design.20meeting.202025-08-27.3A.20Ergonomic.20RC/near/555643180">@simulacrum proposed using <code>move</code></a>, which to me is a major improvement, and that&rsquo;s the version I ran with here.</p>
<h2 id="this-proposal-makes-closures-more-continuous">This proposal makes closures more &ldquo;continuous&rdquo;</h2>
<p>The reason that I love the <code>move</code> variant of this proposal is that it makes closures more &ldquo;continuous&rdquo; and exposes their underlying model a bit more clearly. With this design, I would start by explaining closures with move expressions and just teach <code>move</code> closures at the end, as a convenient default:</p>
<blockquote>
<p>A Rust closure captures the places you use in the &ldquo;minimal way that it can&rdquo; &ndash; so <code>|| vec.len()</code> will capture a shared reference to the <code>vec</code>, <code>|| vec.push(22)</code> will capture a mutable reference, and <code>|| drop(vec)</code> will take ownership of the vector.</p>
<p>You can use <code>move</code> expressions to control exactly what is captured: so <code>|| move(vec).push(22)</code> will move the vector into the closure. A common pattern when you want to be fully explicit is to list all captures at the top of the closure, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">vec</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">input</span><span class="p">.</span><span class="n">vec</span><span class="p">);</span><span class="w"> </span><span class="c1">// take full ownership of vec
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="o">&amp;</span><span class="n">cx</span><span class="p">.</span><span class="n">data</span><span class="p">);</span><span class="w"> </span><span class="c1">// take a reference to data
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">output_tx</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">output_tx</span><span class="p">);</span><span class="w"> </span><span class="c1">// take ownership of the output channel
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">process</span><span class="p">(</span><span class="o">&amp;</span><span class="n">vec</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">output_tx</span><span class="p">,</span><span class="w"> </span><span class="n">data</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As a shorthand, you can write <code>move ||</code> at the top of the closure, which will change the default so that closures take ownership of every captured variable. You can still mix-and-match with <code>move</code> expressions to get more control. So the previous closure might be written more concisely like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">process</span><span class="p">(</span><span class="o">&amp;</span><span class="n">input</span><span class="p">.</span><span class="n">vec</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">output_tx</span><span class="p">,</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="o">&amp;</span><span class="n">cx</span><span class="p">.</span><span class="n">data</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       ---------       ---------       --------      
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           |               |               |         
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           |               |       closure still  
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           |               |       captures a ref
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           |               |       `&amp;cx.data`        
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           |               |                         
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       because of the `move` keyword on the closure,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       these two are captured &#34;by move&#34;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div></blockquote>
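<p>The three default capture modes named in that explanation can already be observed in today&rsquo;s Rust, without <code>move</code> expressions. The following is a minimal sketch; the function name <code>capture_modes</code> and the <code>snapshot</code> variable are invented for illustration:</p>

```rust
/// Runs three closures, one per capture mode, and returns the vector's
/// contents as observed just before it is moved into the last closure.
fn capture_modes() -> Vec<i32> {
    let mut vec = vec![1, 2, 3];

    // Shared borrow: the closure only reads `vec`.
    let len = || vec.len();
    assert_eq!(len(), 3);

    // Mutable borrow: `push` needs `&mut vec`.
    let mut push = || vec.push(22);
    push();

    let snapshot = vec.clone();

    // Ownership: `drop` takes its argument by value, so `vec` is moved in.
    let consume = || drop(vec);
    consume();
    // Using `vec` here would be a "use of moved value" error.

    snapshot
}

fn main() {
    assert_eq!(capture_modes(), [1, 2, 3, 22]);
}
```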
<h2 id="this-proposal-makes-move-fit-in-for-me">This proposal makes <code>move</code> &ldquo;fit in&rdquo; for me</h2>
<p>It&rsquo;s a bit ironic that I like this, because it&rsquo;s doubling down on part of Rust&rsquo;s design that I was recently complaining about. In my earlier post on <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/22/explicit-capture-clauses.html">Explicit Capture Clauses</a> I wrote that:</p>
<blockquote>
<p>To be honest, I don&rsquo;t like the choice of <code>move</code> because it&rsquo;s so <em>operational</em>. I think if I could go back, I would try to refashion our closures around two concepts</p>
<ul>
<li><em>Attached</em> closures (what we now call <code>||</code>) would <em>always</em> be tied to the enclosing stack frame. They&rsquo;d always have a lifetime even if they don&rsquo;t capture anything.</li>
<li><em>Detached</em> closures (what we now call <code>move ||</code>) would capture by-value, like <code>move</code> today.</li>
</ul>
<p>I think this would help to build up the intuition of &ldquo;use <code>detach ||</code> if you are going to return the closure from the current stack frame and use <code>||</code> otherwise&rdquo;.</p>
</blockquote>
<p><code>move</code> expressions are, I think, moving in the opposite direction. Rather than talking about attached and detached, they bring us to a more unified notion of closures, one where you don&rsquo;t have &ldquo;ref closures&rdquo; and &ldquo;move closures&rdquo; &ndash; you just have closures that sometimes capture moves, and a &ldquo;move&rdquo; closure is just a shorthand for using <code>move</code> expressions everywhere. This is in fact how closures work in the compiler under the hood, and I think it&rsquo;s quite elegant.</p>
<h2 id="why-not-suffix">Why not suffix?</h2>
<p>One question is whether a <code>move</code> expression should be a <em>prefix</em> or a <em>postfix</em> operator. So e.g.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="o">||</span><span class="w"> </span><span class="n">something</span><span class="p">(</span><span class="o">&amp;</span><span class="cp">$expr</span><span class="p">.</span><span class="k">move</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>instead of <code>&amp;move($expr)</code>.</p>
<p>My feeling is that it&rsquo;s not a good fit for a postfix operator because it doesn&rsquo;t just take the final value of the expression and do something with it, it actually impacts when the entire expression is evaluated. Consider this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="o">||</span><span class="w"> </span><span class="n">process</span><span class="p">(</span><span class="n">foo</span><span class="p">(</span><span class="n">bar</span><span class="p">()).</span><span class="k">move</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>When does <code>bar()</code> get called? If you think about it, it has to be closure creation time, but it&rsquo;s not very &ldquo;obvious&rdquo;.</p>
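<p>To see why it has to be closure creation time, here is a sketch in today&rsquo;s Rust that writes out the <code>let tmp = ...</code> desugaring from the TL;DR by hand. The bodies of <code>bar</code>, <code>foo</code>, and <code>process</code>, and the <code>log</code> parameter, are hypothetical stand-ins added only to record the order of events:</p>

```rust
/// Hypothetical stand-in for `bar`: records that it was called.
fn bar(log: &mut Vec<&'static str>) -> i32 {
    log.push("bar called");
    1
}

/// Hypothetical stand-in for `foo`.
fn foo(x: i32) -> i32 {
    x + 1
}

/// Hypothetical stand-in for `process`.
fn process(x: i32) -> i32 {
    x * 10
}

/// Builds the closure by hand and returns the order in which things happened.
fn evaluation_order() -> Vec<&'static str> {
    let mut log = Vec::new();

    // Hand-written equivalent of `|| process(move(foo(bar())))`.
    let closure = {
        let tmp = foo(bar(&mut log)); // evaluated here, when the closure is created
        move || process(tmp)
    };
    log.push("closure created");

    // Calling the closure does not run `bar` again; it only uses `tmp`.
    assert_eq!(closure(), 20);
    log.push("closure called");

    log
}

fn main() {
    assert_eq!(
        evaluation_order(),
        ["bar called", "closure created", "closure called"]
    );
}
```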
<p>We reached a similar conclusion when we were considering <code>.unsafe</code> operators. I think there is a rule of thumb that things which delineate a &ldquo;scope&rdquo; of code ought to be prefix &ndash; though I suspect <code>unsafe(expr)</code> might actually be nice, and not just <code>unsafe { expr }</code>.</p>
<p><em>Edit:</em> I added this section after-the-fact in response to questions.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I&rsquo;m going to wrap up this post here. To be honest, what this design really has going for it, above anything else, is its <em>simplicity</em> and the way it <em>generalizes Rust&rsquo;s existing design</em>. I love that. To me, it joins the set of &ldquo;yep, we should clearly do that&rdquo; pieces in this puzzle:</p>
<ul>
<li>Add a <code>Share</code> trait (I&rsquo;ve gone back to preferring the name <code>share</code> &#x1f601;)</li>
<li>Add <code>move</code> expressions</li>
</ul>
<p>These both seem like solid steps forward. I am not yet persuaded that they get us all the way to the goal that I articulated in <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/13/ergonomic-explicit-handles/">an earlier post</a>:</p>
<blockquote>
<p>&ldquo;low-level enough for a Kernel, usable enough for a GUI&rdquo;</p>
</blockquote>
<p>but they are moving in the right direction.</p>]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">Just call clone (or alias)</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/11/10/just-call-clone/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/11/10/just-call-clone/</id><published>2025-11-10T00:00:00+00:00</published><updated>2025-11-10T13:55:41-05:00</updated><content type="html"><![CDATA[<img src="https://smallcultfollowing.com/babysteps/
/assets/2025-justcallclone/keep-calm-and-call-clone-rendered.svg" width="20%" style="float: right; margin-right: 1em; margin-bottom: 0.5em;" />
<p>Continuing my series on ergonomic ref-counting, I want to explore another idea, one that I&rsquo;m calling &ldquo;just call clone (or alias)&rdquo;. This proposal specializes the <code>clone</code> and <code>alias</code> methods so that, in a new edition, the compiler will (1) remove redundant or unnecessary calls (with a lint); and (2) automatically capture clones or aliases in <code>move</code> closures where needed.</p>
<p>The goal of this proposal is to simplify the user&rsquo;s mental model: whenever you see an error like &ldquo;use of moved value&rdquo;, the fix is always the same: just call <code>clone</code> (or <code>alias</code>, if applicable). This model aims for the balance of <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/13/ergonomic-explicit-handles/">&ldquo;low-level enough for a Kernel, usable enough for a GUI&rdquo;</a> that I described earlier. It&rsquo;s also making a statement, which is that the key property we want to preserve is that <em>you can always find where new aliases might be created</em> &ndash; but that it&rsquo;s ok if the fine-grained details around <em>exactly when</em> the alias is created are a bit subtle.</p>
<!-- more -->
<h2 id="the-proposal-in-a-nutshell">The proposal in a nutshell</h2>
<h3 id="part-1-closure-desugaring-that-is-aware-of-clones-and-aliases">Part 1: Closure desugaring that is aware of clones and aliases</h3>
<p>Consider this <code>move</code> future:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_services</span><span class="p">(</span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">Context</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                   ---- move future
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">manage_io</span><span class="p">(</span><span class="n">cx</span><span class="p">.</span><span class="n">io_system</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">request_name</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//        --------------------  -----------------------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Because this is a <code>move</code> future, it takes ownership of <code>cx.io_system</code> and <code>cx.request_name</code>. Because <code>cx</code> is a borrowed reference, this will be an error unless those values are <code>Copy</code> (which they presumably are not). Under this proposal, using an <em>alias</em> or <em>clone</em> of a place inside a <code>move</code> closure/future would result in capturing an <em>alias</em> or <em>clone</em> of that place, rather than the place itself. So this future would be desugared like so (using <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/22/explicit-capture-clauses/">explicit capture clause strawman notation</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_services</span><span class="p">(</span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">Context</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">cx</span><span class="p">.</span><span class="n">io_system</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">request_name</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//     --------------------  -----------------------
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//     capture alias/clone respectively
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">manage_io</span><span class="p">(</span><span class="n">cx</span><span class="p">.</span><span class="n">io_system</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">request_name</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="part-2-last-use-transformation">Part 2: Last-use transformation</h3>
<p>Now, this result is inefficient &ndash; there are now <em>two</em> aliases/clones. So the next part of the proposal is that the compiler would, in newer Rust editions, apply a new transformation called the <strong>last-use transformation</strong>. This transformation would identify calls to <code>alias</code> or <code>clone</code> that are not needed to satisfy the borrow checker and remove them. This code would therefore become:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_services</span><span class="p">(</span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">Context</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">cx</span><span class="p">.</span><span class="n">io_system</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">request_name</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">manage_io</span><span class="p">(</span><span class="n">cx</span><span class="p">.</span><span class="n">io_system</span><span class="p">,</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">request_name</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//        ------------  ---------------
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//        converted to moves
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The last-use transformation would apply beyond closures. Given an example like this one, which clones <code>id</code> even though <code>id</code> is never used later:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">send_process_identifier_request</span><span class="p">(</span><span class="n">id</span>: <span class="nb">String</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">request</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Request</span>::<span class="n">ProcessIdentifier</span><span class="p">(</span><span class="n">id</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                                       ----------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                                       unnecessary
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">send_request</span><span class="p">(</span><span class="n">request</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>the user would get a warning like so<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<pre tabindex="0"><code>warning: unnecessary `clone` call will be converted to a move
 --&gt; src/main.rs:8:46
  |
8 |     let request = Request::ProcessIdentifier(id.clone());
  |                                              ^^^^^^^^^^ unnecessary call to `clone`
  |
  = help: the compiler automatically removes calls to `clone` and `alias` when not
    required to satisfy the borrow checker
help: change `id.clone()` to `id` for greater clarity
  |
8 -     let request = Request::ProcessIdentifier(id.clone());
8 +     let request = Request::ProcessIdentifier(id);
  |
</code></pre><p>and the code would be transformed so that it simply does a move:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">send_process_identifier_request</span><span class="p">(</span><span class="n">id</span>: <span class="nb">String</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">request</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Request</span>::<span class="n">ProcessIdentifier</span><span class="p">(</span><span class="n">id</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                                       --
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                                   transformed
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">send_request</span><span class="p">(</span><span class="n">request</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="mental-model-just-call-clone-or-alias">Mental model: just call &ldquo;clone&rdquo; (or &ldquo;alias&rdquo;)</h2>
<p>The goal of this proposal is that, when you get an error about a use of moved value, or moving borrowed content, the fix is always the same: you just call <code>clone</code> (or <code>alias</code>). It doesn&rsquo;t matter whether that error occurs in the regular function body or in a closure or in a future, the compiler will insert the clones/aliases needed to ensure future users of that same place have access to it (and no more than that).</p>
<p>I believe this will be helpful for new users. Early in their Rust journey, new users often sprinkle in calls to <code>clone</code>, as well as sigils like <code>&amp;</code>, more-or-less at random as they try to develop a firm mental model &ndash; this is where the <a href="https://keepcalmandcallclone.website/">&ldquo;keep calm and call clone&rdquo;</a> joke comes from. This approach breaks down around closures and futures today. Under this proposal, it will work, but users will <em>also</em> benefit from warnings indicating unnecessary clones, which I think will help them to understand where clone is really <em>needed</em>.</p>
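<p>As a rough illustration of where the &ldquo;just call clone&rdquo; habit works and where it breaks down today, here is a small sketch in current stable Rust. The helper <code>consume</code> and the function names are hypothetical, invented for this example, and <code>std::thread::spawn</code> stands in for an async runtime&rsquo;s <code>spawn</code>.</p>

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical helper that takes ownership of its argument.
fn consume(s: String) -> usize {
    s.len()
}

/// In a plain function body, "just call clone" already works today.
fn in_function_body(name: String) -> usize {
    // Without `.clone()` here, the second call fails with "use of moved value".
    let first = consume(name.clone());
    let second = consume(name);
    first + second
}

/// Around a `move` closure, today's Rust needs the clone hoisted out by hand.
fn in_move_closure(shared: &Arc<String>) -> usize {
    // Calling `shared.clone()` *inside* the closure would not help today: the
    // closure would capture the borrowed `shared`, which `thread::spawn` rejects.
    let shared_for_thread = Arc::clone(shared);
    let handle = thread::spawn(move || shared_for_thread.len());
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("{}", in_function_body(String::from("abc")));
    println!("{}", in_move_closure(&Arc::new(String::from("abcd"))));
}
```

<p>The temporary <code>shared_for_thread</code> is the kind of boilerplate that the proposal would make unnecessary, since the clone could be written at the use site inside the closure instead.</p>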
<h2 id="experienced-users-can-trust-the-compiler-to-get-it-right">Experienced users can trust the compiler to get it right</h2>
<p>But the real question is how this works for <em>experienced users</em>. I&rsquo;ve been thinking about this a lot! I think this approach fits pretty squarely in the classic Bjarne Stroustrup definition of a zero-cost abstraction:</p>
<blockquote>
<p>&ldquo;What you don&rsquo;t use, you don&rsquo;t pay for. And further: What you do use, you couldn&rsquo;t hand code any better.&rdquo;</p>
</blockquote>
<p>The first half is clearly satisfied. If you don&rsquo;t call <code>clone</code> or <code>alias</code>, this proposal has no impact on your life.</p>
<p>The key point is the second half: earlier versions of this proposal were more simplistic, and would sometimes result in redundant or unnecessary clones and aliases. Upon reflection, I decided that this was a non-starter. The only way this proposal works is if experienced users know there is <strong>no performance advantage to using the more explicit form</strong>. This is precisely what we have with, say, iterators, and I think it works out very well. I believe this proposal hits that mark, but I&rsquo;d like to hear if there are things I&rsquo;m overlooking.</p>
<h2 id="the-last-use-transformation-codifies-a-widespread-intuition-that-clone-is-never-necessary">The <em>last-use transformation</em> codifies a widespread intuition, that <code>clone</code> is never <em>necessary</em></h2>
<p>I think most users would expect that changing <code>message.clone()</code> to just <code>message</code> is fine, as long as the code keeps compiling. But in fact nothing <em>requires</em> that to be the case. Under this proposal, APIs that make <code>clone</code> significant in unusual ways would be more annoying to use in the new Rust edition and, I expect, would ultimately wind up getting changed so that &ldquo;significant clones&rdquo; have another name. I think this is a good thing.</p>
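<p>To make the point that nothing <em>requires</em> a last-use clone to be unobservable, here is a minimal sketch in today&rsquo;s Rust. The functions <code>handles_alive</code>, <code>pass_clone</code>, and <code>pass_move</code> are hypothetical names for this example. It uses <code>Rc::strong_count</code> only as a convenient way to observe the difference, and is not meant as an example of the &ldquo;unusual&rdquo; APIs mentioned above.</p>

```rust
use std::rc::Rc;

/// Hypothetical callee that reports how many handles exist while it runs.
fn handles_alive(handle: Rc<String>) -> usize {
    Rc::strong_count(&handle)
}

/// Passes a clone even though `message` is never used again.
fn pass_clone() -> usize {
    let message = Rc::new(String::from("hi"));
    // `message` stays alive until the end of this function,
    // so the callee sees two handles.
    handles_alive(message.clone())
}

/// Same code with the clone removed, as a last-use transformation would do.
fn pass_move() -> usize {
    let message = Rc::new(String::from("hi"));
    // The only handle is moved into the callee.
    handles_alive(message)
}

fn main() {
    // prints "clone: 2, move: 1"
    println!("clone: {}, move: {}", pass_clone(), pass_move());
}
```

<p>Code that inspects the count can tell the two versions apart, which is why turning a last-use <code>clone</code> into a move is a change in language semantics rather than a pure optimization.</p>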
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<p>I think I&rsquo;ve covered the key points. Let me dive into some of the details here with a FAQ.</p>
<h3 id="can-you-summarize-all-of-these-posts-youve-been-writing-its-a-lot-to-digest">Can you summarize all of these posts you&rsquo;ve been writing? It&rsquo;s a lot to digest!</h3>
<p>I get it, I&rsquo;ve been throwing a lot of things out there. Let me begin by recapping the motivation as I see it:</p>
<ul>
<li>I believe our goal should be to focus first on a design that is <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/13/ergonomic-explicit-handles/">&ldquo;low-level enough for a Kernel, usable enough for a GUI&rdquo;</a>.
<ul>
<li>The key part here is the word <em>enough</em>. We need to make sure that low-level details are exposed, but only those that truly matter. And we need to make sure that it&rsquo;s ergonomic to use, but it doesn&rsquo;t have to be as nice as TypeScript (though that would be great).</li>
</ul>
</li>
<li>Rust&rsquo;s current approach to <code>Clone</code> fails both groups of users:
<ul>
<li>calls to <code>clone</code> are not explicit enough for kernels and low-level software: when you see <code>something.clone()</code>, you don&rsquo;t know whether it is creating a new alias or an entirely distinct value, and you don&rsquo;t have any clue what it will cost at runtime. There&rsquo;s a reason much of the community recommends writing <code>Arc::clone(&amp;something)</code> instead.</li>
<li>calls to <code>clone</code>, particularly in closures, are a <strong>major ergonomic pain point</strong>; this has been a clear consensus since we first started talking about this issue.</li>
</ul>
</li>
</ul>
<p>I then proposed a set of three changes to address these issues, authored in individual blog posts:</p>
<ul>
<li>First, we <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/07/the-handle-trait/">introduce the <code>Alias</code> trait (originally called <code>Handle</code>)</a>. The <code>Alias</code> trait introduces a new method <code>alias</code> that is equivalent to <code>clone</code> but indicates that this will be creating a second alias of the same underlying value.</li>
<li>Second, we introduce <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/22/explicit-capture-clauses/">explicit capture clauses</a>, which lighten the syntactic load of capturing a clone or alias, make it possible to declare up-front the full set of values captured by a closure/future, and will support other kinds of handy transformations (e.g., capturing the result of <code>as_ref</code> or <code>to_string</code>).</li>
<li>Finally, we introduce the <strong>just call clone</strong> proposal described in this post. This modifies closure desugaring to recognize clones/aliases and also applies the last-use transformation to replace calls to clone/alias with moves where possible.</li>
</ul>
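<p>For readers who have not seen the earlier posts, the first item can be approximated in today&rsquo;s Rust roughly as follows. This is a hypothetical sketch under my own assumptions, not the definition from the linked post: the supertrait bound, the default method body, and the choice of implementors are all guesses made for illustration.</p>

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical sketch of an `Alias` trait (an assumption for illustration).
trait Alias: Clone {
    /// Behaves like `clone`, but the name signals that the result
    /// refers to the same underlying value as `self`.
    fn alias(&self) -> Self {
        self.clone()
    }
}

// Reference-counted pointers are the motivating implementors.
impl<T: ?Sized> Alias for Rc<T> {}
impl<T: ?Sized> Alias for Arc<T> {}

/// Checks that an alias points at the same allocation as the original.
fn shares_storage() -> bool {
    let original = Arc::new(String::from("config"));
    let second = original.alias();
    Arc::ptr_eq(&original, &second)
}

fn main() {
    assert!(shares_storage());
}
```

<p>A type like <code>String</code> would deliberately not implement such a trait, since cloning it produces a distinct value rather than a second handle to the same one.</p>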
<h3 id="what-would-it-feel-like-if-we-did-all-those-things">What would it feel like if we did all those things?</h3>
<p>Let&rsquo;s look at the impact of each set of changes by walking through the &ldquo;Cloudflare example&rdquo;, which originated in <a href="https://dioxus.notion.site/Dioxus-Labs-High-level-Rust-5fe1f1c9c8334815ad488410d948f05e">this excellent blog post by the Dioxus folks</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">some_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">something</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// task 1
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_some_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">some_value</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something_with</span><span class="p">(</span><span class="n">_some_value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// task 2:  listen for dns connections
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_some_a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_some_b</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">some_b</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_some_c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">some_c</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  	</span><span class="n">do_something_else_with</span><span class="p">(</span><span class="n">_some_a</span><span class="p">,</span><span class="w"> </span><span class="n">_some_b</span><span class="p">,</span><span class="w"> </span><span class="n">_some_c</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>As the original blog post put it:</p>
<blockquote>
<p>Working on this codebase was demoralizing. We could think of no better way to architect things - we needed listeners for basically everything that filtered their updates based on the state of the app. You could say “lol get gud,” but the engineers on this team were the sharpest people I’ve ever worked with. Cloudflare is all-in on Rust. They’re willing to throw money at codebases like this. Nuclear fusion won’t be solved with Rust if this is how sharing state works.</p>
</blockquote>
<p>Applying the <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/07/the-handle-trait/"><code>Alias</code> trait</a> and <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/22/explicit-capture-clauses/">explicit capture clauses</a> makes for a modest improvement. You can now clearly see that the calls to <code>clone</code> are <code>alias</code> calls, and you don&rsquo;t have the awkward <code>_some_value</code> and <code>_some_a</code> variables. However, the code is still pretty verbose:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">some_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">something</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// task 1
</span></span></span><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">some_value</span><span class="p">.</span><span class="n">alias</span><span class="p">())</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something_with</span><span class="p">(</span><span class="n">some_value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// task 2:  listen for dns connections
</span></span></span><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">some_b</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">some_c</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  	</span><span class="n">do_something_else_with</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">some_b</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">some_c</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>Applying the Just Call Clone proposal removes a lot of boilerplate and, I think, captures the <em>intent</em> of the code very well. It also retains quite a bit of explicitness, in that searching for calls to <code>alias</code> reveals all the places that aliases will be created. However, it does introduce a bit of subtlety, since (e.g.) the call to <code>self.some_a.alias()</code> will actually occur when the future is <em>created</em> and not when it is <em>awaited</em>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">some_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">something</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// task 1
</span></span></span><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something_with</span><span class="p">(</span><span class="n">some_value</span><span class="p">.</span><span class="n">alias</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// task 2:  listen for dns connections
</span></span></span><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  	</span><span class="n">do_something_else_with</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">some_b</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">some_c</span><span class="p">.</span><span class="n">alias</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><h3 id="im-worried-that-the-execution-order-of-calls-to-alias-will-be-too-subtle-how-is-thie-explicit-enough-for-low-level-code">I&rsquo;m worried that the execution order of calls to alias will be too subtle. How is this &ldquo;explicit enough for low-level code&rdquo;?</h3>
<p>There is no question that Just Call Clone makes closure/future desugaring more subtle. Looking at task 1:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something_with</span><span class="p">(</span><span class="n">some_value</span><span class="p">.</span><span class="n">alias</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>this gets desugared to a call to <code>alias</code> when the future is <em>created</em> (not when it is <em>awaited</em>). Using the explicit form:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">some_value</span><span class="p">.</span><span class="n">alias</span><span class="p">())</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something_with</span><span class="p">(</span><span class="n">some_value</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>I can definitely imagine people getting confused at first &ndash; &ldquo;but that call to <code>alias</code> looks like it&rsquo;s inside the future (or closure), how come it&rsquo;s occurring earlier?&rdquo;</p>
<p><strong>Yet, the code really seems to preserve what is most important:</strong> when I search the codebase for calls to <code>alias</code>, I will find that an alias is created for this task. And for the vast majority of real-world examples, the distinction of whether an alias is created <em>when the task is spawned</em> versus <em>when it executes</em> doesn&rsquo;t matter. Look at this code: the important thing is that <code>do_something_with</code> is called with an alias of <code>some_value</code>, so <code>some_value</code> will stay alive as long as <code>do_something_with</code> is executing. It doesn&rsquo;t really matter how the &ldquo;plumbing&rdquo; worked.</p>
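<p>For comparison, here is a rough sketch of what the explicit form corresponds to in today&rsquo;s Rust. It is only an approximation: <code>Arc::clone</code> stands in for the proposed <code>alias</code>, a plain thread stands in for a tokio task, and the <code>spawn_len_task</code> name is hypothetical. The point it illustrates is that the alias is made while the closure is being built, before the task body starts running:</p>

```rust
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical helper: spawns a task that reads `some_value`.
/// `Arc::clone` is used here as a stand-in for the proposed `alias`.
fn spawn_len_task(some_value: &Arc<String>) -> JoinHandle<usize> {
    // The alias is created here, while the closure is being set up,
    // not later when the spawned code begins to execute.
    let some_value = Arc::clone(some_value);

    // The closure takes ownership of the alias, so the string stays
    // alive for as long as the task needs it.
    thread::spawn(move || some_value.len())
}
```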
<h3 id="what-about-futures-that-conditionally-alias-a-value">What about futures that <em>conditionally</em> alias a value?</h3>
<p>Yeah, good point, those kinds of examples have more room for confusion. Like look at this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="kc">false</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">do_something_with</span><span class="p">(</span><span class="n">some_value</span><span class="p">.</span><span class="n">alias</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>In this example, there is code that uses <code>some_value</code> with an alias, but only under <code>if false</code>. So what happens? I would assume that indeed the future <em>will</em> capture an alias of <code>some_value</code>, in just the same way that this future will <em>move</em> <code>some_value</code>, even though the relevant code is dead:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">task</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="kc">false</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">do_something_with</span><span class="p">(</span><span class="n">some_value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><h3 id="can-you-give-more-details-about-the-closure-desugaring-you-imagine">Can you give more details about the closure desugaring you imagine?</h3>
<p>Yep! I am thinking of something like this:</p>
<ul>
<li>If there is an <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/22/explicit-capture-clauses/">explicit capture clause</a>, use that.</li>
<li>Else:
<ul>
<li>For non-<code>move</code> closures/futures, no changes, so
<ul>
<li>Categorize usage of each place and pick the &ldquo;weakest option&rdquo; that is available:
<ul>
<li>by ref</li>
<li>by mut ref</li>
<li>moves</li>
</ul>
</li>
</ul>
</li>
<li>For <code>move</code> closures/futures, we would change
<ul>
<li>Categorize usage of each place <code>P</code> and decide whether to capture that place&hellip;
<ul>
<li><em>by clone</em>, if there is at least one call <code>P.clone()</code> or <code>P.alias()</code> and all other usage of <code>P</code> requires only a shared ref (reads)</li>
<li><em>by move</em>, if there are no calls to <code>P.clone()</code> or <code>P.alias()</code> or if there are usages of <code>P</code> that require ownership or a mutable reference</li>
</ul>
</li>
<li>Capture by clone/alias when a place <code>a.b.c</code> is only used via shared references, and at least one of those is a clone or alias.
<ul>
<li>For the purposes of this, accessing a &ldquo;prefix place&rdquo; <code>a</code> or a &ldquo;suffix place&rdquo; <code>a.b.c.d</code> is also considered an access to <code>a.b.c</code>.</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
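<p>The rule for <code>move</code> closures can be modeled as a small decision function. This is a toy sketch, not compiler code: the <code>Usage</code> and <code>Capture</code> enums and the <code>capture_for_move_closure</code> function are hypothetical names, and the sketch assumes the usages of a single place have already been collected (it ignores the prefix/suffix place rule entirely):</p>

```rust
/// How one use of a place `P` inside the closure body accesses it.
/// (Hypothetical model type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Usage {
    /// A read that only needs `&P`.
    SharedRef,
    /// A call to `P.clone()` or `P.alias()`.
    CloneOrAlias,
    /// A use that needs `&mut P`.
    MutRef,
    /// A use that needs ownership of `P`.
    Owned,
}

/// How a `move` closure would capture the place.
#[derive(Debug, PartialEq, Eq)]
enum Capture {
    ByClone,
    ByMove,
}

/// Capture by clone only if there is at least one clone/alias call
/// and every other use needs nothing more than a shared reference.
/// Otherwise fall back to capturing by move.
fn capture_for_move_closure(usages: &[Usage]) -> Capture {
    let has_clone = usages.iter().any(|u| *u == Usage::CloneOrAlias);
    let only_shared = usages
        .iter()
        .all(|u| matches!(u, Usage::SharedRef | Usage::CloneOrAlias));

    if has_clone && only_shared {
        Capture::ByClone
    } else {
        Capture::ByMove
    }
}
```

<p>Under this model, a place that is only read (with no clone or alias call) is still captured by move, which matches what <code>move</code> closures do today.</p>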
<p>Examples that show some edge cases:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">consume</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="p">.</span><span class="n">foo</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="why-not-do-something-similar-for-non-move-closures">Why not do something similar for non-move closures?</h3>
<p>In the relevant cases, non-move closures will already just capture by shared reference. This means that later attempts to use that variable will generally succeed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//  ----- NOT async move
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">.</span><span class="n">alias</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">do_something_else</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">.</span><span class="n">alias</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                ----------- later use succeeds
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">f</span><span class="p">.</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>This future does not need to take ownership of <code>self.some_a</code> to create an alias, so it will just capture a <em>reference</em> to <code>self.some_a</code>. That means that later uses of <code>self.some_a</code> can still compile, no problem. If this had been a move closure, however, that code above would currently not compile.</p>
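<p>The same behavior can be seen in today&rsquo;s Rust with an ordinary (non-async) closure. In this sketch, the <code>State</code> struct and <code>alias_twice</code> method are hypothetical, and <code>Arc::clone</code> again stands in for <code>alias</code>:</p>

```rust
use std::sync::Arc;

/// Hypothetical struct holding a ref-counted field.
struct State {
    some_a: Arc<String>,
}

impl State {
    fn alias_twice(self) -> (Arc<String>, Arc<String>) {
        // Not a `move` closure: cloning only needs `&self.some_a`,
        // so the closure captures that place by shared reference.
        let f = || Arc::clone(&self.some_a);

        // A later use of the same place still compiles, because the
        // closure only holds a shared borrow.
        let second = Arc::clone(&self.some_a);

        (f(), second)
    }
}
```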
<p>There is an edge case where you might get an error, which is when you are <em>moving</em>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">.</span><span class="n">alias</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">do_something_else</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">some_a</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                ----------- move!
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">f</span><span class="p">.</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>In that case, you can make this an <code>async move</code> closure and/or use an explicit capture clause.</p>
<h3 id="can-you-give-more-details-about-the-last-use-transformation-you-imagine">Can you give more details about the last-use transformation you imagine?</h3>
<p>Yep! During codegen, we would identify candidate calls to <code>Clone::clone</code> or <code>Alias::alias</code>. After borrow check has executed, we would examine each of the callsites and check the borrow check information to decide:</p>
<ul>
<li>Will this place be accessed later?</li>
<li>Will some reference potentially referencing this place be accessed later?</li>
</ul>
<p>If the answer to both questions is no, then we will replace the call with a move of the original place.</p>
<p>Here are some examples:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">borrow</span><span class="p">(</span><span class="n">message</span>: <span class="nc">Message</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">String</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">method</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">message</span><span class="p">.</span><span class="n">method</span><span class="p">.</span><span class="n">to_string</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">send_message</span><span class="p">(</span><span class="n">message</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           ---------------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           would be transformed to
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           just `message`
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">method</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">borrow</span><span class="p">(</span><span class="n">message</span>: <span class="nc">Message</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">String</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">send_message</span><span class="p">(</span><span class="n">message</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           ---------------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           cannot be transformed
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           since `message.method` is
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           referenced later
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">message</span><span class="p">.</span><span class="n">method</span><span class="p">.</span><span class="n">to_string</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">borrow</span><span class="p">(</span><span class="n">message</span>: <span class="nc">Message</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">String</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">message</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">send_message</span><span class="p">(</span><span class="n">message</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           ---------------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           cannot be transformed
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           since `r` may reference
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           `message` and is used later.
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">r</span><span class="p">.</span><span class="n">method</span><span class="p">.</span><span class="n">to_string</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="why-are-you-calling-it-the-last-use-transformation-and-not-optimization">Why are you calling it the <em>last-use transformation</em> and not <em>optimization</em>?</h3>
<p>In the past, I&rsquo;ve talked about the last-use <em>transformation</em> as an <em>optimization</em> &ndash; but I&rsquo;m changing terminology here. This is because, typically, an <em>optimization</em> is supposed to be unobservable to users except through measurements of execution time (or through UB), and that is clearly not the case here. The transformation would be a mechanical transformation performed by the compiler in a deterministic fashion.</p>
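<p>To make the &ldquo;observable&rdquo; point concrete, here is a sketch in today&rsquo;s Rust where the transformation is applied by hand. Everything in it is hypothetical: a <code>Message</code> type whose <code>Clone</code> impl counts its calls, plus two functions showing the code as written and as it would look after the transformation. Safe code can tell the two apart just by reading the counter:</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical message type whose `Clone` impl records each call.
struct Message {
    body: String,
    clones: Rc<Cell<u32>>,
}

impl Clone for Message {
    fn clone(&self) -> Self {
        // Side effect that makes every clone visible to the program.
        self.clones.set(self.clones.get() + 1);
        Message {
            body: self.body.clone(),
            clones: Rc::clone(&self.clones),
        }
    }
}

fn send_message(message: Message) -> usize {
    message.body.len()
}

/// What the programmer wrote: a clone at the last use of `message`.
fn as_written(message: Message) -> usize {
    send_message(message.clone())
}

/// What the last-use transformation would produce, applied by hand:
/// the clone is replaced with a move of the original place.
fn as_transformed(message: Message) -> usize {
    send_message(message)
}
```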
<h3 id="would-the-transformation-see-through-references">Would the transformation &ldquo;see through&rdquo; references?</h3>
<p>I think yes, but in a limited way. In other words I would expect</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="nb">Clone</span>::<span class="n">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>and</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">foo</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nb">Clone</span>::<span class="n">clone</span><span class="p">(</span><span class="n">p</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>to be transformed in the same way (replaced with <code>foo</code>), and the same would apply to more levels of intermediate usage. This would kind of &ldquo;fall out&rdquo; from the MIR-based optimization technique I imagine. It doesn&rsquo;t have to be this way, we could be more particular about the syntax that people wrote, but I think that would be surprising.</p>
<p>On the other hand, you could still fool it, e.g., like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">identity</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">identity</span><span class="p">(</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">).</span><span class="n">clone</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><h3 id="would-the-transformation-apply-across-function-boundaries">Would the transformation apply across function boundaries?</h3>
<p>The way I imagine it, no. The transformation would be local to a function body. This means that one could write a <code>force_clone</code> method like so that &ldquo;hides&rdquo; the clone in a way that it will never be transformed away (this is an important capability for edition transformations!):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">pipe</span><span class="o">&lt;</span><span class="n">Msg</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="p">(</span><span class="n">message</span>: <span class="nc">Msg</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Msg</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">log</span><span class="p">(</span><span class="n">message</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w"> </span><span class="c1">// &lt;-- keep this one
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">force_clone</span><span class="p">(</span><span class="o">&amp;</span><span class="n">message</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">force_clone</span><span class="o">&lt;</span><span class="n">Msg</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="p">(</span><span class="n">message</span>: <span class="kp">&amp;</span><span class="nc">Msg</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Msg</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Here, the input is `&amp;Msg`, so the clone is necessary
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// to produce a `Msg`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">message</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="wont-the-last-use-transformation-change-behavior-by-making-destructors-run-earlier">Won&rsquo;t the last-use transformation change behavior by making destructors run earlier?</h3>
<p>Potentially, yes! Consider this example, written using <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/22/explicit-capture-clauses/">explicit capture clause</a> notation and written assuming we add an <code>Alias</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">process_and_stuff</span><span class="p">(</span><span class="n">tx</span>: <span class="nc">mpsc</span>::<span class="n">Sender</span><span class="o">&lt;</span><span class="n">Message</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tokio</span>::<span class="n">spawn</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">tx</span><span class="p">.</span><span class="n">alias</span><span class="p">())</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//     ---------- alias here
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">process</span><span class="p">(</span><span class="n">tx</span><span class="p">).</span><span class="k">await</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something_unrelated</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The precise timing when <code>Sender</code> values are dropped can be important &ndash; when all senders have been dropped, the <code>Receiver</code> will start returning <code>None</code> when you call <code>recv</code>. Before that, it will block waiting for more messages, since those <code>tx</code> handles could still be used.</p>
<p>So, in <code>process_and_stuff</code>, when will the sender aliases be fully dropped? The answer depends on whether we do the last-use transformation or not:</p>
<ul>
<li>Without the transformation, there are two aliases: the original <code>tx</code> and the one being held by the future. So the receiver will only start returning <code>None</code> when <code>do_something_unrelated</code> has finished <em>and</em> the task has completed.</li>
<li>With the transformation, the call to <code>tx.alias()</code> is removed, and so there is only one alias &ndash; <code>tx</code>, which is moved into the future, and dropped once the spawned task completes. This could well be earlier than in the previous code, which had to wait until both <code>process_and_stuff</code> and the new task completed.</li>
</ul>
<p>Most of the time, running destructors earlier is a good thing. That means lower peak memory usage, faster responsiveness. But in extreme cases it could lead to bugs &ndash; a typical example is a <code>Mutex&lt;()&gt;</code> where the guard is being used to protect some external resource.</p>
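<p>The difference in sender lifetime can be reproduced in today&rsquo;s Rust by writing both versions by hand. This sketch makes some substitutions: it uses <code>std::sync::mpsc</code> and threads rather than tokio (so a closed channel shows up as a <code>Disconnected</code> error instead of <code>None</code>), and the <code>with_clone</code> and <code>with_move</code> function names are hypothetical. Each function reports whether the receiver sees the channel as closed once the spawned task has finished:</p>

```rust
use std::sync::mpsc;
use std::thread;

/// Without the transformation: the task gets a clone of `tx`,
/// and the original `tx` stays alive in the caller.
fn with_clone() -> bool {
    let (tx, rx) = mpsc::channel::<u32>();
    let task_tx = tx.clone();
    thread::spawn(move || task_tx.send(1).unwrap())
        .join()
        .unwrap();
    assert_eq!(rx.recv(), Ok(1));

    // The original `tx` is still alive here, so the channel is
    // merely empty, not closed.
    let closed = matches!(rx.try_recv(), Err(mpsc::TryRecvError::Disconnected));
    drop(tx);
    closed
}

/// With the transformation applied by hand: `tx` itself is moved
/// into the task and dropped when the task completes.
fn with_move() -> bool {
    let (tx, rx) = mpsc::channel::<u32>();
    thread::spawn(move || tx.send(1).unwrap())
        .join()
        .unwrap();
    assert_eq!(rx.recv(), Ok(1));

    // No sender remains, so the receiver observes the channel as closed.
    matches!(rx.try_recv(), Err(mpsc::TryRecvError::Disconnected))
}
```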
<h3 id="how-can-we-change-when-code-runs-doesnt-that-break-stability">How can we change when code runs? Doesn&rsquo;t that break stability?</h3>
<p>This is what editions are for! We have in fact done a very similar transformation before, in Rust 2021. RFC 2229 changed destructor timing around closures and it was, by and large, a non-event.</p>
<p>The desire for edition compatibility is in fact one of the reasons I want to make this a <em>last-use transformation</em> and not some kind of <em>optimization</em>. There is no UB in any of these examples, it&rsquo;s just that to understand what Rust code does around clones/aliases is a bit more complex than it used to be, because the compiler will do automatic transformation to those calls. The fact that this transformation is local to a function means we can decide on a call-by-call basis whether it should follow the older edition rules (where it will always occur) or the newer rules (where it may be transformed into a move).</p>
<h3 id="does-that-mean-that-the-last-use-transformation-would-change-with-polonius-or-other-borrow-checker-improvements">Does that mean that the last-use transformation would change with Polonius or other borrow checker improvements?</h3>
<p>In theory, yes, improvements to borrow-checker precision like Polonius could mean that we identify more opportunities to apply the last-use transformation. This is something we can phase in over an edition. It&rsquo;s a bit of a pain, but I think we can live with it &ndash; and I&rsquo;m unconvinced it will be important in practice. For example, when thinking about the improvements I expect under Polonius, I was not able to come up with a realistic example that would be impacted.</p>
<h3 id="isnt-it-weird-to-do-this-after-borrow-check">Isn&rsquo;t it weird to do this after borrow check?</h3>
<p>This last-use transformation is guaranteed not to produce code that would fail the borrow check. However, it can affect the correctness of unsafe code:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;*</span><span class="n">some_place</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="nc">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">some_place</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//         ---------- assuming `some_place` is
</span></span></span><span class="line"><span class="cl"><span class="c1">//         not used later, becomes a move
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           -
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// This now refers to a stack slot
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// whose value is uninitialized.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Note though that, in this case, there would be a lint identifying that the call to <code>some_place.clone()</code> will be transformed to just <code>some_place</code>. We could also detect simple examples like this one and report a stronger deny-by-default lint, as we often do when we see guaranteed UB.</p>
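<p>For contrast, here is a minimal sketch in today&rsquo;s Rust of the same pattern with a later use of <code>some_place</code> added. The helper name <code>len_via_raw_pointer</code> is made up for illustration, and the comment about the proposal is an assumption based on the rule described above (a clone is only rewritten when the place is not used afterwards):</p>

```rust
/// Hypothetical helper, not part of the proposal: reads through a raw
/// pointer to `some_place` after cloning it, then uses `some_place` again.
fn len_via_raw_pointer(some_place: String) -> (usize, String, String) {
    let p: *const String = &some_place;

    // Today this is always a real clone. Assumption: under the proposal it
    // would also stay a clone, because `some_place` is used again below.
    let q: String = some_place.clone();

    // SAFETY: `some_place` has not been moved or dropped yet, so `p` still
    // points at an initialized `String`.
    let len = unsafe { (&*p).len() };

    // The later use: `some_place` is moved out only after the unsafe read.
    (len, q, some_place)
}

fn main() {
    let (len, q, original) = len_via_raw_pointer(String::from("hello"));
    assert_eq!(len, 5);
    assert_eq!(q, original);
}
```

<p>The point of the sketch is only that the pointer stays valid while the place it points to remains initialized; it says nothing about how the lint would be worded.</p>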
<h3 id="shouldnt-we-use-a-keyword-for-this">Shouldn&rsquo;t we use a keyword for this?</h3>
<p>When I originally had this idea, I called it &ldquo;use-use-everywhere&rdquo; and, instead of writing <code>x.clone()</code> or <code>x.alias()</code>, I imagined writing <code>x.use</code>. This made sense to me because a keyword seemed like a stronger signal that this was impacting closure desugaring. However, I&rsquo;ve changed my mind for a few reasons.</p>
<p>First, Santiago Pastorino gave strong pushback that <code>x.use</code> was going to be a stumbling block for new learners. They now have to see this keyword and try to understand what it means &ndash; in contrast, if they see method calls, they will likely not even notice something strange is going on.</p>
<p>The second reason was TC, who argued in the lang-team meeting that all the arguments for why it should be ergonomic to clone a ref-counted value in a closure apply equally well to <code>clone</code>, depending on the needs of your application. I completely agree. As I mentioned earlier, this also addresses the concern I&rsquo;ve heard with the <code>Alias</code> trait, which is that there are things you want to ergonomically clone but which don&rsquo;t correspond to &ldquo;aliases&rdquo;. True.</p>
<p>In general I think that <code>clone</code> (and <code>alias</code>) are fundamental enough to how Rust is used that it&rsquo;s ok to special case them. Perhaps we&rsquo;ll identify other similar methods in the future, or generalize this mechanism, but for now I think we can focus on these two cases.</p>
<h3 id="what-about-deferred-ref-counting">What about &ldquo;deferred ref-counting&rdquo;?</h3>
<p>One point that I&rsquo;ve raised from time-to-time is that I would like a solution that gives the compiler more room to optimize ref-counting to avoid incrementing ref-counts in cases where it is obvious that those ref-counts are not needed. An example might be a function like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">use_data</span><span class="p">(</span><span class="n">rc</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="n">Data</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">datum</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">rc</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{datum:?}</span><span class="s">&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This function requires ownership of an alias to a ref-counted value but it doesn&rsquo;t actually <em>do</em> anything but read from it. A caller like this one&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">use_data</span><span class="p">(</span><span class="n">source</span><span class="p">.</span><span class="n">alias</span><span class="p">())</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;doesn&rsquo;t really <em>need</em> to increment the reference count, since the caller will be holding a reference the entire time. I often write code like this using a <code>&amp;</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">use_data</span><span class="p">(</span><span class="n">rc</span>: <span class="kp">&amp;</span><span class="nc">Rc</span><span class="o">&lt;</span><span class="n">Data</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">datum</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">rc</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{datum:?}</span><span class="s">&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>so that the caller can do <code>use_data(&amp;source)</code> &ndash; this then allows the callee to write <code>rc.alias()</code> in the case that it <em>wants</em> to take ownership.</p>
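<p>The difference between the two signatures can be observed with <code>Rc::strong_count</code>. This is a minimal sketch: <code>Data</code> is a stand-in type alias, the function names are made up, and it uses today&rsquo;s <code>clone</code> where the post writes <code>alias</code>:</p>

```rust
use std::rc::Rc;

// Stand-in for the post's `Data`; any type would do.
type Data = Vec<u32>;

/// Takes ownership of a handle, so the caller must clone to keep its own.
fn count_seen_by_value(rc: Rc<Data>) -> usize {
    Rc::strong_count(&rc)
}

/// Borrows the caller's handle, so no new handle is created.
fn count_seen_by_ref(rc: &Rc<Data>) -> usize {
    Rc::strong_count(rc)
}

/// A callee that does want ownership can still clone from the reference.
fn keep(rc: &Rc<Data>) -> Rc<Data> {
    Rc::clone(rc)
}

fn main() {
    let source: Rc<Data> = Rc::new(vec![1, 2, 3]);

    // By value: the callee sees two handles (the caller's and its own).
    assert_eq!(count_seen_by_value(source.clone()), 2);

    // By reference: the count is never incremented.
    assert_eq!(count_seen_by_ref(&source), 1);

    // Ownership on demand: the increment happens only when it is needed.
    let kept = keep(&source);
    assert_eq!(Rc::strong_count(&source), 2);
    drop(kept);
    assert_eq!(Rc::strong_count(&source), 1);
}
```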
<p>I&rsquo;ve basically decided to punt on addressing this problem. I think folks that are very performance sensitive can use <code>&amp;Arc</code> and the rest of us can sometimes have an extra ref-count increment, but either way, the semantics for users are clear enough and (frankly) good enough.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Surprisingly to me, <code>clippy::pedantic</code> doesn&rsquo;t have a dedicated lint for unnecessary clones. This <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2024&amp;gist=1b170aea4b8dfb879bd5ec2ffb4135b6">particular example</a> does get a lint, but it&rsquo;s a lint about taking an argument by value and then not consuming it. If you rewrite the example to create <code>id</code> locally, <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2024&amp;gist=3a6fcf9639114b5e44f5d68b06feee13">clippy does not complain</a>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">Bikeshedding `Handle` and other follow-up thoughts</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/11/05/bikeshedding-handle/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/11/05/bikeshedding-handle/</id><published>2025-11-05T00:00:00+00:00</published><updated>2025-11-05T08:15:38-05:00</updated><content type="html"><![CDATA[<p>There have been two major sets of responses to my proposal for a <code>Handle</code> trait. The first is that the <code>Handle</code> trait seems useful but doesn&rsquo;t cover all the cases where one would like to be able to ergonomically clone things. The second is that the name doesn&rsquo;t seem to fit with our Rust conventions for trait names, which emphasize short verbs over nouns. The TL;DR of my response is that (1) I agree, this is why I think we should work to make <code>Clone</code> ergonomic as well as <code>Handle</code>; and (2) I agree with that too, which is why I think we should find another name. At the moment I prefer <code>Share</code>, with <code>Alias</code> coming in second.</p>
<h2 id="handle-doesnt-cover-everything">Handle doesn&rsquo;t cover everything</h2>
<p>The first concern with the <code>Handle</code> trait is that, while it gives a clear semantic basis for when to implement the trait, it does not cover all the cases where calling <code>clone</code> is annoying. In other words, if we opt to use <code>Handle</code>, and then we make creating new handles very ergonomic, but calling <code>clone</code> remains painful, there will be a temptation to use the <code>Handle</code> when it is not appropriate.</p>
<p>In one of our lang team design meetings, TC raised the point that, for many applications, even an &ldquo;expensive&rdquo; clone isn&rsquo;t really a big deal. For example, when writing CLI tools and things, I regularly clone strings and vectors of strings and hashmaps and whatever else; I could put them in an Rc or Arc but I know it just doesn&rsquo;t matter.</p>
<p>My solution here is simple: let&rsquo;s make solutions that apply to both <code>Clone</code> and <code>Handle</code>. Given that I think we need a proposal that allows for handles that are <em>both</em> ergonomic <em>and</em> explicit, it&rsquo;s not hard to say that we should extend that solution to include the option for clone.</p>
<p>The <a href="https://smallcultfollowing.com/babysteps/
/blog/2025/10/22/explicit-capture-clauses.html">explicit capture clause</a> post already fits this design. I explicitly chose a design that allowed users to write <code>move(a.b.c.clone())</code> or <code>move(a.b.c.handle())</code>, and hence it works equally well (or equally not well&hellip;) with both traits.</p>
<h2 id="the-name-handle-doesnt-fit-the-rust-conventions">The name <code>Handle</code> doesn&rsquo;t fit the Rust conventions</h2>
<p>A number of people have pointed out <code>Handle</code> doesn&rsquo;t fit the Rust naming conventions for traits like this, which aim for short verbs. You can interpret <code>handle</code> as a verb, but it doesn&rsquo;t mean what we want. Fair enough. I like the name <code>Handle</code> because it gives a <em>noun</em> we can use to talk about, well, <em>handles</em>, but I agree that the trait name doesn&rsquo;t seem right. There was a lot of bikeshedding on possible options but I think I&rsquo;ve come back to preferring Jack Huey&rsquo;s original proposal, <code>Share</code> (with a method <code>share</code>). I think <code>Alias</code> and <code>alias</code> is my second favorite. Both of them are short, relatively common verbs.</p>
<p>I originally felt that <code>Share</code> was a bit too generic and overly associated with sharing across threads &ndash; but then I at least always call <code>&amp;T</code> a <em>shared reference</em><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, and an <code>&amp;T</code> would implement <code>Share</code>, so it all seems to work well. Hat tip to Ariel Ben-Yehuda for pushing me on this particular name.</p>
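<p>To illustrate the claim that <code>&amp;T</code> would implement <code>Share</code>, here is a minimal sketch of what such a trait could look like in today&rsquo;s Rust. The trait definition and both impls are assumptions made for illustration; the actual design is still undecided:</p>

```rust
use std::rc::Rc;

/// Hypothetical sketch of a `Share` trait; not a settled design.
trait Share: Clone {
    /// Creates another handle to the same underlying value.
    fn share(&self) -> Self {
        self.clone()
    }
}

// A shared reference is trivially shareable: copying it yields another `&T`
// pointing at the same value.
impl<T: ?Sized> Share for &T {}

// A ref-counted pointer shares by incrementing its count.
impl<T: ?Sized> Share for Rc<T> {}

fn main() {
    let text = String::from("hello");
    let r: &String = &text;
    let r2: &String = r.share();
    // Both references point at the same `String`.
    assert!(std::ptr::eq(r, r2));

    let rc = Rc::new(5);
    let rc2 = rc.share();
    assert_eq!(Rc::strong_count(&rc), 2);
    assert_eq!(*rc2, 5);
}
```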
<h2 id="coming-up-next">Coming up next</h2>
<p>The flurry of posts in this series has been an attempt to survey all the discussions that have taken place in this area. I&rsquo;m not yet aiming to write a final proposal &ndash; I think what will come out of this is a series of multiple RFCs.</p>
<p>My current feeling is that we should add the <code>Hand^H^H^H^H</code>, uh, <code>Share</code> trait. I also think we should add <a href="https://smallcultfollowing.com/babysteps/
/blog/2025/10/22/explicit-capture-clauses.html">explicit capture clauses</a>. However, while explicit capture clauses are clearly &ldquo;low-level enough for a kernel&rdquo;, I don&rsquo;t really think they are &ldquo;usable enough for a GUI&rdquo;. The next post will explore another idea that I think might bring us closer to that ultimate <a href="https://smallcultfollowing.com/babysteps/
/blog/2025/10/13/ergonomic-explicit-handles.html">ergonomic and explicit</a> goal.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>A lot of people say <em>immutable reference</em> but that is simply inaccurate: an <code>&amp;Mutex</code> is not immutable. I think that the term shared reference is better.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">But then again...maybe alias?</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/11/05/maybe-alias/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/11/05/maybe-alias/</id><published>2025-11-05T00:00:00+00:00</published><updated>2025-11-05T08:57:20-05:00</updated><content type="html"><![CDATA[<p>Hmm, as I re-read the post I literally <em>just</em> posted a few minutes ago, I got to thinking. Maybe the right name is indeed <code>Alias</code>, and not <code>Share</code>. The rationale is simple: alias can serve as both a noun and a verb. It hits that sweet spot of &ldquo;common enough you know what it means, but weird enough that it can be Rust Jargon for something quite specific&rdquo;. In the same way that we talk about &ldquo;passing a clone of <code>foo</code>&rdquo; we can talk about &ldquo;passing an alias to <code>foo</code>&rdquo; or an &ldquo;alias of <code>foo</code>&rdquo;. Food for thought! I&rsquo;m going to try <code>Alias</code> on for size in future posts and see how it feels.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">Explicit capture clauses</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/10/22/explicit-capture-clauses/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/10/22/explicit-capture-clauses/</id><published>2025-10-22T00:00:00+00:00</published><updated>2025-10-22T06:08:27-04:00</updated><content type="html"><![CDATA[<p>In my previous post about Ergonomic Ref Counting, I talked about how, whatever else we do, we need a way to have explicit handle creation that is ergonomic. The next few posts are going to explore a few options for how we might do that.</p>
<p>This post focuses on <strong>explicit capture clauses</strong>, which would permit closures to be annotated with an explicit set of captured places. My take is that explicit capture clauses are a no-brainer, for reasons that I&rsquo;ll cover below, and we should definitely do them; but they may not be enough to be considered <em>ergonomic</em>, so I&rsquo;ll explore more proposals afterwards.</p>
<h2 id="motivation">Motivation</h2>
<p>Rust closures today work quite well but I see a few problems:</p>
<ul>
<li>Teaching and understanding closure desugaring is difficult because it lacks an explicit form. Users have to learn to desugar in their heads to understand what&rsquo;s going on.</li>
<li>Capturing the &ldquo;clone&rdquo; of a value (or possibly other transformations) has no concise syntax.</li>
<li>For long closure bodies, it is hard to determine precisely which values are captured and how; you have to search the closure body for references to external variables, account for shadowing, etc.</li>
<li>It is hard to develop an intuition for when <code>move</code> is required. I find myself adding it when the compiler tells me to, but that&rsquo;s annoying.</li>
</ul>
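<p>For reference, the second bullet looks like this in today&rsquo;s Rust: the clone needs its own <code>let</code> binding before the <code>move</code> closure. This is a minimal sketch; the function name <code>make_greeter</code> and the use of <code>Rc&lt;str&gt;</code> are chosen purely for illustration:</p>

```rust
use std::rc::Rc;

/// Today's idiom: clone into a fresh binding, then `move` that binding
/// into the closure. The clone cannot be written in the closure header.
fn make_greeter(name: &Rc<str>) -> impl Fn() -> String {
    let name = Rc::clone(name);
    move || format!("hello, {name}")
}

fn main() {
    let name: Rc<str> = Rc::from("ferris");
    let greet = make_greeter(&name);

    assert_eq!(greet(), "hello, ferris");
    // The closure owns its own handle, so the caller's handle is untouched
    // and the count reflects both.
    assert_eq!(Rc::strong_count(&name), 2);
}
```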
<h2 id="lets-look-at-a-strawperson-proposal">Let&rsquo;s look at a strawperson proposal</h2>
<p>Some time ago, I wrote a proposal for explicit capture clauses. I actually see a lot of flaws with this proposal, but I&rsquo;m still going to explain it: right now it&rsquo;s the only solid proposal I know of, and it&rsquo;s good enough to explain how an explicit capture clause <em>could be seen</em> as a solution to the &ldquo;explicit <em>and</em> ergonomic&rdquo; goal. I&rsquo;ll then cover some of the things I like about the proposal and what I don&rsquo;t.</p>
<h2 id="begin-with-move">Begin with <code>move</code></h2>
<p>The proposal begins by extending the <code>move</code> keyword with a list of places to capture:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">.</span><span class="n">d</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">y</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>The closure will then take ownership of those two places; references to those places in the closure body will be replaced by accesses to these captured fields. So that example would desugar to something like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">struct</span> <span class="nc">MyClosure</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">a_b_c</span>: <span class="nc">Foo</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">x_y</span>: <span class="nc">Bar</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyClosure</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">fn</span> <span class="nf">call_once</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Baz</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">do_something</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">a_b_c</span><span class="p">.</span><span class="n">d</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">x_y</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//           ----------    --------
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//   The place `a.b.c` is      |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//   rewritten to the field    |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//   `self.a_b_c`              |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//                  Same here but for `x.y`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">MyClosure</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">a_b_c</span>: <span class="nc">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">x_y</span>: <span class="nc">x</span><span class="p">.</span><span class="n">y</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>When using a simple list like this, attempts to reference other places that were not captured result in an error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">.</span><span class="n">d</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">z</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           -------  ---
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           OK       Error: `x.z` not captured
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><h2 id="capturing-with-rewrites">Capturing with rewrites</h2>
<p>It is also possible to capture a custom expression by using an <code>=</code> sign. So for example, you could rewrite the above closure as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="p">.</span><span class="n">y</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">.</span><span class="n">d</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">y</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>and it would desugar to:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">struct</span> <span class="nc">MyClosure</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* as before */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyClosure</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* as before */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">MyClosure</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">a_b_c</span>: <span class="nc">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//     -------------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">x_y</span>: <span class="nc">x</span><span class="p">.</span><span class="n">y</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>When using this form, the expression assigned to <code>a.b.c</code> must have the same type as <code>a.b.c</code> in the surrounding scope. So this would be an error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">,</span><span class="w"> </span><span class="c1">// Error: `i32` is not `Foo`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="p">.</span><span class="n">y</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* ... */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><h2 id="shorthands-and-capturing-by-reference">Shorthands and capturing by reference</h2>
<p>You can understand <code>move(a.b)</code> as sugar for <code>move(a.b = a.b)</code>. We support other convenient shorthands too, such as</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">move</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// == anything that ends in a method call becomes ==&gt;
</span></span></span><span class="line"><span class="cl"><span class="k">move</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span></code></pre></div><p>and two kinda special shorthands:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">move</span><span class="p">(</span><span class="o">&amp;</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">move</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>These are special because the captured value is indeed <code>&amp;a.b</code> or <code>&amp;mut a.b</code> &ndash; but that by itself wouldn&rsquo;t work, because the types wouldn&rsquo;t match: the closure body expects <code>a.b</code> itself, not a reference to it. So we rewrite each access to <code>a.b</code> in the body into a dereference of the <code>a_b</code> field, like <code>*self.a_b</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">move</span><span class="p">(</span><span class="o">&amp;</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">foo</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">)</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// desugars to
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyStruct</span><span class="o">&lt;</span><span class="na">&#39;l</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a_b</span>: <span class="kp">&amp;</span><span class="na">&#39;l</span> <span class="nc">Foo</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyStruct</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call_once</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">foo</span><span class="p">(</span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">a_b</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//  ---------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//  we insert the `*` too
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a_b</span>: <span class="kp">&amp;</span><span class="nc">a</span><span class="p">.</span><span class="n">b</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">move</span><span class="p">(</span><span class="o">&amp;</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">foo</span><span class="p">(</span><span class="o">*</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">)</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>There&rsquo;s a lot of precedent for this sort of transform: it&rsquo;s precisely what we do for the <code>Deref</code> trait and for existing closure captures.</p>
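<p>To make the by-reference desugaring concrete, here is a minimal sketch that compiles in today&rsquo;s Rust. The names <code>Foo</code>, <code>Outer</code>, <code>CapturedRef</code>, <code>foo</code>, and <code>desugared_demo</code> are all hypothetical, and because <code>FnOnce</code> can&rsquo;t be implemented by hand on stable, an ordinary <code>call_once</code> method stands in for the trait impl:</p>

```rust
// Hypothetical types standing in for `a` and `a.b`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Foo(u32);

struct Outer {
    b: Foo,
}

// Stand-in for the compiler-generated closure type: it stores `&a.b`.
struct CapturedRef<'l> {
    a_b: &'l Foo,
}

impl CapturedRef<'_> {
    // A plain method rather than `FnOnce::call_once`, which cannot be
    // implemented manually on stable Rust.
    fn call_once(self) -> u32 {
        // Each use of `a.b` in the closure body becomes `*self.a_b`.
        foo(*self.a_b)
    }
}

// Hypothetical function called from the closure body.
fn foo(value: Foo) -> u32 {
    value.0 + 1
}

fn desugared_demo() -> (u32, Foo) {
    let a = Outer { b: Foo(41) };
    // Creating the "closure" only borrows `a.b`.
    let closure = CapturedRef { a_b: &a.b };
    let result = closure.call_once();
    // `a.b` was borrowed, not moved, so it is still usable here.
    (result, a.b)
}
```

<p>The point of the sketch is just that the field holds a reference while the body keeps reading as if it had the place itself; the inserted <code>*</code> bridges the two.</p>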
<h2 id="fresh-variables">Fresh variables</h2>
<p>We should also allow you to define fresh variables. These can have arbitrary types. The values are evaluated at closure creation time and stored in the closure metadata:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">move</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">load_data</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">take</span><span class="p">(</span><span class="o">&amp;</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="open-ended-captures">Open-ended captures</h2>
<p>All of our examples so far fully enumerated the captured variables. But Rust closures today infer the set of captures (and the style of capture) based on the paths that are used. We should permit that as well. I&rsquo;d permit that with a <code>..</code> sugar, so these two closures are equivalent:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="cm">/* closure */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//       ---- capture anything that is used,
</span></span></span><span class="line"><span class="cl"><span class="c1">//            taking ownership
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="cm">/* closure */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//           ---- capture anything else that is used,
</span></span></span><span class="line"><span class="cl"><span class="c1">//                taking ownership
</span></span></span></code></pre></div><p>Of course you can combine:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">x</span><span class="p">.</span><span class="n">y</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w"> </span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>And you could write <code>ref</code> to get the equivalent of <code>||</code> closures:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="cm">/* closure */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//       -- capture anything that is used,
</span></span></span><span class="line"><span class="cl"><span class="c1">//          using references if possible
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="k">ref</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="cm">/* closure */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//            --- capture anything else that is used,
</span></span></span><span class="line"><span class="cl"><span class="c1">//                using references if possible
</span></span></span></code></pre></div><p>This lets you mix explicitly named captures with inferred by-reference captures:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">c</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">ref</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">combine</span><span class="p">(</span><span class="o">&amp;</span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">c</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">z</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       ---   -   -
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        |    |   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        |    | This will be captured by reference
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        |    | since it is used by reference
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        |    | and is not explicitly named.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        |    |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        |   This will be captured by value
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        |   since it is explicitly named.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// We will capture a clone of this because
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// the user wrote `a.b.clone()`
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="how-does-this-help-with-our-motivation">How does this help with our motivation?</h3>
<p>Let&rsquo;s look at the motivations I named:</p>
<h4 id="teaching-and-understanding-closure-desugaring-is-difficult">Teaching and understanding closure desugaring is difficult</h4>
<p>There&rsquo;s a lot of syntax there, but it also gives you an explicit form that you can use when explaining how closures work. To see what I mean, consider the difference between these two closures.</p>
<p>The first closure uses <code>||</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">3</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">c_attached</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">j</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">replace</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>While the second closure uses <code>move</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">3</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">c_detached</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">j</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">replace</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>These are in fact pretty different, <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2024&amp;gist=fec374e4055a99aa3dda9e66a5c03495">as you can see in this playground</a>. But why? Well, the first closure desugars to capture a reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">3</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">c_attached</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.};</span><span class="w">
</span></span></span></code></pre></div><p>and the second captures by value:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">3</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">c_detached</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.};</span><span class="w">
</span></span></span></code></pre></div><p>Before, to explain that, I had to resort to desugaring to structs.</p>
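<p>If you want to see the difference without opening a playground, here is a small self-contained sketch in today&rsquo;s Rust. The wrapper function <code>attached_vs_detached</code> and the returned tuple are my own framing for illustration; the two closure bodies are the ones shown above:</p>

```rust
fn attached_vs_detached() -> (i32, i32, i32, i32) {
    // `||` borrows `i` mutably, so each call updates the outer `i`.
    let mut i = 3;
    let mut c_attached = || {
        let j = i + 1;
        std::mem::replace(&mut i, j)
    };
    c_attached();
    let attached_second = c_attached();
    // The closure is no longer used, so the borrow has ended.
    let i_after_attached = i;

    // `move ||` copies `i` into the closure, so the calls only ever
    // update the closure's private copy.
    let mut i = 3;
    let mut c_detached = move || {
        let j = i + 1;
        std::mem::replace(&mut i, j)
    };
    c_detached();
    let detached_second = c_detached();
    // The outer `i` was never touched.
    let i_after_detached = i;

    (attached_second, i_after_attached, detached_second, i_after_detached)
}
```

<p>Both closures return the same sequence of values, but only the attached one leaves a visible change behind: the outer <code>i</code> ends at 5 in the first half and stays at 3 in the second.</p>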
<h4 id="capturing-a-clone-is-painful">Capturing a clone is painful</h4>
<p>If you have a closure that wants to capture the clone of something today, you have to introduce a fresh variable. So something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">begin_actor</span><span class="p">(</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">tx</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>becomes</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">self_tx</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">tx</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">begin_actor</span><span class="p">(</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="n">self_tx</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>This is awkward. Under this proposal, it&rsquo;s possible to point-wise replace specific items:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">tx</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w"> </span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">begin_actor</span><span class="p">(</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">tx</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><h4 id="for-long-closure-bodies-it-is-hard-to-determine-precisely-which-values-are-captured-and-how">For long closure bodies, it is hard to determine precisely which values are captured and how</h4>
<p>Quick! What variables does this closure use from the environment?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="p">.</span><span class="n">flat_map</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">|</span><span class="p">(</span><span class="n">severity</span><span class="p">,</span><span class="w"> </span><span class="n">lints</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">parse_tt_as_comma_sep_paths</span><span class="p">(</span><span class="n">lints</span><span class="p">,</span><span class="w"> </span><span class="n">edition</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">into_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">flat_map</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">|</span><span class="n">lints</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Rejoin the idents with `::`, so we have no spaces in between.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">lints</span><span class="p">.</span><span class="n">into_iter</span><span class="p">().</span><span class="n">map</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">|</span><span class="n">lint</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">lint</span><span class="p">.</span><span class="n">segments</span><span class="p">().</span><span class="n">filter_map</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="o">|</span><span class="n">segment</span><span class="o">|</span><span class="w"> </span><span class="n">segment</span><span class="p">.</span><span class="n">name_ref</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="p">).</span><span class="n">join</span><span class="p">(</span><span class="s">&#34;::&#34;</span><span class="p">).</span><span class="n">into</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">severity</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">})</span><span class="w">
</span></span></span></code></pre></div><p>No idea? Me neither. What about this one?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="p">.</span><span class="n">flat_map</span><span class="p">(</span><span class="k">move</span><span class="p">(</span><span class="n">edition</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="p">(</span><span class="n">severity</span><span class="p">,</span><span class="w"> </span><span class="n">lints</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* same as above */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">})</span><span class="w">
</span></span></span></code></pre></div><p>Ah, pretty clear! I find that once a closure moves beyond a couple of lines, it can make a function kind of hard to read, because it&rsquo;s hard to tell what variables it may be accessing. I&rsquo;ve had functions where it&rsquo;s important to correctness for one reason or another that a particular closure only accesses a subset of the values around it, but I have no way to indicate that right now. Sometimes I make separate functions, but it&rsquo;d be nicer if I could annotate the closure&rsquo;s captures explicitly.</p>
<h4 id="it-is-hard-to-develop-an-intuition-for-when-move-is-required">It is hard to develop an intuition for when <code>move</code> is required</h4>
<p>Hmm, actually, I don&rsquo;t think this notation helps with that at all! More about this below.</p>
<p>Let me cover some of the questions you may have about this design.</p>
<h3 id="why-allow-the-capture-clause-to-specify-an-entire-place-like-abc">Why allow the &ldquo;capture clause&rdquo; to specify an entire place, like <code>a.b.c</code>?</h3>
<p>Today you can write closures that capture places, like <code>self.context</code> below:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">send_data</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">context</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">other_field</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>My goal was to be able to take such a closure and to add annotations that change how particular places are captured, without having to do deep rewrites in the body:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">context</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w"> </span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//            --------------------------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//            the only change
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">send_data</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">context</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">other_field</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>This definitely adds some complexity, because it means we have to be able to &ldquo;remap&rdquo; a place like <code>a.b.c</code> that has multiple parts. But it makes the explicit capture syntax far more powerful and convenient.</p>
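<p>For comparison, here is a sketch of what the same change takes in today&rsquo;s Rust, which has no capture clause. The <code>Worker</code> type, its field types, and the <code>send_data</code> signature are assumptions made up for illustration; what matters is that the body has to be rewritten to mention the new bindings instead of <code>self.context</code>.</p>

```rust
use std::sync::Arc;

// Hypothetical type standing in for whatever `self` is in the post's example.
struct Worker {
    context: Arc<String>,
    other_field: u32,
}

// Hypothetical signature; the post only shows the call site.
fn send_data(context: Arc<String>, other_field: u32) -> String {
    format!("{context}:{other_field}")
}

impl Worker {
    fn make_closure(&self) -> impl FnOnce() -> String {
        // Today: clone into fresh bindings first...
        let context = self.context.clone();
        let other_field = self.other_field;
        // ...and then refer to those bindings (not `self.context`) in the body.
        move || send_data(context, other_field)
    }
}

fn main() {
    let worker = Worker {
        context: Arc::new("ctx".to_string()),
        other_field: 7,
    };
    let closure = worker.make_closure();
    // The closure owns a second handle to the same `Arc`.
    assert_eq!(Arc::strong_count(&worker.context), 2);
    assert_eq!(closure(), "ctx:7");
}
```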
<h3 id="why-do-you-keep-the-type-the-same-for-places-like-abc">Why do you keep the type the same for places like <code>a.b.c</code>?</h3>
<p>I want to ensure that the type of <code>a.b.c</code> is the same wherever it is type-checked. That simplifies the compiler somewhat and generally makes it easier to move code into and out of a closure.</p>
<h3 id="why-the-move-keyword">Why the move keyword?</h3>
<p>Because it&rsquo;s there? To be honest, I don&rsquo;t like the choice of <code>move</code> because it&rsquo;s so <em>operational</em>. I think if I could go back, I would try to refashion our closures around two concepts:</p>
<ul>
<li><em>Attached</em> closures (what we now call <code>||</code>) would <em>always</em> be tied to the enclosing stack frame. They&rsquo;d always have a lifetime even if they don&rsquo;t capture anything.</li>
<li><em>Detached</em> closures (what we now call <code>move ||</code>) would capture by-value, like <code>move</code> today.</li>
</ul>
<p>I think this would help to build up the intuition of &ldquo;use <code>detach ||</code> if you are going to return the closure from the current stack frame and use <code>||</code> otherwise&rdquo;.</p>
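<p>The same intuition can be sketched in today&rsquo;s syntax, with <code>move ||</code> playing the role of the detached closure. The function names here are made up for illustration.</p>

```rust
// "Detached": the closure is returned, so it outlives the stack frame
// that created it and must own its captures.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    // Without `move`, the closure would borrow `n` from this frame and
    // the compiler would reject the return (error E0373).
    move |x| x + n
}

// "Attached": the closure never leaves this stack frame, so borrowing is fine.
fn sum_with_offset(values: &[i32], offset: i32) -> i32 {
    // No `move` needed: `add` borrows `offset` and is used only locally.
    let add = |x: &i32| x + offset;
    values.iter().map(add).sum()
}

fn main() {
    let add_five = make_adder(5);
    assert_eq!(add_five(1), 6);
    assert_eq!(sum_with_offset(&[1, 2, 3], 10), 36);
}
```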
<h3 id="what-would-a-max-min-explicit-capture-proposal-look-like">What would a max-min explicit capture proposal look like?</h3>
<p>A maximally minimal explicit capture clause proposal would probably <em>just</em> let you name specific variables and not &ldquo;subplaces&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">move</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a_b_c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">a</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">c</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x_y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">x</span><span class="p">.</span><span class="n">y</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">x_y</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">a_b_c</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I think you can see, though, that this makes the explicit form a lot less pleasant to use, and hence it isn&rsquo;t really going to do anything to support ergonomic RC.</p>
<h2 id="conclusion-explicit-closure-clauses-make-things-better-but-not-great">Conclusion: Explicit closure clauses make things better, but not great</h2>
<p>I think doing explicit capture clauses is a good idea &ndash; I generally think we should have explicit syntax for everything in Rust, for teaching and explanatory purposes if nothing else; I didn&rsquo;t always think this way, but it&rsquo;s something I&rsquo;ve come to appreciate over time.</p>
<p>I&rsquo;m not sold on this specific proposal &ndash; but I think working through it is useful, because it (a) gives you an idea of what the benefits would be and (b) gives you an idea of how much hidden complexity there is.</p>
<p>I think the proposal shows that adding explicit capture clauses goes <em>some</em> way towards making things explicit <em>and</em> ergonomic. Writing <code>move(a.b.c.clone())</code> is definitely better than having to create a new binding.</p>
<p>But for me, it&rsquo;s not really nice <em>enough</em>. It&rsquo;s still quite a mental distraction to have to find the start of the closure, insert the <code>a.b.c.clone()</code> call, and it makes the closure header very long and unwieldy. Particularly for short closures the overhead is very high.</p>
<p>This is why I&rsquo;d like to look into other options. Nonetheless, it&rsquo;s useful to have discussed a proposal for an explicit form: if nothing else, it&rsquo;ll be useful to explain the precise semantics of other proposals later on.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">Move, Destruct, Forget, and Rust</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/10/21/move-destruct-leak/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/10/21/move-destruct-leak/</id><published>2025-10-21T00:00:00+00:00</published><updated>2025-10-21T21:45:02-04:00</updated><content type="html"><![CDATA[<p>This post presents a proposal to extend Rust to support a number of different kinds of destructors. This means we could have async drop, but also prevent &ldquo;forgetting&rdquo; (leaking) values, enabling async scoped tasks that run in parallel à la rayon/libstd. We&rsquo;d also be able to have types whose &ldquo;destructors&rdquo; require arguments. This proposal &ndash; an evolution of <a href="https://smallcultfollowing.com/babysteps/
/blog/2023/03/16/must-move-types.html">&ldquo;must move&rdquo;</a> that I&rsquo;ll call &ldquo;controlled destruction&rdquo; &ndash; is, I think, needed for Rust to live up to its goal of giving safe versions of critical patterns in systems programming. As such, it is needed to complete the &ldquo;async dream&rdquo;, in which async Rust and sync Rust work roughly the same.</p>
<p>Nothing this good comes for free. The big catch of the proposal is that it introduces more &ldquo;core splits&rdquo; into Rust&rsquo;s types. I believe these splits are well motivated and reasonable &ndash; they reflect <em>inherent complexity</em>, in other words, but they are something we&rsquo;ll want to think carefully about nonetheless.</p>
<h2 id="summary">Summary</h2>
<p>The TL;DR of the proposal is that we should:</p>
<ul>
<li>Introduce a new &ldquo;default trait bound&rdquo; <code>Forget</code> and an associated trait hierarchy:
<ul>
<li><code>trait Forget: Destruct</code>, representing values that can be forgotten</li>
<li><code>trait Destruct: Move</code>, representing values with a destructor</li>
<li><code>trait Move: Pointee</code>, representing values that can be moved</li>
<li><code>trait Pointee</code>, the base trait that represents <em>any value</em></li>
</ul>
</li>
<li>Use the &ldquo;opt-in to weaker defaults&rdquo; scheme proposed for sizedness by <a href="https://github.com/rust-lang/rfcs/pull/3729">RFC #3729 (Hierarchy of Sized Traits)</a>
<ul>
<li>So <code>fn foo&lt;T&gt;(t: T)</code> defaults to &ldquo;a <code>T</code> that can be forgotten/destructed/moved&rdquo;</li>
<li>And <code>fn foo&lt;T: Destruct&gt;(t: T)</code> means &ldquo;a <code>T</code> that can be destructed, but not necessarily forgotten&rdquo;</li>
<li>And <code>fn foo&lt;T: Move&gt;(t: T)</code> means &ldquo;a <code>T</code> that can be moved, but not necessarily destructed or forgotten&rdquo;</li>
<li>&hellip;and so forth.</li>
</ul>
</li>
<li>Integrate and enforce the new traits:
<ul>
<li>The bound on <code>std::mem::forget</code> will already require <code>Forget</code>, so that&rsquo;s good.</li>
<li>Borrow check can enforce that any dropped value must implement <code>Destruct</code>; in fact, we already do this to enforce <code>const Destruct</code> bounds in <code>const fn</code>.</li>
<li>Borrow check can be extended to require a <code>Move</code> bound on any moved value.</li>
</ul>
</li>
<li>Adjust the trait bound on closures (luckily this works out fairly nicely)</li>
</ul>
<h2 id="motivation">Motivation</h2>
<p>In a <a href="https://nikomatsakis.github.io/rust-latam-2019/#1">talk I gave some years back at Rust LATAM in Uruguay</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, I <a href="https://nikomatsakis.github.io/rust-latam-2019/#81">said this</a>:</p>
<ul>
<li>It&rsquo;s easy to <strong>expose</strong> a high-performance API.</li>
<li>But it&rsquo;s hard to <strong>help users control it</strong> &ndash; and this is what Rust&rsquo;s type system does.</li>
</ul>
<img src="https://smallcultfollowing.com/babysteps/
/assets/2025-movedestructleak/firespell.gif" alt="Person casting a firespell and burning themselves"/>
<p>Rust currently does a pretty good job with preventing parts of your program from interfering with one another, but we don&rsquo;t do as good a job when it comes to guaranteeing that cleanup happens<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. We have destructors, of course, but they have two critical limitations:</p>
<ul>
<li>All destructors must meet the same signature, <code>fn drop(&amp;mut self)</code>, which isn&rsquo;t always adequate.</li>
<li>There is no way to guarantee that a destructor will run once you give up ownership of a value.</li>
</ul>
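<p>To make the first limitation concrete, here is a sketch of the usual workaround in today&rsquo;s Rust for cleanup that needs an argument. The <code>Job</code> type and its <code>finish</code> method are hypothetical names for illustration. Nothing forces a caller to invoke <code>finish</code>, so the best the destructor can do is notice at run time that it was skipped.</p>

```rust
// Hypothetical type whose cleanup needs an argument (an exit code).
struct Job {
    name: String,
    finished: bool,
}

impl Job {
    fn new(name: &str) -> Self {
        Job { name: name.to_string(), finished: false }
    }

    // The "real" cleanup: it consumes the job and takes an argument,
    // which `Drop::drop(&mut self)` has no way to receive.
    fn finish(mut self, exit_code: i32) -> String {
        self.finished = true;
        format!("{} exited with {}", self.name, exit_code)
    } // `self` is dropped here; the destructor sees `finished == true`.
}

impl Drop for Job {
    fn drop(&mut self) {
        // Best effort today: detect at run time that `finish` was never called.
        if !self.finished {
            eprintln!("warning: job `{}` dropped without an exit code", self.name);
        }
    }
}

fn main() {
    assert_eq!(Job::new("build").finish(0), "build exited with 0");

    // Nothing stops a caller from skipping `finish`; this only warns on stderr.
    let _unfinished = Job::new("test");
}
```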
<h3 id="making-it-concrete">Making it concrete.</h3>
<p>That motivation was fairly abstract, so let me give some concrete examples of things that tie back to this limitation:</p>
<ul>
<li>The ability to have <code>async</code> or <code>const</code> drop, both of which require a distinct drop signature.</li>
<li>The ability to have a &ldquo;drop&rdquo; operation that takes arguments, such as a message that must be sent, or a result code that must be provided before the program terminates.</li>
<li>The ability to have async scopes that can access the stack, which requires a way to guarantee that a parallel thread will be joined even in an async context.</li>
<li>The ability to integrate at maximum efficiency with WebAssembly async tasks, which require guaranteed cleanup.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
</ul>
<p>The goal of this post is to outline an approach that could solve all of the above problems and which is backwards compatible with Rust today.</p>
<h3 id="the-capabilities-of-value-disposal">The &ldquo;capabilities&rdquo; of value disposal</h3>
<p>The core problem is that Rust today assumes that every <code>Sized</code> value can be moved, dropped, and forgotten:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Without knowing anything about `T` apart
</span></span></span><span class="line"><span class="cl"><span class="c1">// from the fact that it&#39;s `Sized`, we can...
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">demonstration</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">c</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...drop `a`, running its destructor immediately.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="n">a</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...forget `b`, skipping its destructor
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">forget</span><span class="p">(</span><span class="n">b</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...move `c` into `x`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// ...and then have `x` get dropped automatically,
</span></span></span><span class="line"><span class="cl"><span class="c1">// as we exit the block.
</span></span></span></code></pre></div><h3 id="destructors-are-like-opt-out-methods">Destructors are like &ldquo;opt-out methods&rdquo;</h3>
<p>The way I see it, most methods are &ldquo;opt-in&rdquo; &ndash; they don&rsquo;t execute unless you call them. But destructors are different. They are effectively a method that runs by default &ndash; unless you opt out, e.g., by calling <code>forget</code>. But the ability to opt out means that they don&rsquo;t fundamentally add any power over regular methods, they just make for a more ergonomic API.</p>
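<p>A small sketch of the opt-out in today&rsquo;s Rust: the <code>Guard</code> type and the two helper functions are made up for illustration, while <code>std::mem::forget</code> is the real standard library function.</p>

```rust
use std::cell::Cell;

// A guard that records whether its destructor ran.
struct Guard<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// By default the destructor runs when the guard goes out of scope.
fn destructor_runs_by_default() -> bool {
    let flag = Cell::new(false);
    {
        let _guard = Guard { dropped: &flag };
    } // `_guard` is dropped here.
    flag.get()
}

// Safe code can opt out: `forget` takes ownership and skips the destructor.
fn destructor_runs_when_forgotten() -> bool {
    let flag = Cell::new(false);
    std::mem::forget(Guard { dropped: &flag });
    flag.get()
}

fn main() {
    assert!(destructor_runs_by_default());
    assert!(!destructor_runs_when_forgotten());
}
```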
<p>The implication is that the only way in Rust today to <em>guarantee</em> that a destructor will run is to retain ownership of the value. This can be important to unsafe code &ndash; APIs that permit scoped threads, for example, need to <em>guarantee</em> that those parallel threads will be joined before the function returns. The only way they have to do that is to use a closure which gives <code>&amp;</code>-borrowed access to a <code>scope</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">scope</span><span class="p">(</span><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="o">..</span><span class="p">.)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//     -  --- ...which ensures that this
</span></span></span><span class="line"><span class="cl"><span class="c1">//     |      fn body cannot &#34;forget&#34; it.
</span></span></span><span class="line"><span class="cl"><span class="c1">//     |  
</span></span></span><span class="line"><span class="cl"><span class="c1">// This value has type `&amp;Scope`... 
</span></span></span></code></pre></div><p>Because the API never gives up ownership of the scope, it can ensure that it is never &ldquo;forgotten&rdquo; and thus that its destructor runs.</p>
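<p>For reference, this is roughly what the pattern looks like with the standard library&rsquo;s <code>std::thread::scope</code>. The <code>parallel_sum</code> function is a made-up example; the point is that the closure only ever sees a borrowed scope, so every spawned thread is joined before <code>scope</code> returns.</p>

```rust
use std::thread;

// Sums the two halves of `data` on separate threads that borrow from the
// caller's stack. This is sound because `thread::scope` joins every spawned
// thread before it returns.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // `s` is a borrowed scope; this closure never owns it, so it has
        // nothing to pass to `std::mem::forget`.
        let l = s.spawn(|| left.iter().sum::<u64>());
        let r = s.spawn(|| right.iter().sum::<u64>());
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    assert_eq!(parallel_sum(&[1, 2, 3, 4, 5]), 15);
}
```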
<p>The scoped thread approach works for sync code, but it doesn&rsquo;t work for async code. The problem is that async functions return a future, which is a value. Users can therefore decide to &ldquo;forget&rdquo; this value, just like any other value, and thus the destructor may never run.</p>
<h3 id="guaranteed-cleanup-is-common-in-systems-programming">Guaranteed cleanup is common in systems programming</h3>
<p>When you start poking around, you find that <em>guaranteed</em> destructors turn up quite a bit in systems programming. Scoped APIs in futures are one example, but DMA (direct memory access) is another. Many embedded devices have a mode where you begin a DMA transfer that causes data to be written into memory asynchronously. But you need to ensure that this DMA is terminated <em>before</em> that memory is freed. If that memory is on your stack, that means you need a destructor that will either cancel or block until the DMA finishes.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></p>
<h2 id="so-what-can-we-do-about-it">So what can we do about it?</h2>
<p>This situation is very analogous to the challenge of revisiting the default <code>Sized</code> bound, and I think the same basic approach that I outlined in an earlier blog post will work.</p>
<p>The core of the idea is simple: have a &ldquo;special&rdquo; set of traits arranged in a hierarchy:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Forget</span>: <span class="nc">Destruct</span><span class="w"> </span><span class="p">{}</span><span class="w"> </span><span class="c1">// Can be &#34;forgotten&#34;
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Destruct</span>: <span class="nc">Move</span><span class="w"> </span><span class="p">{}</span><span class="w">   </span><span class="c1">// Can be &#34;destructed&#34; (dropped)
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Move</span>: <span class="nc">Pointee</span><span class="w"> </span><span class="p">{}</span><span class="w">    </span><span class="c1">// Can be &#34;moved&#34;
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Pointee</span><span class="w"> </span><span class="p">{}</span><span class="w">          </span><span class="c1">// Can be referenced by pointer
</span></span></span></code></pre></div><p>By default, generic parameters get a <code>Forget</code> bound, so <code>fn foo&lt;T&gt;()</code> is equivalent to <code>fn foo&lt;T: Forget&gt;()</code>. But if the parameter <em>opts in</em> to a weaker bound, then the default is suppressed, so <code>fn bar&lt;T: Destruct&gt;()</code> means that <code>T</code> is assumed to be &ldquo;destructible&rdquo; but <em>not</em> forgettable. And <code>fn baz&lt;T: Move&gt;()</code> indicates that <code>T</code> can <em>only</em> be moved.</p>
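<p>The closest thing in today&rsquo;s Rust is the default <code>Sized</code> bound, which is relaxed with <code>?Sized</code> rather than by naming a weaker trait. The sketch below only illustrates the general idea that a default bound is in force until the signature asks for less; the function names are made up, and the proposed <code>Forget</code>/<code>Destruct</code>/<code>Move</code> traits do not exist yet.</p>

```rust
// `T` implicitly gets the default `Sized` bound here.
fn size_of_default<T>(_value: &T) -> usize {
    std::mem::size_of::<T>()
}

// `?Sized` suppresses the default, so `T` may also be `str` or `[u8]`.
// In exchange, the body can no longer assume a statically known size.
fn size_of_relaxed<T: ?Sized>(value: &T) -> usize {
    std::mem::size_of_val(value)
}

fn main() {
    assert_eq!(size_of_default(&0u32), 4);

    let slice: &[u8] = &[1, 2, 3];
    assert_eq!(size_of_relaxed(slice), 3);
    assert_eq!(size_of_relaxed("hello"), 5);

    // size_of_default::<[u8]>(slice); // would not compile: `[u8]` is not `Sized`
}
```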
<h2 id="impact-of-these-bounds">Impact of these bounds</h2>
<p>Let me explain briefly how these bounds would work.</p>
<h3 id="the-default-can-forget-drop-move-etc">The default can forget, drop, move etc</h3>
<p>Given a default type <code>T</code>, or one that writes <code>Forget</code> explicitly, the function can do anything that is possible today:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">just_forget</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Forget</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">c</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         --------- this bound is the default
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="n">a</span><span class="p">);</span><span class="w">   </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">forget</span><span class="p">(</span><span class="n">b</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">;</span><span class="w">           </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="the-forget-function-requires-t-forget">The forget function requires <code>T: Forget</code></h3>
<p>The <code>std::mem::forget</code> function would require <code>T: Forget</code> as well:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">forget</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Forget</span><span class="o">&gt;</span><span class="p">(</span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* magic intrinsic */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This means that if you have only <code>Destruct</code>, the function can only drop or move, it can&rsquo;t &ldquo;forget&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">just_destruct</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Destruct</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">c</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           -----------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// This function only requests &#34;Destruct&#34; capability.
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="n">a</span><span class="p">);</span><span class="w">   </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">forget</span><span class="p">(</span><span class="n">b</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR: `T: Forget` required
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">;</span><span class="w">           </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="the-borrow-checker-would-require-dropped-values-implement-destruct">The borrow checker would require &ldquo;dropped&rdquo; values implement <code>Destruct</code></h3>
<p>We would modify the <code>drop</code> function to require only <code>T: Destruct</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">drop</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Destruct</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>We would also extend the borrow checker so that when it sees a value being dropped (i.e., because it went out of scope), it would require the <code>Destruct</code> bound.</p>
<p>That means that if you have a value whose type is only <code>Move</code>, you cannot &ldquo;drop&rdquo; it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">just_move</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Move</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">c</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           -----------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// This function only requests &#34;Move&#34; capability.
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="n">a</span><span class="p">);</span><span class="w">   </span><span class="c1">// ERROR: `T: Destruct` required
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">forget</span><span class="p">(</span><span class="n">b</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR: `T: Forget` required
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">;</span><span class="w">           </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">                        </span><span class="c1">// ERROR: `x` is being dropped, but `T: Destruct` is not known to hold
</span></span></span></code></pre></div><p>This means that if you have only a <code>Move</code> bound, you <em>must</em> move anything you own if you want to return from the function. For example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">return_ok</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Move</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If you have a function that does not move, you&rsquo;ll get an error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">return_err</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Move</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// ERROR: `a` does not implement `Destruct`
</span></span></span></code></pre></div><p>It&rsquo;s worth pointing out that this will be annoying as all get out in the face of panics:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">return_err</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Move</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ERROR: If a panic occurs, `a` would be dropped, but `T` does not implement `Destruct`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">forbid_env_var</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">forbid_env_var</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">std</span>::<span class="n">env</span>::<span class="n">var</span><span class="p">(</span><span class="s">&#34;BAD&#34;</span><span class="p">).</span><span class="n">is_ok</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">panic!</span><span class="p">(</span><span class="s">&#34;Uh oh: BAD cannot be set&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I&rsquo;m ok with this, but it is going to put pressure on better ways to rule out panics statically.</p>
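<p>To see why a panic counts as a drop point, here is a minimal sketch in today&rsquo;s Rust showing that locals are dropped while the stack unwinds. The names <code>Noisy</code> and <code>dropped_during_unwind</code> are made up for illustration, and the sketch assumes the default <code>panic = "unwind"</code> strategy.</p>

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical guard type: records in `flag` that its destructor ran.
struct Noisy<'a> {
    flag: &'a AtomicBool,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.flag.store(true, Ordering::SeqCst);
    }
}

/// Returns `true` if the guard was dropped while the panic unwound.
/// Assumes the default `panic = "unwind"` strategy.
fn dropped_during_unwind() -> bool {
    let flag = AtomicBool::new(false);
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        let _guard = Noisy { flag: &flag };
        // `_guard` is dropped as the stack unwinds out of this closure.
        panic!("simulated failure");
    }));
    result.is_err() && flag.load(Ordering::SeqCst)
}
```

<p>Calling <code>dropped_during_unwind()</code> reports that the destructor ran even though the closure never reached its end, which is the implicit drop that a <code>Move</code>-only bound would have to reject.</p>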
<h3 id="const-and-later-async-variants-of-destruct">Const (and later async) variants of <code>Destruct</code></h3>
<p>In fact, we are already doing something much like this destruct check for const functions. Right now if you have a const fn and you try to drop a value, you get an error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">const</span><span class="w"> </span><span class="k">fn</span> <span class="nf">test</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// ERROR!
</span></span></span></code></pre></div><p>Compiling that gives you the error:</p>
<pre tabindex="0"><code>error[E0493]: destructor of `T` cannot be evaluated at compile-time
 --&gt; src/lib.rs:1:18
  |
1 | const fn test&lt;T&gt;(t: T) { }
  |                  ^       - value is dropped here
  |                  |
  |                  the destructor for this type cannot be evaluated in constant functions
</code></pre><p>This check is not presently taking place in borrow check but it could be.</p>
<h3 id="the-borrow-checker-would-require-moved-values-implement-move">The borrow checker would require &ldquo;moved&rdquo; values implement <code>Move</code></h3>
<p>The final part of the check would be requiring that &ldquo;moved&rdquo; values implement <code>Move</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">return_err</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Pointee</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span><span class="w"> </span><span class="c1">// ERROR: `a` does not implement `Move`
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You might think that having types that are <code>!Move</code> would replace the need for pin, but this is not the case. A <em>pinned</em> value is one that can <em>never move again</em>, whereas a value that is not <code>Move</code> can never be moved in the first place &ndash; at least once it is stored into a place.</p>
<p>I&rsquo;m not sure if this part of the proposal makes sense; we could start by just having all types be <code>Move</code>, <code>Destruct</code>, or (the default) <code>Forget</code>.</p>
<h3 id="opting-out-from-forget-etc">Opting out from forget etc</h3>
<p>The other part of the proposal is that you should be able to explicitly &ldquo;opt out&rdquo; from being forgettable, e.g. by writing:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyType</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Destruct</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyType</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>Doing this will limit the generics that can accept your type, of course.</p>
<h3 id="associated-type-bounds">Associated type bounds</h3>
<p>The tough part with these &ldquo;default bound&rdquo; proposals is always associated type bounds. For backwards compatibility, we&rsquo;d have to default to <code>Forget</code> but a lot of associated types that exist in the wild today shouldn&rsquo;t really <em>require</em> <code>Forget</code>. For example a trait like <code>Add</code> should <em>really</em> just require <code>Move</code> for its return type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="n">Rhs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="w"> </span><span class="cm">/* : Move */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I am basically not too worried about this. It&rsquo;s possible that we can weaken these bounds over time or through editions. Or, perhaps, add in some kind of edition-specific &ldquo;alias&rdquo; like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Add2025</span><span class="o">&lt;</span><span class="n">Rhs</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span>: <span class="nc">Move</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>where <code>Add2025</code> is implemented for everything that implements <code>Add</code>.</p>
<p>I am not sure exactly how to manage it, but we&rsquo;ll figure it out &ndash; and in the meantime, most of the types that should not be forgettable are really just &ldquo;guard&rdquo; types that don&rsquo;t have to flow through quite so many places.</p>
<h4 id="associated-type-bounds-in-closures">Associated type bounds in closures</h4>
<p>The one place that I think it is <em>really important</em> that we weaken the associated type bounds is with closures &ndash; and, fortunately, that&rsquo;s a place we can get away with due to the way our &ldquo;closure trait bound&rdquo; syntax works. I feel like I wrote a post on this before, but I can&rsquo;t find it now. The short version is that, today, when you write <code>F: Fn()</code>, that means that the closure must return <code>()</code>. If you write <code>F: Fn() -&gt; T</code>, then this type <code>T</code> must have been declared somewhere else, and so <code>T</code> will (independently from the associated type of the <code>Fn</code> trait) get a default <code>Forget</code> bound. So since the <code>Fn</code> associated type is not independently nameable in stable Rust, we can change its bounds, and code like this would continue to work unchanged:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">F</span><span class="o">&gt;</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F</span>: <span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         - `T: Forget` still holds by default
</span></span></span><span class="line"><span class="cl"><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="how-does-this-relate-to-the-recent-thread-on-internals">How does this relate to the recent thread on internals?</h3>
<p>Recently I was pointed at <a href="https://internals.rust-lang.org/t/pre-rfc-substructural-type-system/23614">this internals thread</a> for a &ldquo;substructural type system&rdquo; which likely has very similar capabilities. To be totally honest, though, I haven&rsquo;t had time to read and digest it yet! I had this blog post like 95% done though so I figured I&rsquo;d post it first and then go try and compare.</p>
<h3 id="what-would-it-mean-for-a-struct-to-opt-out-of-move-eg-by-being-only-pointee">What would it mean for a struct to opt out of <code>Move</code> (e.g., by being only <code>Pointee</code>)?</h3>
<p>So, the system as I described <em>would</em> allow for &lsquo;unmoveable&rsquo; types (i.e., a struct that opts out from everything and only permits <code>Pointee</code>), but such a struct would only really be something you could store in a static memory location. You couldn&rsquo;t put it on the stack because the stack must eventually get popped. And you couldn&rsquo;t move it from place to place because, well, it&rsquo;s immobile.</p>
<p>This seems like something that could be useful &ndash; e.g., to model &ldquo;video RAM&rdquo; or something that lives in a specific location in memory and cannot live anywhere else &ndash; but it&rsquo;s not a widespread need.</p>
<h3 id="how-would-you-handle-destructors-with-arguments">How would you handle destructors with arguments?</h3>
<p>I imagine something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Transaction</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u8</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Opt out from destruct
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Move</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Transaction</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Transaction</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// This is effectively a &#34;destructor&#34;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">complete</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">,</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">connection</span>: <span class="nc">Connection</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">Transaction</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>With this setup, any function that owns a <code>Transaction</code> must eventually invoke <code>transaction.complete()</code>. This is because no values of this type can be dropped, so they must be moved.</p>
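<p>For comparison, here is a sketch of the same consuming-method pattern as it can be written in today&rsquo;s Rust. The <code>Connection</code> type and its <code>written</code> field are hypothetical stand-ins, and unlike the snippet above this version borrows the connection mutably so the result can be inspected afterwards. Nothing here is enforced by the compiler yet; that is the part the proposal would add.</p>

```rust
/// Hypothetical stand-in for a database connection; it only records
/// what was written to it.
#[derive(Default)]
struct Connection {
    written: Vec<u8>,
}

struct Transaction {
    data: Vec<u8>,
}

impl Transaction {
    /// The "destructor with arguments": consumes the transaction and
    /// flushes its data to the connection supplied by the caller.
    fn complete(self, connection: &mut Connection) {
        // Destructuring moves the field out of `self`.
        let Transaction { data } = self;
        connection.written.extend(data);
    }
}

fn run() -> Vec<u8> {
    let mut connection = Connection::default();
    let transaction = Transaction { data: vec![1, 2, 3] };
    // Today's Rust would also accept silently dropping `transaction` here;
    // under the proposal in this post, that would be an error.
    transaction.complete(&mut connection);
    connection.written
}
```

<p>The point of the design is that the argument (<code>connection</code>) is available at exactly the moment the value is consumed, which an ordinary <code>Drop</code> impl cannot express.</p>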
<h3 id="how-does-this-relate-to-async-drop">How does this relate to async drop?</h3>
<p>This setup attacks a key problem that has blocked async drop in my mind: with it, types that are &ldquo;async drop&rdquo; do not have to implement &ldquo;sync drop&rdquo;. This gives the type system the ability to prevent them from being dropped in sync code, and it would mean that they can only be dropped via async drop. But there&rsquo;s still lots of design work to be done there.</p>
<h3 id="why-is-the-trait-destruct-and-not-drop">Why is the trait <code>Destruct</code> and not <code>Drop</code>?</h3>
<p>This comes from the const generics work. I don&rsquo;t love it. But there is a logic to it. Right now, when you drop a struct or other value, that actually does a whole sequence of things, only one of which is running any <code>Drop</code> impl &ndash; it also (for example) drops all the fields in the struct recursively, etc. The idea is that &ldquo;destruct&rdquo; refers to this whole sequence.</p>
<h3 id="how-hard-would-this-to-be-to-prototype">How hard would this be to prototype?</h3>
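<p>The sequence is observable in today&rsquo;s Rust. The sketch below uses made-up types (<code>Outer</code>, <code>Field</code>) and a shared log to record the order: the struct&rsquo;s own <code>Drop</code> impl runs first, then each field is dropped in declaration order.</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log that records the order in which destructors run.
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Field {
    name: &'static str,
    log: Log,
}

impl Drop for Field {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

#[allow(dead_code)]
struct Outer {
    first: Field,
    second: Field,
    log: Log,
}

impl Drop for Outer {
    fn drop(&mut self) {
        self.log.borrow_mut().push("Outer::drop");
    }
}

fn destruct_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let outer = Outer {
        first: Field { name: "first", log: log.clone() },
        second: Field { name: "second", log: log.clone() },
        log: log.clone(),
    };
    // Runs `Outer::drop`, then drops the fields in declaration order.
    drop(outer);
    let order = log.borrow().clone();
    order
}
```

<p>Here <code>destruct_order()</code> yields <code>["Outer::drop", "first", "second"]</code>: one user-written <code>Drop</code> impl, followed by the compiler-generated field drops.</p>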
<p>I&hellip;don&rsquo;t actually think it would be very hard. I&rsquo;ve thought somewhat about it and all of the changes seem pretty straightforward. I would be keen to support a <a href="https://lang-team.rust-lang.org/how_to/experiment.html">lang-team experiment</a> on this.</p>
<h3 id="does-this-mean-we-should-have-had-leak">Does this mean we should have had leak?</h3>
<p>The whole topic of destructors and leaks and so forth dates back to approximately Rust 1.0, when we discovered that, in fact, our abstraction for threads was unsound when combined with cyclic ref-counted boxes. Before that we hadn&rsquo;t fully internalized that destructors are &ldquo;opt-out methods&rdquo;. You can read <a href="https://smallcultfollowing.com/babysteps/blog/2015/04/29/on-reference-counting-and-leaks/">this blog post I wrote at the time</a>. At the time, the primary idea was to have some kind of <code>?Leak</code> bounds and it was tied to the idea of references (so that all <code>'static</code> data was assumed to be &ldquo;leakable&rdquo;, and hence something you could put into an <code>Rc</code>). I&hellip; mostly think we made the right call at the time. I think it&rsquo;s good that most of the ecosystem is interoperable and that <code>Rc</code> doesn&rsquo;t require <code>'static</code> bounds, and certainly I think it&rsquo;s good that we moved to 1.0 with minimal disruption. In any case, though, I rather prefer this design to the ones that were under discussion at the time, in part because it also addresses the need for different kinds of destructors and for destructors with many arguments and so forth, which wasn&rsquo;t something we thought about then.</p>
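<p>The &ldquo;cyclic ref-counted boxes&rdquo; problem is easy to reproduce in safe Rust today. In this sketch (the <code>Node</code> type and <code>destructor_ran</code> function are invented for illustration), a node that holds a strong reference to itself is never destroyed, so its destructor never runs.</p>

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// A node that can point at another node and reports when it is dropped.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    dropped: Rc<Cell<bool>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Builds a node, optionally ties it into a cycle with itself, and then
/// lets the only local handle go out of scope. Returns whether the
/// destructor ran.
fn destructor_ran(make_cycle: bool) -> bool {
    let dropped = Rc::new(Cell::new(false));
    {
        let node = Rc::new(Node {
            next: RefCell::new(None),
            dropped: dropped.clone(),
        });
        if make_cycle {
            // The node now holds a strong reference to itself.
            *node.next.borrow_mut() = Some(node.clone());
        }
    } // `node` goes out of scope; with a cycle the count only falls to 1
    dropped.get()
}
```

<p>Without the cycle the destructor runs at the end of the inner block; with it, the value is leaked without any <code>unsafe</code> code or call to <code>mem::forget</code>.</p>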
<h3 id="isnt-it-confusing-to-have-these-magic-traits-that-opt-out-from-default-bounds">Isn&rsquo;t it confusing to have these &ldquo;magic&rdquo; traits that &ldquo;opt out&rdquo; from default bounds?</h3>
<p>I think that specifying the <em>bounds you want</em> is inherently better than today&rsquo;s <code>?</code> design, both because it&rsquo;s easier to understand and because it allows us to backwards compatibly add traits in between in ways that are not possible with the <code>?</code> design.</p>
<p>However, I do see that having <code>T: Move</code> mean that <code>T: Destruct</code> does not hold is subtle. I wonder if we should adopt some kind of sigil or convention on these traits, like <code>T: @Move</code> or something. I don&rsquo;t know! Something to consider.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>That was a great conference. Also, interestingly, this is one of my favorite of all my talks, but for some reason, I rarely reuse this material. I should change that.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Academics distinguish &ldquo;safety&rdquo; from &ldquo;liveness properties&rdquo;, where safety means &ldquo;bad things don&rsquo;t happen&rdquo; and &ldquo;liveness&rdquo; means &ldquo;good things eventually happen&rdquo;. Another way of saying this is that Rust&rsquo;s type system helps with a lot of safety properties but struggles with liveness properties.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Uh, citation needed. I know this is true but I can&rsquo;t find the relevant WebAssembly issue where it is discussed. Help, internet!&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Really the DMA problem is the same as scoped threads. If you think about it, the embedded device writing to memory is basically the same as a parallel thread writing to memory.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/must-move" term="must-move" label="Must move"/></entry><entry><title type="html">We need (at least) ergonomic, explicit handles</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/10/13/ergonomic-explicit-handles/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/10/13/ergonomic-explicit-handles/</id><published>2025-10-13T00:00:00+00:00</published><updated>2025-10-13T07:39:16-04:00</updated><content type="html"><![CDATA[<p>Continuing my discussion on Ergonomic RC, I want to focus on the core question: <strong>should users have to explicitly invoke handle/clone, or not?</strong> This whole &ldquo;Ergonomic RC&rdquo; work was originally proposed by <a href="https://dioxuslabs.com/">Dioxus</a> and their answer is simple: <strong>definitely not</strong>. For the kind of high-level GUI applications they are building, having to call <code>cx.handle()</code> to clone a ref-counted value is pure noise. For that matter, for a lot of Rust apps, even cloning a string or a vector is no big deal. On the other hand, for a lot of applications, the answer is <strong>definitely yes</strong> &ndash; knowing where handles are created can impact performance, memory usage, and even correctness (don&rsquo;t worry, I&rsquo;ll give examples later in the post). So how do we reconcile this?</p>
<p><strong>This blog argues that we should make it ergonomic to be explicit</strong>. This wasn&rsquo;t always my position, but after an impactful conversation with Josh Triplett, I&rsquo;ve come around. I think it aligns with what I once called the <a href="https://smallcultfollowing.com/babysteps//blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/">soul of Rust</a>: we want to be ergonomic, yes, but we want to be <strong>ergonomic while giving control</strong><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>.</p>
<p>I like Tyler Mandry&rsquo;s <a href="https://tmandry.gitlab.io/blog/posts/the-main-thing/"><em>Clarity of purpose</em></a> construction, <em>&ldquo;Great code brings only the important characteristics of your application to your attention&rdquo;</em>. The key point is that <em>there is great code in which cloning and handles are important characteristics</em>, so we need to make that code possible to express nicely. This is particularly true since Rust is one of the very few languages that really targets that kind of low-level, foundational code.</p>
<p><strong>This does not mean we cannot (later) support automatic clones and handles.</strong> It&rsquo;s inarguable that this would benefit clarity of purpose for a lot of Rust code. But I think we should focus <em>first</em> on the harder case, the case where explicitness is needed, and <strong>get that as nice as we can</strong>; then we can circle back and decide whether to also support something automatic. One of the questions for me, in fact, is whether we can get &ldquo;fully explicit&rdquo; to be <em>nice enough</em> that we don&rsquo;t really need the automatic version. There are benefits from having &ldquo;one Rust&rdquo;, where all code follows roughly the same patterns, where those patterns are perfect some of the time, and don&rsquo;t suck too bad<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> when they&rsquo;re overkill.</p>
<h2 id="rust-should-not-surprise-you-hat-tip-josh-triplett">&ldquo;Rust should not surprise you.&rdquo; (hat tip: Josh Triplett)</h2>
<p>I mentioned this blog post resulted from a long conversation with Josh Triplett<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. The key phrase that stuck with me from that conversation was: <em>Rust should not surprise you</em>. The way I think of it is like this. Every programmer knows what it&rsquo;s like to have a marathon debugging session &ndash; to sit and stare at code for days and think, <em>but&hellip; how is this even POSSIBLE?</em> Those kinds of bug hunts can end in a few different ways. Occasionally you uncover a deeply satisfying, subtle bug in your logic. More often, you find that you wrote <code>if foo</code> and not <code>if !foo</code>. And <em>occasionally</em> you find out that your language was doing something that you didn&rsquo;t expect. That some simple-looking code concealed a subtle, complex interaction. People often call this kind of thing a <em>footgun</em>.</p>
<p>Overall, Rust is <em>remarkably</em> good at avoiding footguns<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. And part of how we&rsquo;ve achieved that is by making sure that things you might need to know are visible &ndash; like, explicit in the source. Every time you see a Rust match, you don&rsquo;t have to ask yourself &ldquo;what cases might be missing here&rdquo; &ndash; the compiler guarantees you they are all there. And when you see a call to a Rust function, you don&rsquo;t have to ask yourself if it is fallible &ndash; you&rsquo;ll see a <code>?</code> if it is.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
<h3 id="creating-a-handle-can-definitely-surprise-you">Creating a handle can definitely &ldquo;surprise&rdquo; you</h3>
<p>So I guess the question is: <em>would you ever have to know about a ref-count increment</em>? The tricky part is that the answer here is application dependent. For some low-level applications, definitely yes: an atomic reference count is a measurable cost. To be honest, I would wager that the set of applications where this is true is vanishingly small. And even in those applications, Rust already improves on the state of the art by giving you the ability to choose between <code>Rc</code> and <code>Arc</code> <em>and then proving that you don&rsquo;t mess it up</em>.</p>
<p>But there are other reasons you might want to track reference counts, and those are less easy to dismiss. One of them is memory leaks. Rust, unlike GC&rsquo;d languages, has <em>deterministic destruction</em>. This is cool, because it means that you can leverage destructors to manage all kinds of resources, as Yehuda wrote about long ago in his classic ode-to-<a href="https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization">RAII</a> entitled <a href="https://blog.skylight.io/rust-means-never-having-to-close-a-socket/">&ldquo;Rust means never having to close a socket&rdquo;</a>. But although the points where handles are created and destroyed are deterministic, the nature of reference-counting can make it much harder to predict when the underlying resource will actually get freed. And if those increments are not visible in your code, it is that much harder to track them down.</p>
<p>Just recently, I was debugging <a href="">Symposium</a>, which is written in Swift. Somehow I had two <code>IPCManager</code> instances when I only expected one, and each of them was responding to every IPC message, wreaking havoc. Poking around I found stray references floating around in some surprising places, which was causing the problem. Would this bug have still occurred if I had to write <code>.handle()</code> explicitly to increment the ref count? Definitely, yes. Would it have been easier to find after the fact? Also yes.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></p>
<p>Josh gave me a similar example from <a href="https://docs.rs/bytes/latest/bytes/">the &ldquo;bytes&rdquo; crate</a>. A <a href="https://docs.rs/bytes/latest/bytes/struct.Bytes.html"><code>Bytes</code></a> type is a <a href="https://smallcultfollowing.com/babysteps/blog/2025/10/07/the-handle-trait/">handle</a> to a slice of some underlying memory buffer. When you clone that handle, it will keep the entire backing buffer around. Sometimes you might prefer to copy your slice out into a separate buffer so that the underlying buffer can be freed. It&rsquo;s not that hard for me to imagine trying to hunt down an errant handle that is keeping some large buffer alive and being very frustrated that I can&rsquo;t see explicitly in the code where those handles are created.</p>
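<p>Here is a rough sketch of that situation using only the standard library. The <code>View</code> type is a hypothetical, simplified stand-in for a <code>Bytes</code>-style handle (it is not how the <code>bytes</code> crate is implemented), and a <code>Weak</code> pointer is used purely to observe whether the backing buffer is still alive.</p>

```rust
use std::ops::Range;
use std::sync::{Arc, Weak};

/// Hypothetical, simplified stand-in for a `Bytes`-style handle: a view
/// into a shared buffer. This is not how the `bytes` crate is implemented.
#[derive(Clone)]
struct View {
    buffer: Arc<Vec<u8>>,
    range: Range<usize>,
}

impl View {
    fn as_slice(&self) -> &[u8] {
        &self.buffer[self.range.clone()]
    }
}

/// Returns (buffer alive while a cloned view exists, buffer alive after
/// copying the bytes out and dropping every view).
fn buffer_lifetimes() -> (bool, bool) {
    let buffer = Arc::new(vec![0u8; 1024]);
    let observer: Weak<Vec<u8>> = Arc::downgrade(&buffer);

    let view = View { buffer, range: 0..4 };
    let handle = view.clone(); // ref-count increment: shares the 1 KiB buffer
    drop(view);
    let alive_with_handle = observer.upgrade().is_some();

    // Copy the 4 bytes we care about into separately owned storage.
    let copied: Vec<u8> = handle.as_slice().to_vec();
    drop(handle);
    let alive_after_copy = observer.upgrade().is_some();

    assert_eq!(copied.len(), 4);
    (alive_with_handle, alive_after_copy)
}
```

<p>The single <code>view.clone()</code> line is what keeps 1024 bytes alive for the sake of 4; if that increment were implicit, there would be nothing in the source to search for.</p>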
<p>A similar case occurs with APIs like <code>Arc::get_mut</code><sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>. <code>get_mut</code> takes an <code>&amp;mut Arc&lt;T&gt;</code> and, if the ref-count is 1, returns an <code>&amp;mut T</code>. This lets you take a <em>shareable</em> handle that <em>you</em> know is not actually <em>being</em> shared and recover uniqueness. This kind of API is not frequently used &ndash; but when you need it, it&rsquo;s so nice it&rsquo;s there.</p>
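<p>A small sketch of why handle creation matters for this API (the function name <code>get_mut_demo</code> is just for illustration): any extra handle, wherever it was created, makes <code>Arc::get_mut</code> return <code>None</code>.</p>

```rust
use std::sync::Arc;

/// Returns what `Arc::get_mut` reported in the shared case and in the
/// unique case, plus the final value.
fn get_mut_demo() -> (bool, bool, i32) {
    let mut a = Arc::new(1);
    let b = Arc::clone(&a); // a second handle: the ref-count is now 2

    // While `b` exists, unique access is refused.
    let while_shared = Arc::get_mut(&mut a).is_some();

    drop(b); // back to a single handle
    let once_unique = match Arc::get_mut(&mut a) {
        Some(value) => {
            *value += 41;
            true
        }
        None => false,
    };
    (while_shared, once_unique, *a)
}
```

<p>With explicit handle creation, the <code>Arc::clone</code> line is the visible reason the first <code>get_mut</code> fails; with automatic cloning, the same failure could come from code that shows no increment at all.</p>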
<h2 id="what-i-love-about-rust-is-its-versatility-low-to-high-in-one-language-hat-tip-alex-crichton">&ldquo;What I love about Rust is its versatility: low to high in one language&rdquo; (hat tip: Alex Crichton)</h2>
<p>Entering the conversation with Josh, I was leaning towards a design where you had some form of automated cloning of handles and an allow-by-default lint that would let crates which <em>don&rsquo;t</em> want that turn it off. But Josh convinced me that there is a significant class of applications that want handle creation to be ergonomic AND visible (i.e., explicit in the source). Low-level network services and even things like Rust For Linux likely fit this description, but any Rust application that uses <code>get_mut</code> or <code>make_mut</code> might also.</p>
<p>And this reminded me of something Alex Crichton once said to me. Unlike the other quotes here, it wasn&rsquo;t in the context of ergonomic ref-counting, but rather when I was working on my first attempt at the <a href="https://smallcultfollowing.com/babysteps/blog/2021/09/08/rustacean-principles/">&ldquo;Rustacean Principles&rdquo;</a>. Alex was saying that he loved how Rust was great for low-level code but also worked well for high-level stuff like CLI tools and simple scripts.</p>
<p>I feel like you can interpret Alex&rsquo;s quote in two ways, depending on what you choose to emphasize. You could hear it as, &ldquo;It&rsquo;s important that Rust is good for high-level use cases&rdquo;. That is true, and it is what leads us to ask whether we should even make handles visible at all.</p>
<p>But you can also read Alex&rsquo;s quote as, &ldquo;It&rsquo;s important that there&rsquo;s one language that works well enough for <em>both</em>&rdquo; &ndash; and I think that&rsquo;s true too. The &ldquo;true Rust gestalt&rdquo; is when we manage to <em>simultaneously</em> give you the low-level control that grungy code needs but wrapped in a high-level package. This is the promise of zero-cost abstractions, of course, and Rust (in its best moments) delivers.</p>
<h3 id="the-soul-of-rust-low-level-enough-for-a-kernel-usable-enough-for-a-gui">The &ldquo;soul of Rust&rdquo;: low-level enough for a kernel, usable enough for a GUI</h3>
<p>Let&rsquo;s be honest. High-level GUI programming is not Rust&rsquo;s bread-and-butter, and it never will be; users will never confuse Rust for TypeScript. But then, TypeScript will never be in the Linux kernel.</p>
<p>The goal of Rust is to be a single language that can, by and large, be &ldquo;good enough&rdquo; for <em>both</em> extremes. <strong>The goal is to make enough low-level details visible for kernel hackers but do so in a way that is usable enough for a GUI.</strong> It ain&rsquo;t easy, but it&rsquo;s the job.</p>
<p>This isn&rsquo;t the first time that Josh has pulled me back to this realization. The last time was in the context of async fn in dyn traits, and it led to a blog post talking about the <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/">&ldquo;soul of Rust&rdquo;</a> and a <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/19/what-i-meant-by-the-soul-of-rust/">followup going into greater detail</a>. I think the catchphrase &ldquo;low-level enough for a kernel, usable enough for a GUI&rdquo; kind of captures it.</p>
<h3 id="conclusion-explicit-handles-should-be-the-first-step-but-it-doesnt-have-to-be-the-final-step">Conclusion: Explicit handles should be the first step, but it doesn&rsquo;t have to be the final step</h3>
<p>There is a slight caveat I want to add. I think another part of Rust&rsquo;s soul is <em>preferring nuance to artificial simplicity</em> (&ldquo;as simple as possible, but no simpler&rdquo;, as they say). And I think the reality is that there&rsquo;s a huge set of applications that make new handles left-and-right (particularly but not exclusively in async land<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>) and where explicitly creating new handles is noise, not signal. This is why e.g. Swift<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup> makes ref-count increments invisible &ndash; and they get a big lift out of that!<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup> I&rsquo;d wager most Swift users don&rsquo;t even realize that Swift is not garbage-collected<sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup>.</p>
<p>But the key thing here is that even if we do add some way to make handle creation automatic, we ALSO want a mode where it is explicit and visible. So we might as well do that one first.</p>
<p>OK, I think I&rsquo;ve made this point 3 ways from Sunday now, so I&rsquo;ll stop. The next few blog posts in the series will dive into (at least) two options for how we might make handle creation and closures more ergonomic while retaining explicitness.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I see a potential candidate for a design axiom&hellip; <em>rubs hands with an evil-sounding cackle and a look of glee</em>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p><a href="https://youtu.be/JMFS9lrVd64?si=BdaDNm7rIueS0Jlx&amp;t=71">It&rsquo;s an industry term</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Actually, by the standards of the conversations Josh and I often have, it wasn&rsquo;t really all that long &ndash; an hour at most.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Well, at least <em>sync</em> Rust is. I think async Rust has more than its share, particularly around cancellation, but that&rsquo;s a topic for another blog post.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Modulo panics, of course &ndash; and no surprise that accounting for panics is a major pain point for some Rust users.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>In this particular case, it was fairly easy for me to find regardless, but this application is very simple. I can definitely imagine ripgrep&rsquo;ing around a codebase to find all increments being useful, and that would be much harder to do without an explicit signal they are occurring.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Or <code>Arc::make_mut</code>, which is one of my favorite APIs. It takes an <code>Arc&lt;_&gt;</code> and gives you back mutable (i.e., unique) access to the internals, always! How is that possible, given that the ref count may not be 1? Answer: if the ref-count is not 1, then it clones it. This is perfect for copy-on-write-style code. So beautiful. 😍&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>My experience is that, due to language limitations we really should fix, many async constructs force you into <code>'static</code> bounds which in turn force you into <code>Rc</code> and <code>Arc</code> where you&rsquo;d otherwise have been able to use <code>&amp;</code>.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>I&rsquo;ve been writing more Swift and digging it. I have to say, I love how they are not afraid to &ldquo;go big&rdquo;. I admire the ambition I see in designs like SwiftUI and their approach to async. I don&rsquo;t think they bat 100, but it&rsquo;s cool they&rsquo;re swinging for the stands. I want Rust to <a href="https://smallcultfollowing.com/babysteps/blog/2022/02/09/dare-to-ask-for-more-rust2024/">dare to ask for more</a>!&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>Well, not <em>only</em> that. They also allow class fields to be assigned when aliased which, to avoid stale references and iterator invalidation, means you have to move everything into ref-counted boxes and adopt persistent collections, which in turn comes at a performance cost and makes Swift a harder sell for lower-level foundational systems (though by no means a non-starter, in my opinion).&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p>Though I&rsquo;d also wager that many eventually find themselves scratching their heads about a ref-count cycle. I&rsquo;ve not dug into how Swift handles those, but I see references to &ldquo;weak handles&rdquo; flying around, so I assume they&rsquo;ve not (yet?) adopted a cycle collector. To be clear, you can get a ref-count cycle in Rust too! It&rsquo;s harder to do since we discourage interior mutability, but not that hard.&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">SymmACP: extending Zed's ACP to support Composable Agents</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/10/08/symmacp/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/10/08/symmacp/</id><published>2025-10-08T00:00:00+00:00</published><updated>2025-10-08T04:54:08-04:00</updated><content type="html"><![CDATA[<p>This post describes <strong>SymmACP</strong> &ndash; a proposed extension to Zed&rsquo;s <a href="https://agentclientprotocol.com/overview/introduction">Agent Client Protocol</a> that lets you build AI tools like Unix pipes or browser extensions. Want a better TUI? Found some cool slash commands on GitHub? Prefer a different backend? With SymmACP, you can mix and match these pieces and have them all work together without knowing about each other.</p>
<p>This is pretty different from how AI tools work today, where everything is a monolith &ndash; if you want to change one piece, you&rsquo;re stuck rebuilding the whole thing from scratch. SymmACP allows you to build out new features and modes of interactions in a layered, interoperable way. This post explains how SymmACP would work by walking through a series of examples.</p>
<p>Right now, SymmACP is just a thought experiment. I&rsquo;ve sketched these ideas to the Zed folks, and they seemed interested, but we still have to discuss the details in this post. My plan is to start prototyping in <a href="https://symposium-dev.github.io/symposium/">Symposium</a> &ndash; if you think the ideas I&rsquo;m discussing here are exciting, please join the <a href="https://symposium-dev.zulipchat.com/">Symposium Zulip</a> and let&rsquo;s talk!</p>
<h2 id="composable-agents-let-you-build-features-independently-and-then-combine-them">&ldquo;Composable agents&rdquo; let you build features independently and then combine them</h2>
<p>I&rsquo;m going to explain the idea of &ldquo;composable agents&rdquo; by walking through a series of features. We&rsquo;ll start with a basic CLI agent<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> tool &ndash; basically a chat loop with access to some MCP servers so that it can read/write files and execute bash commands. Then we&rsquo;ll show how you could add several features on top:</p>
<ol>
<li>Addressing time-blindness by helping the agent know what time it is.</li>
<li>Injecting context and &ldquo;personality&rdquo; to the agent.</li>
<li>Spawning long-running, asynchronous tasks.</li>
<li>A copy of Q CLI&rsquo;s <code>/tangent</code> mode that lets you do a bit of &ldquo;off the books&rdquo; work that gets removed from your history later.</li>
<li>Implementing <a href="https://symposium-dev.github.io/symposium/get-started/walkthroughs.html">Symposium&rsquo;s interactive walkthroughs</a>, which give the agent a richer vocabulary for communicating with you than just text.</li>
<li>Smarter tool delegation.</li>
</ol>
<p><strong>The magic trick is that each of these features will be developed as separate repositories.</strong> What&rsquo;s more, they could be applied to any base tool you want, so long as it speaks SymmACP. And you could also combine them with different front-ends, such as a TUI, a web front-end, builtin support from <a href="https://zed.dev/">Zed</a> or <a href="https://zed.dev/blog/jetbrains-on-acp">IntelliJ</a>, etc. Pretty neat.</p>
<p>My hope is that if we can centralize on SymmACP, or something like it, then we could move from everybody developing their own bespoke tools to an interoperable ecosystem of ideas that can build off of one another.</p>
<h2 id="let-mut-symmacp--acp">let mut SymmACP = ACP</h2>
<p>SymmACP begins with ACP, so let&rsquo;s explain what ACP is. ACP is a wonderfully simple protocol that lets you abstract over CLI agents. Imagine if you were using an agentic CLI tool except that, instead of communicating over the terminal, the CLI tool communicates with a front-end over JSON-RPC messages, currently sent via stdin/stdout.</p>
<pre class="mermaid">flowchart LR
    Editor <-.->|JSON-RPC via stdin/stdout| Agent[CLI Agent]
  </pre>
<p>When you type something into the GUI, the editor sends a JSON-RPC message to the agent with what you typed. The agent responds with a stream of messages containing text and images. If the agent decides to invoke a tool, it can request permission by sending a JSON-RPC message back to the editor. And when the agent has completed, it responds to the editor with an &ldquo;end turn&rdquo; message that says &ldquo;I&rsquo;m ready for you to type something else now&rdquo;.</p>
<pre class="mermaid">sequenceDiagram
    participant E as Editor
    participant A as Agent
    participant T as Tool (MCP)
    
    E->>A: prompt("Help me debug this code")
    A->>E: request_permission("Read file main.rs")
    E->>A: permission_granted
    A->>T: read_file("main.rs")
    T->>A: file_contents
    A->>E: text_chunk("I can see the issue...")
    A->>E: text_chunk("The problem is on line 42...")
    A->>E: end_turn
  </pre>
<h2 id="telling-the-agent-what-time-it-is">Telling the agent what time it is</h2>
<p>OK, let&rsquo;s tackle our first feature. If you&rsquo;ve used a CLI agent, you may have noticed that they don&rsquo;t know what time it is &ndash; or even what <em>year</em> it is. This may sound trivial, but it can lead to some real mistakes. For example, they may not realize that some information is outdated. Or when they do web searches for information, they can search for the wrong thing: I&rsquo;ve seen CLI agents search the web for &ldquo;API updates in 2024&rdquo; for example, even though it is 2025.</p>
<p>To fix this, many CLI agents will inject some extra text along with your prompt, something like <code>&lt;current-date date=&quot;2025-10-08&quot; time=&quot;HH:MM:SS&quot;/&gt;</code>. This gives the LLM the context it needs.</p>
<p>So how could we use ACP to build that? The idea is to create a <strong>proxy</strong>. This proxy would wrap the original ACP server:</p>
<pre class="mermaid">flowchart LR
    Editor[Editor/VSCode] <-->|ACP| Proxy[Datetime Proxy] <-->|ACP| Agent[CLI Agent]
  </pre>
<p>This proxy will take every &ldquo;prompt&rdquo; message it receives and decorate it with the date and time:</p>
<pre class="mermaid">sequenceDiagram
    participant E as Editor
    participant P as Proxy
    participant A as Agent
    
    E->>P: prompt("What day is it?")
    P->>A: prompt("&lt;current-date .../&gt; What day is it?")
    A->>P: text_chunk("It is 2025-10-08.")
    P->>E: text_chunk("It is 2025-10-08.")
    A->>P: end_turn
    P->>E: end_turn
  </pre>
<p>Simple, right? And of course this can be used with any editor and any ACP-speaking tool.</p>
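<p>The core of such a datetime proxy is tiny. Here is a rough sketch in Rust, assuming a simplified <code>Message</code> enum that I made up for illustration (real ACP messages are JSON-RPC and carry more structure). Prompts get the date tag prepended; everything else passes through untouched.</p>

```rust
/// Hypothetical, simplified message type; real ACP messages are JSON-RPC.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    Prompt(String),
    TextChunk(String),
    EndTurn,
}

/// Editor-to-agent direction: prefix prompts with the date tag and
/// forward every other message unchanged.
/// The caller supplies `date` and `time` so the sketch needs no clock crate.
fn decorate(msg: Message, date: &str, time: &str) -> Message {
    match msg {
        Message::Prompt(text) => Message::Prompt(format!(
            "<current-date date=\"{date}\" time=\"{time}\"/> {text}"
        )),
        other => other,
    }
}
```

<p>Messages flowing the other way (from agent to editor) would simply be forwarded as-is, which is what makes the proxy invisible to both ends.</p>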
<h2 id="next-feature-injecting-personality-to-the-agent">Next feature: Injecting &ldquo;personality&rdquo; to the agent</h2>
<p>Let&rsquo;s look at another feature that basically &ldquo;falls out&rdquo; from ACP: injecting personality. Most agents give you the ability to configure &ldquo;context&rdquo; in various ways &ndash; or what Claude Code calls <a href="https://docs.claude.com/en/docs/claude-code/memory">memory</a>. This is useful, but I and others have noticed that if what you want is to change how Claude &ldquo;behaves&rdquo; &ndash; i.e., to make it more collaborative &ndash; it&rsquo;s not really enough. You really need to kick off the conversation by reinforcing that pattern.</p>
<p>In Symposium, <a href="https://github.com/symposium-dev/symposium/blob/7f437fdf02ab52cd0bd3070d25feaad387b6d23f/symposium/mcp-server/src/server.rs#L885">the &ldquo;yiasou&rdquo; prompt</a> (also available as &ldquo;hi&rdquo;, for those of you who don&rsquo;t speak Greek 😛) is meant to be run as the first thing in the conversation. But there&rsquo;s nothing an MCP server can do to <em>ensure</em> that the user kicks off the conversation with <code>/symposium:hi</code> or something similar. Of course, if Symposium were implemented as an ACP Server, we absolutely could do that:</p>
<pre class="mermaid">sequenceDiagram
    participant E as Editor
    participant P as Proxy
    participant A as Agent
    
    E->>P: prompt("I'd like to work on my document")
    P->>A: prompt("/symposium:hi")
    A->>P: end_turn
    P->>A: prompt("I'd like to work on my document")
    A->>P: text_chunk("Sure! What document is that?") 
    P->>E: text_chunk("Sure! What document is that?") 
    A->>P: end_turn
    P->>E: end_turn
  </pre>
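<p>The state the proxy needs for this is a single flag. A minimal sketch, with a made-up <code>PersonalityProxy</code> type, might look like this. It only decides which prompts go to the agent and in what order; a real proxy would also wait for the agent&rsquo;s <code>end_turn</code> after the greeting and swallow that first response rather than showing it to the user, as in the diagram.</p>

```rust
/// Hypothetical proxy state: has the greeting prompt been sent yet?
struct PersonalityProxy {
    greeted: bool,
}

impl PersonalityProxy {
    fn new() -> Self {
        PersonalityProxy { greeted: false }
    }

    /// Returns the prompts to forward to the agent, in order.
    /// The very first user prompt is preceded by the greeting prompt.
    fn on_user_prompt(&mut self, prompt: &str) -> Vec<String> {
        let mut outgoing = Vec::new();
        if !self.greeted {
            self.greeted = true;
            outgoing.push("/symposium:hi".to_string());
        }
        outgoing.push(prompt.to_string());
        outgoing
    }
}
```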
<h2 id="proxies-are-a-better-version-of-hooks">Proxies are a better version of hooks</h2>
<p>Some of you may be saying, &ldquo;hmm, isn&rsquo;t that what <a href="https://docs.claude.com/en/docs/claude-code/hooks">hooks</a> are for?&rdquo; And yes, you could do this with hooks, but there&rsquo;s two problems with that. First, hooks are non-standard, so you have to do it differently for every agent.</p>
<p>The second problem with hooks is that they&rsquo;re <strong>fundamentally limited</strong> to what the hook designer envisioned you might want. You only get hooks at the places in the workflow that the tool gives you, and you can only control what the tool lets you control. The next feature starts to show what I mean: as far as I know, it cannot readily be implemented with hooks the way I would want it to work.</p>
<h2 id="next-feature-long-running-asynchronous-tasks">Next feature: long-running, asynchronous tasks</h2>
<p>Let&rsquo;s move on to our next feature, long-running asynchronous tasks. This feature is going to have to go beyond the current capabilities of ACP into the expanded &ldquo;SymmACP&rdquo; feature set.</p>
<p>Right now, when the server invokes an MCP tool, it executes in a blocking way. But sometimes the task it is performing might be long and complicated. What you would really like is a way to &ldquo;start&rdquo; the task and then go back to working. When the task is complete, you (and the agent) could be notified.</p>
<p>This comes up for me a lot with &ldquo;deep research&rdquo;. A big part of my workflow is that, when I get stuck on something I don&rsquo;t understand, I deploy a research agent to scour the web for information. Usually what I will do is ask the agent I&rsquo;m collaborating with to prepare a research prompt summarizing the things we tried, what obstacles we hit, and other details that seem relevant. Then I&rsquo;ll pop over to <a href="https://claude.ai/">claude.ai</a> or <a href="https://gemini.google.com/">Gemini Deep Research</a> and paste in the prompt. This will run for 5-10 minutes and generate a markdown report in response. I&rsquo;ll download that and give it to my agent. Very often this lets us solve the problem.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>This research flow works well but it is tedious and requires me to copy-and-paste. What I would ideally want is an MCP tool that does the search for me and, when the results are done, hands them off to the agent so it can start processing immediately. But in the meantime, I&rsquo;d like to be able to continue working with the agent while we wait. Unfortunately, the protocol for tools provides no mechanism for asynchronous notifications like this, from what I can tell.</p>
<h2 id="symmacp--tool-invocations--unprompted-sends">SymmACP += tool invocations + unprompted sends</h2>
<p>So how would I do it with SymmACP? Well, I would want to extend the ACP protocol as it is today in two ways:</p>
<ol>
<li>I&rsquo;d like the ACP proxy to be able to provide tools that <em>the proxy</em> will execute. Today, the agent is responsible for executing all tools; the ACP protocol only comes into play when requesting <em>permission</em>. But it&rsquo;d be trivial to have MCP tools where, to execute the tool, the agent sends back a message over ACP instead.</li>
<li>I&rsquo;d like to have a way for the <em>agent</em> to initiate responses to the <em>editor</em>. Right now, the editor always initiates each communication session with a prompt; but, in this case, the agent might want to send messages back unprompted.</li>
</ol>
<p>In that case, we could implement our Research Proxy like so:</p>
<pre class="mermaid">sequenceDiagram
    participant E as Editor
    participant P as Proxy
    participant A as Agent
    
    E->>P: prompt("Why is Rust so great?")
    P->>A: prompt("Why is Rust so great?")
    A->>P: invoke tool("begin_research")
    activate P
    P->>A: ok
    A->>P: "I'm looking into it!"
    P->>E: "I'm looking into it!"
    A->>P: end_turn
    P->>E: end_turn

    Note over E,A: Time passes (5-10 minutes) and the user keeps working...
    Note over P: Research completes in background
    
    P->>A: &lt;research-complete/&gt;
    deactivate P
    A->>P: "Research says Rust is fast"
    P->>E: "Research says Rust is fast"
    A->>P: end_turn
    P->>E: end_turn
  </pre>
<p>What&rsquo;s cool about this is that the proxy encapsulates the entire flow: it knows how to do the research, and it manages notifying the various participants when the research completes. (Also, this leans on one detail I left out, which is that )</p>
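<p>To give a feel for the shape of this, here is a small sketch using only the standard library. Everything here is an assumption for illustration: the <code>ResearchProxy</code> type, the use of a channel as a stand-in for the proxy-to-agent connection, and the fake &ldquo;research&rdquo; that just formats a string. The point is that the tool call returns immediately and the notification arrives later without any prompt having triggered it.</p>

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// Hypothetical proxy that runs research in the background.
struct ResearchProxy {
    /// Stand-in for the proxy-to-agent connection.
    to_agent: Sender<String>,
}

impl ResearchProxy {
    /// Returns the proxy plus the receiving end that plays the agent's inbox.
    fn new() -> (Self, Receiver<String>) {
        let (to_agent, agent_inbox) = mpsc::channel();
        (ResearchProxy { to_agent }, agent_inbox)
    }

    /// Handles the `begin_research` tool call: replies "ok" right away and
    /// delivers the notification later from a background thread.
    fn begin_research(&self, question: &str) -> &'static str {
        let to_agent = self.to_agent.clone();
        let question = question.to_string();
        thread::spawn(move || {
            // Stand-in for the real (5-10 minute) research job.
            let report = format!("report on: {question}");
            // Unprompted send: no prompt from the editor triggered this.
            let _ = to_agent.send(format!("<research-complete/> {report}"));
        });
        "ok"
    }
}
```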
<h2 id="next-feature-tangent-mode">Next feature: tangent mode</h2>
<p>Let&rsquo;s explore our next feature, <a href="https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/command-line-experimental-features.html">Q CLI&rsquo;s <code>/tangent</code> mode</a>. This feature is interesting because it&rsquo;s a simple (but useful!) example of history editing. The way <code>/tangent</code> works is that, when you first type <code>/tangent</code>, Q CLI saves your current state. You can then continue as normal but when you <em>next</em> type <code>/tangent</code>, your state is restored to where you were. This, as the name suggests, lets you explore a side conversation without polluting your main context.</p>
<p>The basic idea for supporting tangent in SymmACP is that the proxy is going to (a) intercept the tangent prompt and remember where it began; (b) allow the conversation to continue as normal; and then (c) when it&rsquo;s time to end the tangent, create a new session and replay the history up until the point of the tangent<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>.</p>
<h2 id="symacp--replay">SymmACP += replay</h2>
<p>You can <em>almost</em> implement &ldquo;tangent&rdquo; in ACP as it is, but not quite. In ACP, the agent always owns the session history. The editor can create a new session or load an older one; when loading an older one, the agent &ldquo;replays&rdquo; the events so that the editor can reconstruct the GUI. But there is no way for the <em>editor</em> to &ldquo;replay&rdquo; or construct a session for the <em>agent</em>. Instead, the editor can only send prompts, which will cause the agent to reply. In this case, what we want is to be able to say &ldquo;create a new chat in which I said this and you responded that&rdquo; so that we can set up the initial state. This way we could easily create a new session that contains the messages from the old one.</p>
<p>So here is how this would work:</p>
<pre class="mermaid">sequenceDiagram
    participant E as Editor
    participant P as Proxy
    participant A as Agent
    
    E->>P: prompt("Hi there!")
    P->>A: prompt("Hi there!")

    Note over E,A: Conversation proceeds
    
    E->>P: prompt("/tangent")
    Note over P: Proxy notes conversation state
    P->>E: end_turn
    E->>P: prompt("btw, ...")
    P->>A: prompt("btw, ...")

    Note over E,A: Conversation proceeds
    
    E->>P: prompt("/tangent")
    
    P->>A: new_session
    P->>A: prompt("Hi there!")    
    Note over P,A: ...Proxy replays conversation...
  </pre>
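<p>The proxy&rsquo;s bookkeeping for this is small. Here is a sketch with made-up names (<code>TangentProxy</code>, <code>Action</code>). As a simplification it records only the user&rsquo;s prompts; the replay extension described above would carry the agent&rsquo;s responses as well, so that the new session really starts from the same state.</p>

```rust
/// What the proxy does with an incoming prompt (hypothetical names).
#[derive(Debug, PartialEq)]
enum Action {
    /// Forward the prompt to the agent unchanged.
    Forward(String),
    /// Tangent started; nothing is sent to the agent.
    TangentStarted,
    /// Tangent ended; start a new session and replay these prompts.
    Replay(Vec<String>),
}

struct TangentProxy {
    /// Prompts forwarded so far in the current session.
    history: Vec<String>,
    /// Length of `history` when the tangent began, if one is active.
    tangent_start: Option<usize>,
}

impl TangentProxy {
    fn new() -> Self {
        TangentProxy { history: Vec::new(), tangent_start: None }
    }

    fn on_prompt(&mut self, prompt: &str) -> Action {
        if prompt != "/tangent" {
            self.history.push(prompt.to_string());
            return Action::Forward(prompt.to_string());
        }
        match self.tangent_start.take() {
            // First `/tangent`: remember where the side conversation began.
            None => {
                self.tangent_start = Some(self.history.len());
                Action::TangentStarted
            }
            // Second `/tangent`: drop the tangent and replay what came before.
            Some(start) => {
                self.history.truncate(start);
                Action::Replay(self.history.clone())
            }
        }
    }
}
```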
<h2 id="next-feature-interactive-walkthroughs">Next feature: interactive walkthroughs</h2>
<p>One of the nicer features of Symposium is the ability to do <a href="https://symposium-dev.github.io/symposium/get-started/walkthroughs.html">interactive walkthroughs</a>. These consist of an HTML sidebar as well as inline comments in the code:</p>
<img src="https://smallcultfollowing.com/babysteps/assets/2025-symmacp/walkthrough.png" alt="Walkthrough screenshot" width="100%"/>
<p>Right now, this is implemented by a kind of hacky dance:</p>
<ul>
<li>The agent invokes an MCP tool and sends it the walkthrough in markdown. This markdown includes comments meant to be placed on particular lines, identified not by line number (agents are bad at line numbers) but by symbol names or search strings.</li>
<li>The MCP tool parses the markdown, determines the line numbers for comments, and creates HTML. It sends that HTML over IPC to the VSCode extension.</li>
<li>The VSCode extension receives the IPC message, displays the HTML in the sidebar, and creates the comments in the code.</li>
</ul>
<p>It works, but it&rsquo;s a giant Rube Goldberg machine.</p>
<h2 id="symmacp--enriched-conversation-history">SymmACP += Enriched conversation history</h2>
<p>With SymmACP, we would structure the walkthrough mechanism as a proxy. Just as today, it would provide an MCP tool to the agent to receive the walkthrough markdown. It would then convert that into the HTML to display on the side along with the various comments to embed in the code. But this is where things are different.</p>
<p>Instead of sending that content over IPC, what I would want to do is to make it possible for proxies to deliver extra information along with the chat. This is relatively easy to do in ACP as is, since it provides for various capabilities, but I think I&rsquo;d want to go one step further.</p>
<p>I would have a proxy layer that manages walkthroughs. As we saw before, it would provide a tool. But there&rsquo;d be one additional thing, which is that, beyond just a chat history, it would be able to convey additional state. I think the basic conversation structure is like:</p>
<ul>
<li>Conversation
<ul>
<li>Turn
<ul>
<li>User prompt(s) &ndash; could be zero or more</li>
<li>Response(s) &ndash; could be zero or more</li>
<li>Tool use(s) &ndash; could be zero or more</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>but I think it&rsquo;d be useful to (a) be able to attach metadata to any of those things, e.g., to add extra context <em>about the conversation</em> or <em>about a specific turn</em> (or even a specific <em>prompt</em>), and (b) support additional kinds of events. For example, a tool approval is an <em>event</em>. And presenting a walkthrough and adding annotations are events too.</p>
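<p>As a rough sketch of what that enriched structure could look like as data, here are some Rust types. All of the names are mine, and I&rsquo;ve used string-to-string maps for metadata purely to keep the example dependency-free; a real protocol would presumably carry JSON values. I&rsquo;ve also chosen to keep prompts, responses, tool uses, and events in one ordered list per turn, which is one possible design among several.</p>

```rust
use std::collections::BTreeMap;

/// Free-form metadata; a real protocol would likely carry JSON values.
type Metadata = BTreeMap<String, String>;

#[derive(Debug, Default)]
struct Conversation {
    /// Extra context about the conversation as a whole.
    metadata: Metadata,
    turns: Vec<Turn>,
}

#[derive(Debug, Default)]
struct Turn {
    /// Extra context about this specific turn.
    metadata: Metadata,
    /// Zero or more items, in the order they happened.
    items: Vec<Item>,
}

/// Everything that can happen within a turn.
#[derive(Debug)]
enum Item {
    UserPrompt(String),
    Response(String),
    ToolUse { tool: String },
    /// Additional event kinds, e.g. "tool-approval" or "walkthrough".
    Event { kind: String, metadata: Metadata },
}

impl Conversation {
    /// Counts events of a given kind across all turns.
    fn count_events(&self, kind: &str) -> usize {
        self.turns
            .iter()
            .flat_map(|turn| turn.items.iter())
            .filter(|item| matches!(item, Item::Event { kind: k, .. } if k.as_str() == kind))
            .count()
    }
}
```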
<p>The way I imagine it, one of the core things in SymmACP would be the ability to serialize your state to JSON. You&rsquo;d be able to ask a SymmACP participant to summarize a session. They would in turn ask any delegates to summarize and then add their own metadata along the way. You could also send the request in the <em>other</em> direction &ndash; e.g., the agent might present its state to the editor and ask it to augment it.</p>
<h2 id="enriched-history-would-let-walkthroughs-be-extra-metadata">Enriched history would let walkthroughs be extra metadata</h2>
<p>This would mean a walkthrough proxy could add extra metadata into the chat transcript like &ldquo;the current walkthrough&rdquo; and &ldquo;the current comments that are in place&rdquo;. Then the <em>editor</em> would either <em>know</em> about that metadata or not. If it doesn&rsquo;t, you wouldn&rsquo;t see it in your chat. Oh well &ndash; or perhaps we do something HTML-like, where there&rsquo;s a way to &ldquo;degrade gracefully&rdquo; (e.g., the walkthrough could be presented as a regular &ldquo;response&rdquo; but with some metadata that, if you know to look, tells you to interpret it differently). But if the editor DOES know about the metadata, it interprets it specially, throwing the walkthrough up in a panel and adding the comments into the code.</p>
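<p>The &ldquo;degrade gracefully&rdquo; option is easy to picture in code. In this sketch (the <code>Response</code> shape and the <code>Rendered</code> enum are hypothetical), a response always carries plain text and optionally carries walkthrough HTML. An editor that understands walkthroughs uses the HTML; any other editor just shows the text.</p>

```rust
/// A response that may carry walkthrough metadata (hypothetical shape).
struct Response {
    /// Plain-text fallback that every editor can display.
    text: String,
    /// Present only when the response is really a walkthrough.
    walkthrough_html: Option<String>,
}

#[derive(Debug, PartialEq)]
enum Rendered {
    ChatText(String),
    SidePanel(String),
}

/// Picks the richest presentation the editor supports.
fn render(response: &Response, editor_understands_walkthroughs: bool) -> Rendered {
    match (&response.walkthrough_html, editor_understands_walkthroughs) {
        (Some(html), true) => Rendered::SidePanel(html.clone()),
        _ => Rendered::ChatText(response.text.clone()),
    }
}
```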
<p>With enriched histories, I think we can even say that in SymmACP, the ability to load, save, and persist sessions <em>itself</em> becomes an extension, something that can be implemented by a proxy; the base protocol only needs the ability to conduct and serialize a conversation.</p>
<h2 id="final-feature-smarter-tool-delegation">Final feature: Smarter tool delegation.</h2>
<p>Let me sketch out another feature that I&rsquo;ve been noodling on that I think would be pretty cool. It&rsquo;s well known that there&rsquo;s a problem that LLMs get confused when there are too many MCP tools available. They get distracted. And that&rsquo;s sensible, so would I, if I were given a phonebook-size list of possible things I could do and asked to figure something out. I&rsquo;d probably just ignore it.</p>
<p>But how do humans deal with this? Well, we don&rsquo;t take the whole phonebook &ndash; we get a shorter list of <em>categories</em> of options and then we drill down. So I go to the File Menu and <em>then</em> I get a list of options, not a flat list of commands.</p>
<p>I wanted to try building an MCP tool for IDE capabilities that was similar. There are a bajillion things that a modern IDE can &ldquo;do&rdquo;. It can find references. It can find definitions. It can get type hints. It can do renames. It can extract methods. In fact, the list is even open-ended, since extensions can provide their <em>own</em> commands. I don&rsquo;t know what all those things <em>are</em> but I have a sense for the <em>kinds of things</em> an IDE can do &ndash; and I suspect models do too.</p>
<p>What if you gave them a single tool, &ldquo;IDE operation&rdquo;, and they could use plain English to describe what they want? e.g., <code>ide_operation(&quot;find definition for the ProxyHandler that refers to HTTP proxies&quot;)</code>. Hmm, this is sounding a lot like a delegate, or a sub-agent. Because now you need to use a second LLM to interpret that request &ndash; you probably want to do something like, give it a list of suggested IDE capabilities and the ability to find out full details and ask it to come up with a plan (or maybe directly execute the tools) to find the answer.</p>
<p>As it happens, MCP <em>has</em> a capability to enable tools to do this &ndash; it&rsquo;s called (somewhat oddly, in my opinion) &ldquo;sampling&rdquo;. It allows for &ldquo;callbacks&rdquo; from the MCP tool to the LLM. But literally <em>nobody</em> implements it, from what I can tell.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> But sampling is kind of limited anyway. With SymmACP, I think you could do much more interesting things.</p>
<h2 id="symmacpcontainssimultaneous_sessions">SymmACP.contains(simultaneous_sessions)</h2>
<p>The key is that ACP already permits a single agent to &ldquo;serve up&rdquo; many simultaneous sessions. So that means that if I have a proxy, perhaps one supplying an MCP tool definition, I could use it to start <em>fresh</em> sessions &ndash; combine that with the &ldquo;history replay&rdquo; capability I mentioned above, and the tool can control exactly what context to bring over into that session to start from, as well, which is very cool (that&rsquo;s a challenge for MCP servers today, they don&rsquo;t get access to the conversation history).</p>
<pre class="mermaid">sequenceDiagram
    participant E as Editor
    participant P as Proxy
    participant A as Agent
    
    A->>P: ide_operation("...")
    activate P
    P->>A: new_session
    activate P
    activate A
    P->>A: prompt("Using these primitive operations, suggest a way to do '...'")
    A->>P: ...
    A->>P: end_turn
    deactivate P
    deactivate A
    Note over P: performs the plan
    P->>A: result from tool
    deactivate P
  </pre>
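<p>In code, the proxy side of that flow might be sketched like this. The <code>Agent</code> trait, the <code>CannedAgent</code> stand-in, and the function names are all invented for the example; the planning prompt follows the wording in the diagram. The interesting part is that the proxy, not the agent, decides exactly what goes into the fresh session.</p>

```rust
/// Stand-in for an agent that can serve several sessions (hypothetical trait).
trait Agent {
    /// Runs one prompt in a brand-new session and returns the reply text.
    fn run_fresh_session(&mut self, prompt: &str) -> String;
}

/// Builds the prompt the proxy sends to the fresh session.
fn planning_prompt(capabilities: &[&str], request: &str) -> String {
    format!(
        "Using these primitive operations: {}. Suggest a way to do '{}'",
        capabilities.join(", "),
        request
    )
}

/// Handles an `ide_operation` tool call by delegating to a fresh session.
fn ide_operation(agent: &mut dyn Agent, capabilities: &[&str], request: &str) -> String {
    let plan = agent.run_fresh_session(&planning_prompt(capabilities, request));
    // A real proxy would execute the plan against the IDE here.
    format!("executed plan: {plan}")
}

/// Trivial agent for demonstration: records prompts, returns a canned plan.
struct CannedAgent {
    prompts_seen: Vec<String>,
}

impl Agent for CannedAgent {
    fn run_fresh_session(&mut self, prompt: &str) -> String {
        self.prompts_seen.push(prompt.to_string());
        "find_definition(ProxyHandler)".to_string()
    }
}
```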
<h2 id="conclusion">Conclusion</h2>
<p>Ok, this post sketched a variant on <a href="https://agentclientprotocol.com/overview/introduction">ACP</a> that I call SymmACP. SymmACP extends ACP with</p>
<ul>
<li>the ability for either side to provide the initial state of a conversation, not just the server</li>
<li>the ability for an &ldquo;editor&rdquo; to provide an MCP tool to the &ldquo;agent&rdquo;</li>
<li>the ability for agents to respond without an initial prompt</li>
<li>the ability to serialize conversations and attach extra state (already kind of present)</li>
</ul>
<p>Most of these are modest extensions to ACP, in my opinion, and easily doable in a backwards-compatible fashion just by adding new capabilities. <strong>But together they unlock the ability for anyone to craft extensions to agents and deploy them in a composable way.</strong> I am super excited about this. This is exactly what I wanted Symposium to be all about.</p>
<p>It&rsquo;s worth noting the old adage: &ldquo;with great power, comes great responsibility&rdquo;. These proxies and ACP layers I&rsquo;ve been talking about are really like IDE extensions. They can effectively do <em>anything</em> you could do. There are obvious security concerns. Though I think that approaches like Microsoft&rsquo;s <a href="https://opensource.microsoft.com/blog/2025/08/06/introducing-wassette-webassembly-based-tools-for-ai-agents/">Wassette</a> are key here &ndash; it&rsquo;d be awesome to have a &ldquo;capability-based&rdquo; notion of what a &ldquo;proxy layer&rdquo; is, where everything compiles to WASM, and where users can tune what a given proxy can actually <em>do</em>.</p>
<p>I plan to start sketching a plan to drive this work in <a href="https://symposium-dev.github.io/symposium/">Symposium</a> and elsewhere. My goal is to have a completely open and interoperable client, one that can be based on any agent (including local ones) and where you can pick and choose which parts you want to use. I expect to build out lots of custom functionality to support Rust development (e.g., explaining and diagnosing trait errors using the new trait solver is high on my list&hellip;and macro errors&hellip;) but also to have other features like walkthroughs, collaborative interaction style, etc. that are all language independent &ndash; and I&rsquo;d love to see language-focused features for other languages, especially Python and TypeScript (because <a href="https://smallcultfollowing.com/babysteps/blog/2025/07/31/rs-py-ts-trifecta/">&ldquo;the new trifecta&rdquo;</a>) and Swift and Kotlin (because mobile). If that vision excites you, come join the <a href="https://symposium-dev.zulipchat.com/">Symposium Zulip</a> and let&rsquo;s chat!</p>
<h2 id="appendix-a-guide-to-the-agent-protocols-im-aware-of">Appendix: A guide to the agent protocols I&rsquo;m aware of</h2>
<p>One question I&rsquo;ve gotten when discussing this is how it compares to the host of other protocols out there. Let me give a brief overview of the related work and how I understand its pros and cons:</p>
<ul>
<li><em><a href="https://modelcontextprotocol.io/docs/getting-started/intro">Model context protocol (MCP)</a>:</em> The queen of them all. A protocol that provides a set of tools, prompts, and resources up to the agent. Agents can invoke tools by supplying appropriate parameters, which are JSON. Prompts are shorthands that users can invoke using special commands like <code>/</code> or <code>@</code>; they are essentially macros that expand &ldquo;as if the user typed it&rdquo; (but they can also have parameters and be dynamically constructed). <em>Resources</em> are just data that can be requested. MCP servers can either be local or hosted remotely. Remote MCP has only recently become an option and auth in particular is limited.
<ul>
<li>Comparison to SymmACP: MCP provides tools that the agent can invoke. SymmACP builds on it by allowing those tools to be provided by outer layers in the proxy chain. SymmACP is oriented at controlling the whole chat &ldquo;experience&rdquo;.</li>
</ul>
</li>
<li><em><a href="https://agentclientprotocol.com/overview/introduction">Zed&rsquo;s Agent Client Protocol (ACP)</a>:</em> The basis for SymmACP. Allows editors to create and manage sessions. Focused only on local sessions, since your editor runs locally.
<ul>
<li>Comparison to SymmACP: That&rsquo;s what this post is all about! SymmACP extends ACP with new capabilities that let intermediate layers manipulate history, provide tools, and provide extended data upstream to support richer interaction patterns than just chat. PS I expect we may want to support more remote capabilities, but it&rsquo;s kinda orthogonal in my opinion (e.g., I&rsquo;d like to be able to work with an agent running over in a cloud-hosted workstation, but I&rsquo;d probably piggyback on ssh for that).</li>
</ul>
</li>
<li><em><a href="https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/">Google&rsquo;s Agent-to-Agent Protocol (A2A)</a> and <a href="https://www.ibm.com/think/topics/agent-communication-protocol">IBM&rsquo;s Agent Communication Protocol (ACP)</a><sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>:</em> From what I can tell, Google&rsquo;s &ldquo;agent-to-agent&rdquo; protocol is <em>kinda</em> like a mix of MCP and OpenAPI. You can ping agents that are running remotely and get them to send you &ldquo;agent cards&rdquo;, which describe what operations they can perform, how you authenticate, and other stuff like that. It looks to me quite similar to MCP except that it has richer support for remote execution and in particular supports things like long-running communication, where an agent may need to go off and work for a while and then ping you back on a webhook.
<ul>
<li><em>Comparison to MCP:</em> To me, A2A looks like a variant of MCP that is more geared to remote execution. MCP has a method for tool discovery where you ping the server to get a list of tools; A2A has a similar mechanism with Agent Cards. MCP can run locally, which A2A cannot afaik, but A2A has more options about auth. MCP can only be invoked synchronously, whereas A2A supports long-running operations, progress updates, and callbacks. It seems like the two could be merged to make a single whole.</li>
<li><em>Comparison to SymmACP:</em> I think A2A is orthogonal to SymmACP. A2A is geared to agents that provide services to one another. SymmACP is geared towards building new development tools for interacting with agents. It&rsquo;s possible you could build something like SymmACP <em>on</em> A2A but I don&rsquo;t know what you would really gain by it (and I think it&rsquo;d be easy to do later).</li>
</ul>
</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Everybody uses agents in various ways. I like Simon Willison&rsquo;s <a href="https://simonwillison.net/2025/May/22/tools-in-a-loop/">&ldquo;agents are models using tools in a loop&rdquo;</a> definition; I feel that an &ldquo;agentic CLI tool&rdquo; fits that definition, it&rsquo;s just that part of the loop is reading input from the user. I think &ldquo;fully autonomous&rdquo; agents are a subset of all agents &ndash; many agent processes interact with the outside world via tools etc. From a certain POV, you can view the agent &ldquo;ending the turn&rdquo; as invoking a tool for &ldquo;gimme the next prompt&rdquo;.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Research reports are a <strong>major</strong> part of how I avoid hallucination. You can see an example of one such report I commissioned on the <a href="https://symposium-dev.github.io/symposium/research/lsp-overview/index.html">details of the Language Server Protocol here</a>; if we were about to embark on something that required detailed knowledge of LSP, I would ask the agent to read that report first.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Alternatively: clear the session history and rebuild it, but I kind of prefer the functional view of the world, where a given session never changes.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I started an implementation for Q CLI but got distracted &ndash; and, for reasons that should be obvious, I&rsquo;ve started to lose interest.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Yes, you read that right. There is another ACP. Just a mite confusing when you google search. =)&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content></entry><entry><title type="html">The Handle trait</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/10/07/the-handle-trait/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/10/07/the-handle-trait/</id><published>2025-10-07T00:00:00+00:00</published><updated>2025-10-07T10:04:55-04:00</updated><content type="html"><![CDATA[<p>There&rsquo;s been a lot of discussion lately around ergonomic ref-counting. We had a lang-team design meeting and then a quite impactful discussion at the RustConf Unconf. I&rsquo;ve been working for weeks on a follow-up post but today I realized what should&rsquo;ve been obvious from the start &ndash; that if I&rsquo;m taking that long to write a post, it means the post is too damned long. So I&rsquo;m going to work through a series of smaller posts focused on individual takeaways and thoughts. And for the first one, I want to (a) bring back some of the context and (b) talk about an interesting question, <strong>what should we call the trait</strong>. My proposal, as the title suggests, is <code>Handle</code> &ndash; but I get ahead of myself.</p>
<h2 id="the-story-thus-far">The story thus far</h2>
<p>For those of you who haven&rsquo;t been following, there&rsquo;s been an ongoing discussion about how best to have ergonomic ref counting:</p>
<ul>
<li>It began with the first Rust Project Goals program in 2024H2, where Jonathan Kelley from Dioxus wrote a <a href="https://dioxus.notion.site/Dioxus-Labs-High-level-Rust-5fe1f1c9c8334815ad488410d948f05e">thoughtful blog post about a path to high-level Rust</a> that eventually became a 2024H2 <a href="https://rust-lang.github.io/rust-project-goals/2024h2/ergonomic-rc.html">project goal towards ergonomic ref-counting</a>.</li>
<li>I wrote a <a href="https://smallcultfollowing.com/babysteps/series/claim/">series of blog posts about a trait I called <code>Claim</code></a>.</li>
<li>Josh and I talked and Josh opened <a href="https://github.com/rust-lang/rfcs/pull/3680">RFC #3680</a>, which proposed a <code>use</code> keyword and <code>use ||</code> closures. Reception, I would say, was mixed; yes, this is tackling a real problem, but there were lots of concerns on the approach. <a href="https://github.com/rust-lang/rfcs/pull/3680#issuecomment-2625526944">I summarized the key points here</a>.</li>
<li>Santiago implemented experimental support for (a variant of) <a href="https://github.com/rust-lang/rfcs/pull/3680">RFC #3680</a> as part of the <a href="https://rust-lang.github.io/rust-project-goals/2025h1/ergonomic-rc.html">2025H1 project goal</a>.</li>
<li>I authored a <a href="https://rust-lang.github.io/rust-project-goals/2025h2/ergonomic-rc.html">2025H2 project goal proposing that we create an alternative RFC focused on higher-level use-cases</a>, which prompted Josh and me to have a long and fruitful conversation in which he convinced me that this was not the right approach.</li>
<li>We had a lang-team design meeting on 2025-08-27 in which I presented this <a href="https://hackmd.io/@rust-lang-team/B12TpGhKle">survey and summary of the work done thus far</a>.</li>
<li>And then at the <a href="https://2025.rustweek.org/unconf/">RustConf 2025 Unconf</a> we had a big group discussion on the topic that I found very fruitful, as well as various follow-up conversations with smaller groups.</li>
</ul>
<h2 id="this-blog-post-is-about-the-trait">This blog post is about &ldquo;the trait&rdquo;</h2>
<p>The focus of this blog post is on one particular question: what should we call &ldquo;The Trait&rdquo;. In virtually every design, there has been <em>some kind</em> of trait that is meant to identify <em>something</em>. But it&rsquo;s been hard to get a handle<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> on what precisely that <em>something</em> is. What is this trait for and what types should implement it? Some things are clear: whatever The Trait is, <code>Rc&lt;T&gt;</code> and <code>Arc&lt;T&gt;</code> should implement it, for example, but that&rsquo;s about it.</p>
<p>My original proposal was for a trait named <a href="https://smallcultfollowing.com/babysteps/blog/2024/06/21/claim-auto-and-otherwise/"><code>Claim</code></a> that was meant to convey a &ldquo;lightweight clone&rdquo; &ndash; but really the trait was <a href="https://smallcultfollowing.com/babysteps/blog/2024/06/26/claim-followup-1/#what-i-really-proposed">meant to replace <code>Copy</code> as the definition of which clones ought to be explicit</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. Jonathan Kelley had a similar proposal but called it <code>Capture</code>. In <a href="https://github.com/rust-lang/rfcs/pull/3680">RFC #3680</a> the proposal was to call the trait <code>Use</code>.</p>
<p>The details and intent varied, but all of these attempts had one thing in common: they were very <em>operational</em>. That is, the trait was always being defined in terms of <em>what</em> it does (or doesn&rsquo;t do) but not <em>why</em> it does it. And that I think will always be a weak grounding for a trait like this, prone to confusion and different interpretations. For example, what is a &ldquo;lightweight&rdquo; clone? Is it O(1)? But what about things that are O(1) with very high probability? And of course, O(1) doesn&rsquo;t mean <em>cheap</em> &ndash; it might copy 22GB of data every call. That&rsquo;s O(1).</p>
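<p>To make the &ldquo;O(1) is not the same as cheap&rdquo; point concrete, here is a minimal sketch. The <code>Block</code> type and its 64 KiB size are hypothetical, chosen purely for illustration; the only claim is about how many bytes each kind of clone has to move:</p>

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical plain-data type; the name and the 64 KiB size are
/// illustrative assumptions, not anything from the RFC discussion.
#[derive(Clone, Copy)]
struct Block([u8; 64 * 1024]);

/// Bytes that one copy (or clone) of a `Block` has to move.
fn bytes_per_block_copy() -> usize {
    size_of::<Block>()
}

/// Bytes that one clone of an `Arc<Block>` has to move: a single pointer
/// (the clone also increments the reference count).
fn bytes_per_arc_clone() -> usize {
    size_of::<Arc<Block>>()
}

fn main() {
    let block = Block([0; 64 * 1024]);
    // Constant time, but semantically a 64 KiB copy.
    let copy = block;
    let shared = Arc::new(block);
    // Also constant time, and only a pointer is duplicated.
    let shared2 = Arc::clone(&shared);
    assert_eq!(copy.0.len(), shared2.0.len());
    println!(
        "Block copy: {} bytes, Arc clone: {} bytes",
        bytes_per_block_copy(),
        bytes_per_arc_clone()
    );
}
```

<p>Both operations are &ldquo;O(1)&rdquo;, which is exactly why asymptotic cost alone makes a poor criterion for deciding which types should implement The Trait.</p>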
<p>What you want is a trait where it&rsquo;s fairly clear when it should and should not be implemented and not based on taste or subjective criteria. And <code>Claim</code> and friends did not meet the bar: in the Unconf, several new Rust users spoke up and said they found it very hard, based on my explanations, to judge whether their types ought to implement The Trait (whatever we call it). That has also been a persistent theme from the RFC and elsewhere.</p>
<h2 id="shouldnt-we-call-it-share-hat-tip-jack-huey">&ldquo;Shouldn&rsquo;t we call it <em>share</em>?&rdquo; (hat tip: Jack Huey)</h2>
<p>But really there <em>is</em> a semantic underpinning here, and it was Jack Huey who first suggested it. Consider this question. What are the differences between cloning a <code>Mutex&lt;Vec&lt;u32&gt;&gt;</code> and an <code>Arc&lt;Mutex&lt;Vec&lt;u32&gt;&gt;&gt;</code>?</p>
<p>One difference, of course, is cost. Cloning the <code>Mutex&lt;Vec&lt;u32&gt;&gt;</code> will deep-clone the vector; cloning the <code>Arc</code> will just increment a reference count.</p>
<p>But the more important difference is what I call <em>&ldquo;entanglement&rdquo;</em>. When you clone the <code>Arc</code>, you don&rsquo;t get a new value &ndash; you get back a <em>second handle to the same value</em>.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<h2 id="entanglement-changes-the-meaning-of-the-program">Entanglement changes the meaning of the program</h2>
<p>Knowing which values are &ldquo;entangled&rdquo; is key to understanding what your program does. A big part of how the borrow checker<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> achieves reliability is by reducing &ldquo;entanglement&rdquo;, since entangled values become a relative pain to work with in Rust.</p>
<p>Consider the following code. What will be the value of <code>l_before</code> and <code>l_after</code>?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">l_before</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v1</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">v2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v1</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">v2</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">new_value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">l_after</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v1</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>The answer, of course, is &ldquo;depends on the type of <code>v1</code>&rdquo;. If <code>v1</code> is a <code>Vec</code>, then <code>l_after == l_before</code>. But if <code>v1</code> is, say, a struct like this one:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">SharedVec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="nc">Arc</span><span class="o">&lt;</span><span class="n">Mutex</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SharedVec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">push</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">unwrap</span><span class="p">().</span><span class="n">push</span><span class="p">(</span><span class="n">value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">len</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">unwrap</span><span class="p">().</span><span class="n">len</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>then <code>l_after == l_before + 1</code>.</p>
<p>There are many types that act like a <code>SharedVec</code>: it&rsquo;s true for <code>Rc</code> and <code>Arc</code>, of course, but also for things like <a href="https://docs.rs/bytes/latest/bytes/struct.Bytes.html"><code>Bytes</code></a> and channel endpoints like <a href="https://doc.rust-lang.org/std/sync/mpsc/struct.Sender.html"><code>Sender</code></a>. All of these are examples of &ldquo;handles&rdquo; to underlying values and, when you clone them, you get back a second handle that is indistinguishable from the first one.</p>
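<p>Here is a runnable sketch of the <code>l_before</code>/<code>l_after</code> experiment for both cases. The <code>new</code> constructor and the hand-written <code>Clone</code> impl on <code>SharedVec</code> are assumptions added so that the example is self-contained, and the <code>Vec</code> version needs a <code>mut</code> binding for <code>v2</code>:</p>

```rust
use std::sync::{Arc, Mutex};

struct SharedVec<T> {
    data: Arc<Mutex<Vec<T>>>,
}

impl<T> SharedVec<T> {
    // Hypothetical constructor, added for this sketch.
    fn new(items: Vec<T>) -> Self {
        SharedVec { data: Arc::new(Mutex::new(items)) }
    }

    fn push(&self, value: T) {
        self.data.lock().unwrap().push(value);
    }

    fn len(&self) -> usize {
        self.data.lock().unwrap().len()
    }
}

// Assumed `Clone` impl: cloning yields a second handle to the same vector.
impl<T> Clone for SharedVec<T> {
    fn clone(&self) -> Self {
        SharedVec { data: Arc::clone(&self.data) }
    }
}

/// `v1` is a plain `Vec`: the clone is an independent value.
fn lens_with_vec() -> (usize, usize) {
    let v1: Vec<u32> = vec![1, 2, 3];
    let l_before = v1.len();
    let mut v2 = v1.clone();
    v2.push(4);
    let l_after = v1.len();
    (l_before, l_after)
}

/// `v1` is a `SharedVec`: the clone is entangled with the original.
fn lens_with_shared_vec() -> (usize, usize) {
    let v1 = SharedVec::new(vec![1, 2, 3]);
    let l_before = v1.len();
    let v2 = v1.clone();
    v2.push(4);
    let l_after = v1.len();
    (l_before, l_after)
}

fn main() {
    println!("Vec:       {:?}", lens_with_vec()); // (3, 3)
    println!("SharedVec: {:?}", lens_with_shared_vec()); // (3, 4)
}
```

<p>The two functions are line-for-line the same apart from the type of <code>v1</code>, and yet they produce different answers. That difference is the entanglement.</p>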
<h2 id="we-have-a-name-for-this-concept-already-handles">We have a name for this concept already: handles</h2>
<p>Jack&rsquo;s insight was that we should focus on the <em>semantic concept</em> (sharing) and not on the operational details (how it&rsquo;s implemented). This makes it clear when the trait ought to be implemented. I liked this idea a lot, although I eventually decided I didn&rsquo;t like the name <code>Share</code>. The word isn&rsquo;t specific enough, I felt, and users might not realize it referred to a specific concept: &ldquo;shareable types&rdquo; doesn&rsquo;t really sound right. But in fact there <em>is</em> a name already in common use for this concept: handles (see e.g. <a href="https://docs.rs/tokio/latest/tokio/runtime/struct.Handle.html"><code>tokio::runtime::Handle</code></a>).</p>
<p>This is how I arrived at my proposed name and definition for The Trait, which is <code>Handle</code>:<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="sd">/// Indicates that this type is a *handle* to some
</span></span></span><span class="line"><span class="cl"><span class="sd">/// underlying resource. The `handle` method is
</span></span></span><span class="line"><span class="cl"><span class="sd">/// used to get a fresh handle.
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Handle</span>: <span class="nb">Clone</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kr">final</span><span class="w"> </span><span class="k">fn</span> <span class="nf">handle</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Clone</span>::<span class="n">clone</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="we-would-lint-and-advice-people-to-call-handle">We would lint and advise people to call <code>handle</code></h2>
<p>The <code>Handle</code> trait includes a method <code>handle</code> which is <em>always</em> equivalent to <code>clone</code>. The purpose of this method is to signal to the reader that the result is a second handle to the same underlying value.</p>
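<p>Since <code>final</code> methods are not available on stable Rust, here is one rough way to approximate the same guarantee today. This is a sketch under assumptions, not the proposed design: the split into a marker trait plus a blanket-implemented <code>HandleExt</code> extension trait (and the name <code>HandleExt</code> itself) is purely illustrative.</p>

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Marker trait: types opt in to being "handles".
trait Handle: Clone {}

/// Hypothetical extension trait that carries the method. Because of the
/// blanket impl below, no other impl of `HandleExt` can exist, so
/// `handle` is always equivalent to `clone`.
trait HandleExt: Handle {
    fn handle(&self) -> Self {
        Clone::clone(self)
    }
}

impl<T: Handle> HandleExt for T {}

// Opt-in impls mirroring the ref-counted pointers discussed in this post.
impl<T: ?Sized> Handle for Rc<T> {}
impl<T: ?Sized> Handle for Arc<T> {}

/// Returns the strong count after taking a second handle, and whether
/// both handles point at the same allocation.
fn second_handle_to_same_value() -> (usize, bool) {
    let a: Rc<String> = Rc::new(String::from("config"));
    let b = a.handle();
    (Rc::strong_count(&a), Rc::ptr_eq(&a, &b))
}

fn main() {
    let (count, same) = second_handle_to_same_value();
    println!("strong count = {count}, same allocation = {same}");
}
```

<p>Users can opt in with an empty <code>impl Handle for MyType {}</code> but cannot change what <code>handle</code> does, which is the property that <code>final</code> would give directly.</p>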
<p>Once the <code>Handle</code> trait exists, we should lint on calls to <code>clone</code> when the receiver is known to implement <code>Handle</code> and encourage folks to call <code>handle</code> instead:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">DataStore</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">store_map</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">map</span>: <span class="kp">&amp;</span><span class="nc">Arc</span><span class="o">&lt;</span><span class="n">HashMap</span><span class="o">&lt;..</span><span class="p">.</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">stored_map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                    -----
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Lint: convert `clone` to `handle` for
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// greater clarity.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Compare the above to the version that the lint suggests, using <code>handle</code>, and I think you will get an idea for how <code>handle</code> increases clarity of what is happening:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">DataStore</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">store_map</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">map</span>: <span class="kp">&amp;</span><span class="nc">Arc</span><span class="o">&lt;</span><span class="n">HashMap</span><span class="o">&lt;..</span><span class="p">.</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">stored_map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">handle</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="what-it-means-to-be-a-handle">What it means to be a <em>handle</em></h2>
<p>The defining characteristic of a <em>handle</em> is that, when cloned, it results in a second value that accesses the same underlying value. This means that the two handles are &ldquo;entangled&rdquo;, with interior mutation that affects one handle showing up in the other. Reflecting this, most handles have APIs that consist exclusively or almost exclusively of <code>&amp;self</code> methods, since having unique access to the <em>handle</em> does not necessarily give you unique access to the <em>value</em>.</p>
<p>Handles are generally only significant, semantically, when interior mutability is involved. There&rsquo;s nothing <em>wrong</em> with having two handles to an immutable value, but it&rsquo;s not generally distinguishable from two copies of the same value. This makes persistent collections an interesting grey area: I would probably implement <code>Handle</code> for something like <code>im::Vec&lt;T&gt;</code>, particularly since something like a <code>im::Vec&lt;Cell&lt;u32&gt;&gt;</code> <em>would</em> make entanglement visible, but I think there&rsquo;s an argument against it.</p>
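<p>A small sketch of why interior mutability is what makes entanglement observable. To stay self-contained it uses <code>Rc&lt;Vec&lt;...&gt;&gt;</code> from the standard library as a stand-in for a persistent collection like <code>im::Vec</code>; that substitution is an assumption of the example, and the function names are made up for illustration:</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

/// With an immutable payload, a second handle and a deep copy compare
/// equal; only identity checks such as `Rc::ptr_eq` could tell them apart.
fn immutable_handles_look_like_copies() -> bool {
    let a: Rc<Vec<u32>> = Rc::new(vec![1, 2, 3]);
    let handle = Rc::clone(&a); // second handle to the same vector
    let copy = Rc::new((*a).clone()); // independent deep copy
    *a == *handle && *a == *copy
}

/// With `Cell` elements, a write through one handle is visible through
/// the other handle but not through the deep copy.
fn entangled_handles() -> (u32, u32) {
    let a: Rc<Vec<Cell<u32>>> = Rc::new(vec![Cell::new(1)]);
    let handle = Rc::clone(&a); // entangled with `a`
    let copy = Rc::new((*a).clone()); // not entangled
    a[0].set(42);
    (handle[0].get(), copy[0].get())
}

fn main() {
    println!("immutable: {}", immutable_handles_look_like_copies()); // true
    println!("with Cell: {:?}", entangled_handles()); // (42, 1)
}
```

<p>Note that everything here goes through shared access to <code>a</code>, in line with the observation that handle APIs tend to consist of <code>&amp;self</code> methods.</p>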
<h2 id="handles-in-the-stdlib">Handles in the stdlib</h2>
<p>In the stdlib, <code>Handle</code> would be implemented for exactly one <code>Copy</code> type (the others are values):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Shared references, when cloned (or copied),
</span></span></span><span class="line"><span class="cl"><span class="c1">// create a second reference:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Handle</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">T</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>It would be implemented for ref-counted pointers (but not <code>Box</code>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Ref-counted pointers, when cloned,
</span></span></span><span class="line"><span class="cl"><span class="c1">// create a second reference:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Handle</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Handle</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Arc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>And it would be implemented for types like channel endpoints, that are implemented with a ref-counted value under the hood:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// mpsc &#34;senders&#34;, when cloned, create a
</span></span></span><span class="line"><span class="cl"><span class="c1">// second sender to the same underlying channel:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Handle</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">mpsc</span>::<span class="n">Sender</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><h2 id="conclusion-a-design-axiom-emerges">Conclusion: a design axiom emerges</h2>
<p>OK, I&rsquo;m going to stop there with this &ldquo;byte-sized&rdquo; blog post. More to come! But before I go, let me lay out what I believe to be a useful &ldquo;design axiom&rdquo; that we should adopt for this design:</p>
<blockquote>
<p><strong>Expose entanglement.</strong> Understanding the difference between a <em>handle</em> to an underlying value and the value itself is necessary to understand how Rust works.</p>
</blockquote>
<p>The phrasing feels a bit awkward, but I think it is the key bit anyway.</p>
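<p>To make the axiom concrete, here is a small sketch using only the standard library. It is an illustration rather than part of the proposal, and the function names are made up for this example. Cloning a <code>Vec</code> yields an independent value, while cloning an <code>Arc</code> or an mpsc <code>Sender</code> yields a second handle to the same underlying thing:</p>

```rust
use std::sync::{mpsc, Arc, Mutex};

/// Cloning a `Vec` copies the data, so pushing to the clone leaves the
/// original untouched. Returns (original length, clone length).
fn clone_a_value() -> (usize, usize) {
    let original = vec![1, 2, 3];
    let mut copy = original.clone();
    copy.push(4);
    (original.len(), copy.len())
}

/// Cloning an `Arc` creates a second handle to the same vector, so a push
/// through one handle is visible through the other: the two are entangled.
fn clone_a_handle() -> (usize, usize) {
    let first = Arc::new(Mutex::new(vec![1, 2, 3]));
    let second = Arc::clone(&first);
    second.lock().unwrap().push(4);
    // Take the locks one at a time; holding both at once would deadlock.
    let first_len = first.lock().unwrap().len();
    let second_len = second.lock().unwrap().len();
    (first_len, second_len)
}

/// Cloning a `Sender` creates a second sender to the same channel, so one
/// receiver sees the messages from both.
fn clone_a_sender() -> Vec<&'static str> {
    let (tx, rx) = mpsc::channel();
    let tx2 = tx.clone();
    tx.send("from tx").unwrap();
    tx2.send("from tx2").unwrap();
    // Drop both senders so the receiver's iterator terminates.
    drop(tx);
    drop(tx2);
    rx.iter().collect()
}

fn main() {
    println!("value:  {:?}", clone_a_value());
    println!("handle: {:?}", clone_a_handle());
    println!("sender: {:?}", clone_a_sender());
}
```

<p>The signatures look identical at the call site (<code>.clone()</code> in all three cases), which is exactly why I think the distinction deserves to be exposed.</p>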
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>That, my friends, is <em>foreshadowing</em>. Damn I&rsquo;m good.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I described <code>Claim</code> as a kind of &ldquo;lightweight clone&rdquo; but in the Unconf someone pointed out that &ldquo;heavyweight copy&rdquo; was probably a better description of what I was going for.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>And, not coincidentally, the types where cloning leads to entanglement tend to also be the types where cloning is cheap.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>and functional programming&hellip;&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>The &ldquo;final&rdquo; keyword was proposed by Josh Triplett in RFC 3678. It means that impls cannot change the definition of <code>Handle::handle</code>. There&rsquo;s been some back-and-forth on whether it ought to be renamed or made more general or what have you; all I know is, I find it an incredibly useful concept for cases like this, where you want users to be able to opt-in to a method being <em>available</em> but <em>not</em> be able to change what it does. You can do this in other ways, they&rsquo;re just weirder.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">Symposium: exploring new AI workflows</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/09/24/symposium/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/09/24/symposium/</id><published>2025-09-24T00:00:00+00:00</published><updated>2025-09-24T16:39:46-04:00</updated><content type="html"><![CDATA[<div style="overflow: auto;">
<img src="https://smallcultfollowing.com/babysteps/assets/2025-09-24-symposium/logo-alcove.png" alt="Screenshot of the Symposium app" width="25%" style="float: left; margin-right: 15px; margin-bottom: 10px;"/>
<p>This blog post gives you a tour of <a href="https://github.com/symposium-dev/symposium">Symposium</a>, a wild-and-crazy project that I&rsquo;ve been obsessed with over the last month or so. Symposium combines an MCP server, a VSCode extension, an OS X Desktop App, and some <a href="https://github.com/symposium-dev/symposium/blob/main/symposium/mcp-server/src/guidance/main.md">mindful prompts</a> to forge new ways of working with agentic CLI tools.</p>
</div>
<p>Symposium is currently focused on my setup, which means it works best with VSCode, Claude, Mac OS X, and Rust. But it&rsquo;s meant to be unopinionated, which means it should be easy to extend to other environments (and in particular it already works great with other programming languages). The goal is not to compete with or replace those tools but to combine them together into something new and better.</p>
<p>In addition to giving you a tour of Symposium, this blog post is an invitation: <a href="https://github.com/symposium-dev/symposium">Symposium is an open-source project</a>, and I&rsquo;m looking for people to explore with me! If you are excited about the idea of inventing new styles of AI collaboration, join the <a href="https://symposium-dev.zulipchat.com">symposium-dev Zulip</a>. Let&rsquo;s talk!</p>
<h2 id="demo-video">Demo video</h2>
<p>I&rsquo;m not normally one to watch videos online. But in this particular case, I do think a movie is going to be worth 1,000,000 words. Therefore, I&rsquo;m embedding a short video (6min) demonstrating how Symposium works below. Check it out! But don&rsquo;t worry, if videos aren&rsquo;t your thing, you can just read the rest of the post instead.</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/gSGYYdrTFUk?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<p>Alternatively, if you <em>really</em> love videos, you can watch the <a href="https://youtu.be/HQcIp-IBj0Q">first version I made, which went into more depth</a>. That version came in at 20 minutes, which I decided was&hellip;a bit much. 😁</p>
<h2 id="taskspaces-let-you-juggle-concurrent-agents">Taskspaces let you juggle concurrent agents</h2>
<p>The Symposium story begins with <code>Symposium.app</code>, an OS X desktop application for managing <em>taskspaces</em>. A taskspace is a clone of your project<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> paired with an agentic CLI tool that is assigned to complete some task.</p>
<p>My observation has been that most people doing AI development spend a lot of time waiting while the agent does its thing. Taskspaces let you switch quickly back and forth.</p>
<p>Before I was using taskspaces, I was doing this by jumping between different projects. I found that was really hurting my brain from context switching. But jumping between <em>tasks</em> in a project is much easier. I find it works best to pair a complex topic with some simple refactorings.</p>
<p>Here is what it looks like to use Symposium:</p>
<img src="https://smallcultfollowing.com/babysteps/assets/2025-09-24-symposium/taskspaces.png" alt="Screenshot of the Symposium app" width="100%"/>
<p>Each of those boxes is a taskspace. It has both its own isolated directory on the disk and an associated VSCode window. When you click on the taskspace, the app brings that window to the front. It can also hide other windows by positioning them exactly behind the first one in a stack<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. So it&rsquo;s kind of like a mini window manager.</p>
<p>Within each VSCode window, there is a terminal running an agentic CLI tool that has the Symposium <a href="https://modelcontextprotocol.io/docs/getting-started/intro">MCP server</a>. If you&rsquo;re not familiar with MCP, it&rsquo;s a way for an LLM to invoke custom tools; it basically just gives the agent a list of available tools and a JSON schema for what arguments they expect.</p>
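<p>To give a feel for what &ldquo;a list of tools plus a schema for their arguments&rdquo; looks like, here is a minimal sketch in plain Rust. The <code>log_progress</code> tool and its parameters are hypothetical, not Symposium&rsquo;s actual tool definitions, and the JSON is built by hand to keep the example dependency-free. The <code>name</code>, <code>description</code>, and <code>inputSchema</code> keys follow the shape of MCP tool listings as I understand them:</p>

```rust
/// One argument a tool accepts (a simplified, hypothetical model).
struct Param {
    name: &'static str,
    json_type: &'static str,
    required: bool,
}

/// A tool as advertised to the agent: a name, a description, and parameters.
struct Tool {
    name: &'static str,
    description: &'static str,
    params: Vec<Param>,
}

impl Tool {
    /// Render the descriptor as JSON, with the parameters expressed as a
    /// JSON Schema object. Assumes no string needs JSON escaping.
    fn to_json(&self) -> String {
        let properties: Vec<String> = self
            .params
            .iter()
            .map(|p| format!("\"{}\":{{\"type\":\"{}\"}}", p.name, p.json_type))
            .collect();
        let required: Vec<String> = self
            .params
            .iter()
            .filter(|p| p.required)
            .map(|p| format!("\"{}\"", p.name))
            .collect();
        format!(
            "{{\"name\":\"{}\",\"description\":\"{}\",\"inputSchema\":{{\"type\":\"object\",\"properties\":{{{}}},\"required\":[{}]}}}}",
            self.name,
            self.description,
            properties.join(","),
            required.join(",")
        )
    }
}

/// A hypothetical tool for posting progress logs from a taskspace.
fn example_tool() -> Tool {
    Tool {
        name: "log_progress",
        description: "Post a progress message for the current taskspace",
        params: vec![
            Param { name: "message", json_type: "string", required: true },
            Param { name: "category", json_type: "string", required: false },
        ],
    }
}

fn main() {
    println!("{}", example_tool().to_json());
}
```

<p>The agent never sees the implementation, only this description, so the quality of the names and descriptions matters a lot.</p>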
<p>The Symposium MCP server does a bunch of things&ndash;we&rsquo;ll talk about more of them later&ndash;but one of them is that it lets the agent interact with taskspaces. The agent can use the MCP server to post logs and signal progress (you can see the logs in that screenshot); it can also spawn new taskspaces. I find that last part very handy.</p>
<p>It often happens to me that while working on one idea, I find opportunities for cleanups or refactorings. Nowadays I just spawn out a taskspace with a quick description of the work to be done. Next time I&rsquo;m bored, I can switch over and pick that up.</p>
<h2 id="an-aside-the-symposium-app-is-written-in-swift-a-language-i-did-not-know-3-weeks-ago">An aside: the Symposium app is written in Swift, a language I did not know 3 weeks ago</h2>
<p>It&rsquo;s probably worth mentioning that the Symposium app is written in Swift. I did not know Swift three weeks ago. But I&rsquo;ve now written about 6K lines and counting. I feel like I&rsquo;ve got a pretty good handle on how it works.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>Well, it&rsquo;d be more accurate to say that I have <em>reviewed</em> about 6K lines, since most of the time Claude generates the code. I mostly read it and offer suggestions for improvement<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. When I do dive in and edit the code myself, it&rsquo;s interesting because I find I don&rsquo;t have the muscle memory for the syntax. I think this is pretty good evidence for the fact that agentic tools help you get started in a new programming language.</p>
<h2 id="walkthroughs-let-ais-explain-code-to-you">Walkthroughs let AIs explain code to you</h2>
<p>So, while taskspaces let you jump between tasks, the rest of Symposium is dedicated to helping you complete an individual task. A big part of that is trying to go beyond the limits of the CLI interface by connecting the agent up to the IDE. For example, the Symposium MCP server has a tool called <code>present_walkthrough</code> which lets the agent present you with a markdown document that explains how some code works. These walkthroughs show up in a side panel in VSCode:</p>
<img src="https://smallcultfollowing.com/babysteps/assets/2025-09-24-symposium/walkthrough.png" alt="Walkthrough screenshot" width="100%"/>
<p>As you can see, the walkthroughs can embed mermaid, which is pretty cool. It&rsquo;s sometimes so clarifying to see a flowchart or a sequence diagram.</p>
<p>Walkthroughs can also embed <em>comments</em>, which are anchored to particular parts of the code. You can see one of those in the screenshot too, on the right.</p>
<p>Each comment has a Reply button that lets you respond to the comment with further questions or suggest changes; you can also select random bits of text and use the &ldquo;code action&rdquo; called &ldquo;Discuss in Symposium&rdquo;. Both of these take you back to the terminal where your agent is running. They embed a little bit of XML (<code>&lt;symposium-ref id=&quot;...&quot;/&gt;</code>) and then you can just type as normal. The agent can then use another MCP tool to expand that reference to figure out what you are referring to or what you are replying to.</p>
<p>To some extent, this &ldquo;reference the thing I&rsquo;ve selected&rdquo; functionality is &ldquo;table stakes&rdquo;, since Claude Code already does it. But Symposium&rsquo;s version works anywhere (Q CLI doesn&rsquo;t have that functionality, for example) and, more importantly, it lets you embed multiple references at once. I&rsquo;ve found that to be really useful. Sometimes I&rsquo;ll wind up with a message that is replying to one comment while referencing two or three other things, and the <code>&lt;symposium-ref/&gt;</code> system lets me do that no problem.</p>
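<p>The underlying idea is simple enough to sketch. What follows is my illustration of the pattern, not Symposium&rsquo;s actual code: the <code>RefStore</code> type, the <code>Selection</code> fields, and the <code>ref-N</code> id format are all assumptions. The IDE side stores each selection under a short id, the message carries only the ids, and the agent&rsquo;s &ldquo;expand&rdquo; tool looks them up again:</p>

```rust
use std::collections::HashMap;

/// What a reference points at (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Selection {
    file: String,
    line: u32,
    text: String,
}

/// Keeps selections on the IDE side so the message only carries short ids.
#[derive(Default)]
struct RefStore {
    next_id: u32,
    refs: HashMap<String, Selection>,
}

impl RefStore {
    /// Store a selection and return the XML snippet to paste into the terminal.
    fn create(&mut self, selection: Selection) -> String {
        self.next_id += 1;
        let id = format!("ref-{}", self.next_id);
        self.refs.insert(id.clone(), selection);
        format!("<symposium-ref id=\"{}\"/>", id)
    }

    /// What the agent's "expand this reference" tool call would look up.
    fn expand(&self, id: &str) -> Option<&Selection> {
        self.refs.get(id)
    }
}

/// Collect every reference id mentioned in a message, in order of appearance.
fn ids_in(message: &str) -> Vec<String> {
    let marker = "<symposium-ref id=\"";
    let mut ids = Vec::new();
    let mut rest = message;
    while let Some(start) = rest.find(marker) {
        let after = &rest[start + marker.len()..];
        match after.find('"') {
            Some(end) => {
                ids.push(after[..end].to_string());
                rest = &after[end..];
            }
            None => break, // unterminated attribute: stop scanning
        }
    }
    ids
}

fn main() {
    let mut store = RefStore::default();
    let a = store.create(Selection {
        file: "src/main.rs".into(),
        line: 10,
        text: "fn main()".into(),
    });
    let b = store.create(Selection {
        file: "src/lib.rs".into(),
        line: 42,
        text: "pub fn run()".into(),
    });
    // One message can carry several references at once.
    let message = format!("Why does {} call {} directly?", a, b);
    for id in ids_in(&message) {
        if let Some(sel) = store.expand(&id) {
            println!("{} -> {}:{} ({})", id, sel.file, sel.line, sel.text);
        }
    }
}
```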
<h2 id="integrating-with-ide-knowledge">Integrating with IDE knowledge</h2>
<p>Symposium also includes an <code>ide-operations</code> tool that lets the agent connect to the IDE to do things like &ldquo;find definitions&rdquo; or &ldquo;find references&rdquo;. To be honest I haven&rsquo;t noticed this being that important (Claude is surprisingly handy with awk/sed) but I also haven&rsquo;t done much tinkering with it. I know there are other MCP servers out there too, like <a href="https://github.com/oraios/serena">Serena</a>, so maybe the right answer is just to import one of those, but I think there&rsquo;s a lot of interesting stuff we <em>could</em> do here by integrating deeper knowledge of the code, so I have been trying to keep it &ldquo;in house&rdquo; for now.</p>
<h2 id="leveraging-rust-conventions">Leveraging Rust conventions</h2>
<p>Continuing our journey down the stack, let&rsquo;s look at one more bit of functionality: MCP tools aimed at making agents better at working with Rust code. By far the most effective of these so far is one I call <a href="https://symposium-dev.github.io/symposium/design/mcp-tools/rust-development.html#get_rust_crate_source"><code>get_rust_crate_source</code></a>. It is very simple: given the name of a crate, it just checks out the code into a temporary directory for the agent to use. Well, actually, it does a <em>bit</em> more than that. If the agent supplies a search string, it also searches for that string so as to give the agent a &ldquo;head start&rdquo; in finding the relevant code, and it makes a point to highlight code in the examples directory in particular.</p>
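<p>The &ldquo;search and put examples first&rdquo; part can be sketched in a few lines. This is a simplified illustration, not the real tool: it works on in-memory <code>(path, contents)</code> pairs instead of a checked-out crate, does plain substring matching, and the <code>Hit</code> type and <code>search_crate</code> function are names I made up for the example:</p>

```rust
/// One matching line in the crate source (hypothetical shape).
#[derive(Debug, PartialEq)]
struct Hit {
    path: String,
    line_number: usize,
    in_examples: bool,
}

/// Find every line containing `needle`, listing hits under `examples/` first.
fn search_crate(files: &[(&str, &str)], needle: &str) -> Vec<Hit> {
    let mut hits = Vec::new();
    for (path, contents) in files {
        let in_examples = path.starts_with("examples/");
        for (index, line) in contents.lines().enumerate() {
            if line.contains(needle) {
                hits.push(Hit {
                    path: path.to_string(),
                    line_number: index + 1, // report 1-based line numbers
                    in_examples,
                });
            }
        }
    }
    // Stable sort: example hits move to the front, other order is preserved.
    hits.sort_by_key(|hit| !hit.in_examples);
    hits
}

fn main() {
    let files = [
        ("src/lib.rs", "pub fn connect() {}\npub fn close() {}"),
        ("examples/basic.rs", "fn main() {\n    mycrate::connect();\n}"),
    ];
    for hit in search_crate(&files, "connect") {
        println!("{}:{} (example: {})", hit.path, hit.line_number, hit.in_examples);
    }
}
```

<p>Putting example code first is the point: a working call site usually tells the agent more than the definition does.</p>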
<h2 id="we-could-do-a-lot-more-with-rust">We could do a lot more with Rust&hellip;</h2>
<p>My experience has been that this tool makes all the difference. Without it, Claude just generates plausible-looking APIs that don&rsquo;t really exist. With it, Claude generally figures out exactly what to do. But really it&rsquo;s just scratching the surface of what we can do. I am excited to go deeper here now that the basic structure of Symposium is in place &ndash; for example, I&rsquo;d love to develop Rust-specific code reviewers that can critique the agent&rsquo;s code or offer it architectural advice<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>, or a tool like <a href="https://github.com/plasma-umass/CWhy">CWhy</a> to help people resolve Rust trait errors or macro problems.</p>
<h2 id="and-can-we-decentralize-it">&hellip;and can we decentralize it?</h2>
<p>But honestly what I&rsquo;m <em>most</em> excited about is the idea of <strong>decentralizing</strong>. I want Rust library authors to have a standard way to attach custom guidance and instructions that will help agents use their library. I want an AI-enhanced variant of <code>cargo upgrade</code> that automatically bridges over major versions, making use of crate-supplied metadata about what changed and what rewrites are needed. Heck, I want libraries to be able to ship with MCP servers implemented in WASM (<a href="https://opensource.microsoft.com/blog/2025/08/06/introducing-wassette-webassembly-based-tools-for-ai-agents/">Wassette</a>, anyone?) so that Rust developers using that library can get custom commands and tools for working with it. I don&rsquo;t 100% know what this looks like but I&rsquo;m keen to explore it. If there&rsquo;s one thing I&rsquo;ve learned from Rust, it&rsquo;s always bet on the ecosystem.</p>
<h2 id="looking-further-afield-can-we-use-agents-to-help-humans-collaborate-better">Looking further afield, can we use agents to help humans collaborate better?</h2>
<p>One of the things I am very curious to explore is how we can use agents to help humans collaborate better. It&rsquo;s oft observed that coding with agents can be a bit lonely<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>. But I&rsquo;ve also noticed that structuring a project for AI consumption requires relatively decent documentation. For example, one of the things I did recently for Symposium was to create a Request for Dialogue (RFD) process &ndash; a simplified version of Rust&rsquo;s RFC process. My motivation was partly in anticipation of trying to grow a community of contributors, but it was also because most every major refactoring or feature work I do begins with iterating on docs. The doc becomes a central tracking record so that I can clear the context and rest assured that I can pick up where I left off. But a nice side-effect is that the project has more docs than you might expect, considering, and I hope that will make it easier to dive in and get acquainted.</p>
<p>And what about other things? Like, I think that taskspaces should really be associated with github issues. If we did that, could we do a better job at helping new contributors pick up an issue? Or at providing mentoring instructions to get started?</p>
<p>What about memory? I really want to add in some kind of automated memory system that accumulates knowledge about the system more automatically. But could we then share that knowledge (or a subset of it) across users, so that when I go to hack on a project, I am able to &ldquo;bootstrap&rdquo; with the accumulated observations of other people who&rsquo;ve been working on it?</p>
<p>Can agents help in guiding and shepherding design conversations? At work, when I&rsquo;m circulating a document, I will typically download a copy of that document with people&rsquo;s comments embedded in it. Then I&rsquo;ll use pandoc to convert that into Markdown with HTML comments and then ask Claude to read it over and help me work through the comments systematically. Could we do similar things to manage unwieldy RFC threads?</p>
<p>This is part of what gets me excited about AI. I mean, don&rsquo;t get me wrong. I&rsquo;m scared too. There&rsquo;s no question that the spread of AI will change a lot of things in our society, and definitely not always for the better. But it&rsquo;s also a huge opportunity. AI is empowering! Suddenly, learning new things is just <em>vastly</em> easier. And when you think about the potential for integrating AI into community processes, I think that it could easily be used to bring us closer together and maybe even to make progress on previously intractable problems in open-source<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>.</p>
<h2 id="conclusion-want-to-build-something-cool">Conclusion: Want to build something cool?</h2>
<p>As I said in the beginning, this post is two things. Firstly, it&rsquo;s an advertisement for Symposium. If you think the stuff I described sounds cool, give Symposium a try! You can find <a href="https://symposium-dev.github.io/symposium/install.html">installation instructions</a> here. I gotta warn you, as of this writing, I think I&rsquo;m the only user, so I would not at all be surprised to find out that there&rsquo;s bugs in setup scripts etc. But hey, try it out, find bugs and tell me about them! Or better yet, fix them!</p>
<p>But secondly, and more importantly, this blog post is an invitation to come out and play<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>. I&rsquo;m keen to have more people come and hack on Symposium. There&rsquo;s so much we could do! I&rsquo;ve identified a number of <a href="">&ldquo;good first issue&rdquo; bugs</a>. Or, if you&rsquo;re keen to take on a larger project, I&rsquo;ve got a set of invited &ldquo;Request for Dialogue&rdquo; projects you could pick up and make your own. And if none of that suits your fancy, feel free to pitch your own project &ndash; just join the <a href="https://symposium-dev.zulipchat.com">Zulip</a> and open a topic!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Technically, a git worktree.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>That&rsquo;s what the &ldquo;Stacked&rdquo; box does; if you uncheck it, the windows can be positioned however you like. I&rsquo;m also working on a tiled layout mode.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Well, mostly. I still have some warnings about something or other not being threadsafe that I&rsquo;ve been ignoring. Claude assures me they are not a big deal (Claude can be so lazy omg).&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Mostly: &ldquo;Claude will you please for the love of God stop copying every function ten times.&rdquo;&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>E.g., don&rsquo;t use a tokio mutex you fool, <a href="https://ryhl.io/blog/actors-with-tokio/">use an actor</a>. That is one particular bit of advice I&rsquo;ve given more than once.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>I&rsquo;m kind of embarrassed to admit that Claude&rsquo;s dad jokes have managed to get a laugh out of me on occasion, though.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Narrator voice: <em>burnout. he means maintainer burnout.</em>&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>Tell me you went to high school in the 90s without telling me you went to high school in the 90s.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content></entry><entry><title type="html">Rust, Python, and TypeScript: the new trifecta</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/07/31/rs-py-ts-trifecta/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/07/31/rs-py-ts-trifecta/</id><published>2025-07-31T00:00:00+00:00</published><updated>2025-07-31T09:52:16-04:00</updated><content type="html"><![CDATA[<p>You heard it here first: my guess is that Rust, Python, and TypeScript are going to become the dominant languages going forward (excluding the mobile market, which has extra wrinkles). The argument is simple. Increasing use of AI coding is going to weaken people&rsquo;s loyalty to programming languages, moving it from what is often a tribal decision to one based on fundamentals. And the fundamentals for those 3 languages look pretty strong to me: Rust targets system software or places where efficiency is paramount. Python brings a powerful ecosystem of mathematical and numerical libraries to bear and lends itself well to experimentation and prototyping. And TypeScript, of course, is compiled to JavaScript, which runs natively in browsers and a number of other places. And all of them, at least if set up properly, offer strong static typing and the easy use of dependencies. Let&rsquo;s walk through the argument point by point.</p>
<h2 id="ai-is-moving-us-towards-idea-oriented-programming">AI is moving us towards <em>idea-oriented programming</em></h2>
<p>Building with an LLM is presently a rather uneven experience, but I think the long-term trend is clear enough. We are seeing a shift towards a new programming paradigm. Dave Herman and I have recently taken to calling it <strong>idea-oriented programming</strong>. As the name suggests, <em>idea-oriented programming</em> is <em>programming where you are focused first and foremost on <strong>ideas</strong> behind your project</em>.</p>
<p>Why do I say <em>idea-oriented programming</em> and not <em>vibe coding</em>? To me, they are different beasts. Vibe coding suggests a kind of breezy indifference to the specifics &ndash; kind of waving your hand vaguely at the AI and saying &ldquo;do something like this&rdquo;.
That smacks of <a href="https://smallcultfollowing.com/babysteps/blog/2025/07/24/collaborative-ai-prompting/">treating the AI like a genie</a> &ndash; or perhaps a servant, neither of which I think is useful.</p>
<h2 id="idea-oriented-programming-is-very-much-programming">Idea-oriented programming is very much <strong>programming</strong></h2>
<p>Idea-oriented programming, in contrast, is definitely <strong>programming</strong>. But your role is different. As the programmer, you&rsquo;re more like the chief architect. Your coding tools are like your apprentices. You are thinking about the goals and the key aspects of the design. You lay out a crisp plan and delegate the heavy lifting to the tools &ndash; and then you review their output, making tweaks and, importantly, generalizing those tweaks into persistent principles. When some part of the problem gets tricky, you roll up your sleeves and do some hands-on debugging and problem solving.</p>
<p>If you&rsquo;ve been in the industry a while, this description will be familiar. It&rsquo;s essentially the role of a Principal Engineer. It&rsquo;s also a solid description of what I think an open-source mentor ought to do.</p>
<h2 id="idea-oriented-programming-changes-the-priorities-for-language-choice">Idea-oriented programming changes the priorities for language choice</h2>
<p>In the past, when I built software projects, I would default to Rust. It&rsquo;s not that Rust is the best choice for everything. It&rsquo;s that I know Rust best, and so I move the fastest when I use it. I would only adopt a different language if it offered a compelling advantage (or of course if I just wanted to try a new language, which I do enjoy).</p>
<p>But when I&rsquo;m building things with an AI assistant, I&rsquo;ve found I think differently. I&rsquo;m thinking more about what libraries are available, what my fundamental performance needs are, and what platforms I expect to integrate with. I want things to be as straightforward and high-level as I can get them, because that will give the AI the best chance of success and minimize my need to dig in. The result is that I wind up with a mix of Python (when I want access to machine-learning libraries), TypeScript (when I&rsquo;m building a web app, VSCode Extension, or something else where the native APIs are in TypeScript), and Rust otherwise.</p>
<p>Why Rust as the default? Well, I like it of course, but more importantly I know that its type system will catch errors up front and I know that its overall design will result in performant code that uses relatively little memory. If I am then going to run that code in the cloud, that will lower my costs, and if I&rsquo;m running it on my desktop, it&rsquo;ll give more RAM for Microsoft Outlook to consume.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<h2 id="type-systems-are-hugely-important-for-idea-oriented-programming">Type systems are hugely important for idea-oriented programming</h2>
<p>LLMs kind of turn the tables on what we expect from a computer. Typical computers can cross-reference vast amounts of information and perform deterministic computations lightning fast, but falter with even a whiff of ambiguity. LLMs, in contrast, can be surprisingly creative and thoughtful, but they have limited awareness of things that are not right in front of their face, unless they correspond to some pattern that is ingrained from training. They&rsquo;re a lot more like humans that way. And the technologies we have for dealing with that, like RAG or memory MCP servers, are mostly about trying to put things in front of their face that they might find useful.</p>
<p>But of course programmers have evolved a way to cope with humans&rsquo; narrow focus: type systems, and particularly advanced type systems. Basic type systems catch small mistakes, like arguments of the wrong type. But more advanced type systems, like the ones in Rust and TypeScript, also capture domain knowledge and steer you down a path of success: using a Rust enum, for example, captures both which state your program is in and the data that is relevant to that state. This means that you can&rsquo;t accidentally read a field that isn&rsquo;t relevant at the moment. This is important for you, but it&rsquo;s even more important for your AI collaborator(s), because they don&rsquo;t have the comprehensive memory that you do, and are quite unlikely to remember those kinds of things.</p>
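<p>Here is a small sketch of what I mean by the enum point. The <code>Connection</code> type and its variants are invented for illustration. Each state carries only the data that makes sense in that state, so there is simply no <code>session_id</code> to read while you are still connecting:</p>

```rust
/// The state of a connection, with the data relevant to each state.
enum Connection {
    Disconnected,
    Connecting { attempt: u32 },
    Connected { session_id: String },
}

/// Describe the connection. The `match` must cover every state, and each arm
/// can only see the fields that exist in that state.
fn describe(conn: &Connection) -> String {
    match conn {
        Connection::Disconnected => "offline".to_string(),
        Connection::Connecting { attempt } => format!("connecting (attempt {})", attempt),
        Connection::Connected { session_id } => format!("online as {}", session_id),
    }
}

fn main() {
    let states = [
        Connection::Disconnected,
        Connection::Connecting { attempt: 2 },
        Connection::Connected { session_id: "abc123".to_string() },
    ];
    for state in &states {
        println!("{}", describe(state));
    }
}
```

<p>If someone (human or AI) later adds a fourth state, the compiler points at every <code>match</code> that needs updating, which is exactly the kind of thing nobody remembers to do by hand.</p>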
<p>Notably, Rust, TypeScript, and Python all have pretty decent type systems. For Python you have to set things up to use mypy and pydantic.</p>
<h2 id="ecosystems-and-package-managers-are-more-important-than-ever">Ecosystems and package managers are more important than ever</h2>
<p>Ecosystems and package managers are also hugely important to idea-oriented programming. Of course, having a powerful library to build on has always been an accelerator, but it also used to come with a bigger downside, because you had to take the time to get fluent in how the library works. That is much less of an issue now. For example, I have been building a <a href="https://github.com/nikomatsakis/www.family-tree/">family tree application</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> to use with my family. I wanted to add graphical rendering. I talked out the high-level ideas but I was able to lean on Claude to manage the use of the d3 library &ndash; it turned out beautifully!</p>
<p>Notably, Rust, TypeScript, and Python all have pretty decent package managers &ndash; <code>cargo</code>, <code>npm</code>, and <code>uv</code> respectively (both TS and Python have other options, I&rsquo;ve not evaluated those in depth).</p>
<h2 id="syntactic-papercuts-and-non-obvious-workarounds-matter-less-but-error-messages-and-accurate-guidance-are-still-important">Syntactic papercuts and non-obvious workarounds matter less, but error messages and accurate guidance are still important</h2>
<p>In 2016, Aaron Turon and I gave a <a href="https://www.youtube.com/watch?v=pTQxHIzGqFI">RustConf keynote</a> advocating for the <a href="https://blog.rust-lang.org/2017/03/02/lang-ergonomics/">Ergonomics Initiative</a>. Our basic point was that there were (and are) a lot of errors in Rust that are simple to solve &ndash; but only if you know the trick. If you don&rsquo;t know the trick, they can be complete blockers, and can lead you to abandon the language altogether, even if the answer to your problem was just to add a <code>*</code> in the right place.</p>
<p>In Rust, we&rsquo;ve put a lot of effort into addressing those, either by changing the language or, more often, by changing our error messages to guide you to success. What I&rsquo;ve observed is that, with Claude, the calculus is different. Some of these mistakes it simply never makes. Others it makes but then, based on the error message, is able to quickly correct. And this is fine. If I were writing the code by hand, I would get annoyed having to apply the same repetitive changes over and over again (add <code>mut</code>, ok, no, take it away, etc etc). But if Claude is doing it, I don&rsquo;t care so much, and maybe I get some added benefit &ndash; e.g., now I have a clearer indication of which variables are declared as <code>mut</code>.</p>
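<p>For a flavor of the papercuts I have in mind, here are two tiny functions. They are my own illustration, not examples from the keynote. Each one compiles only once you apply a small fix that is obvious if you know the trick and baffling if you don&rsquo;t:</p>

```rust
/// Counts the values strictly greater than `threshold`.
fn count_large(values: &[i32], threshold: i32) -> usize {
    // `iter()` yields `&i32`, and `filter` hands the closure a reference to
    // each item, so `x` is a `&&i32`. Writing `x > threshold` does not
    // compile; the fix is the two `*`s.
    values.iter().filter(|x| **x > threshold).count()
}

/// Sums the values with an explicit loop.
fn sum(values: &[i32]) -> i32 {
    // Without `mut` here, the compiler rejects the `+=` below and suggests
    // adding it.
    let mut total = 0;
    for value in values {
        total += value;
    }
    total
}

fn main() {
    let values = [1, 5, 10];
    println!("large: {}", count_large(&values, 4));
    println!("sum:   {}", sum(&values));
}
```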
<p>But all of this only works if Claude <em>can</em> fix the problems &ndash; either because it knows from training or because the errors are good enough to guide it to success. One thing I&rsquo;m very interested in, though, is that I think we now have more room to give ambiguous guidance (e.g., here are 3 possible fixes, but you have to decide which is best), and have the LLM navigate it.</p>
<h2 id="bottom-line-llms-makes-powerful-tools-more-accessible">Bottom line: LLMs make powerful tools more accessible</h2>
<p>The bottom line is that what enables idea-oriented programming isn&rsquo;t anything fundamentally <em>new</em>. But previously, to work this way, you had to be a Principal Engineer at a big company. In that case, you could let junior engineers sweat it out, reading the docs, navigating the error messages. Now the affordances are all different, and that style of work is much more accessible.</p>
<p>Of course, this does raise some questions. Part of what makes a PE a PE is that they have a wealth of experience to draw on. Can a young engineer do that same style of work? I think yes, but it&rsquo;s going to take some time to find the best way to teach people that kind of judgment. It was never possible before because the tools weren&rsquo;t there.</p>
<p>It&rsquo;s also true that this style of working means you spend less time in that &ldquo;flow state&rdquo; of writing code and fitting the pieces together. Some have said this makes coding &ldquo;boring&rdquo;. I don&rsquo;t find that to be true. I find that I can have a very similar &ndash; maybe even better &ndash; experience by brainstorming and designing with Claude, writing out my plans and RFCs. A lot of the tedium of that kind of ideation is removed since Claude can write up the details, and I can focus on how the big pieces fit together. But this too is going to be an area we explore more over time.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Amazon is migrating to M365, but at the moment, I still receive my email via a rather antiquated Exchange server. I count it a good day if the mail is able to refresh at least once that day, usually it just stalls out.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>My family bears a striking resemblance to the family in My Big Fat Greek Wedding. There are many relatives that I consider myself very close to and yet have basically no idea how we are <em>actually</em> related (well, I didn&rsquo;t, until I set up my family tree app).&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">You won't believe what this AI said after deleting a database (but you might relate)</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/07/24/collaborative-ai-prompting/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/07/24/collaborative-ai-prompting/</id><published>2025-07-24T00:00:00+00:00</published><updated>2025-07-24T14:43:47-04:00</updated><content type="html"><![CDATA[<p>Recently someone forwarded me a PCMag article entitled <a href="https://www.pcmag.com/news/vibe-coding-fiasco-replite-ai-agent-goes-rogue-deletes-company-database">&ldquo;Vibe coding fiasco&rdquo;</a> about an AI agent that &ldquo;went rogue&rdquo;, deleting a company&rsquo;s entire database. This story grabbed my attention right away &ndash; but not because of the damage done. Rather, what caught my eye was how <a href="https://x.com/jasonlk/status/1946069562723897802">absolutely relatable</a> the AI sounded in its responses. &ldquo;I panicked&rdquo;, it admits, and says &ldquo;I thought this meant safe &ndash; it actually meant I wiped everything&rdquo;. The CEO quickly called this behavior &ldquo;unacceptable&rdquo; and said it should &ldquo;never be possible&rdquo;. Huh. It&rsquo;s hard to imagine how we&rsquo;re going to empower AI to edit databases and do real work without having at least the <em>possibility</em> that it&rsquo;s going to go wrong.</p>
<p>It&rsquo;s interesting to compare this exchange to this <a href="https://www.reddit.com/r/cscareerquestions/comments/6ez8ag/accidentally_destroyed_production_database_on/">reddit post from a junior developer who deleted the production database on their first day</a>. I mean, the scenario is basically identical. Now compare the <a href="https://www.reddit.com/r/cscareerquestions/comments/6ez8ag/comment/diec9nd/">response given to that junior developer</a>, &ldquo;In no way was this your fault. Hell this shit <a href="https://aws.amazon.com/message/680587/">happened at Amazon before</a> and the guy is still there.&rdquo;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>We as an industry have long recognized that demanding perfection from people is pointless and counterproductive, that it just encourages people to bluff their way through. That&rsquo;s why we do things like encourage people to share their best <a href="https://news.ycombinator.com/item?id=27644387">&ldquo;I brought down production&rdquo;</a> story. And yet, when the AI makes a mistake, we say it &ldquo;goes rogue&rdquo;. What&rsquo;s wrong with this picture?</p>
<h2 id="ais-make-lackluster-genies-but-they-are-excellent-collaborators">AIs make lackluster genies, but they are excellent collaborators</h2>
<p>To me, this story is a perfect example of how people are misusing, in fact <em>misunderstanding</em>, AI tools. They seem to expect the AI to be some kind of genie, where they can give it some vague instruction, go get a coffee, and come back finding that it met their expectations perfectly.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> Well, I got bad news for ya: that&rsquo;s just not going to work.</p>
<p>AI is the first technology I&rsquo;ve seen where machines actually behave, think, and&ndash;dare I say it?&ndash;even <em>feel</em> in a way that is recognizably <em>human</em>. And that means that, to get the best results, you have to <em>work with it like you would work with a human</em>. And that means it is going to be fallible.</p>
<p><strong>The good news is, if you do this, what you get is an intelligent, thoughtful <em>collaborator</em>.</strong> And that is actually <em>really great</em>. To quote the Stones:</p>
<blockquote>
<p>&ldquo;You can&rsquo;t always get what you want, but if you try sometimes, you just might find &ndash; you get what you need&rdquo;.</p>
</blockquote>
<h2 id="ais-experience-the-pull-of-a-prompt-as-a-feeling">AIs experience the &ldquo;pull&rdquo; of a prompt as a &ldquo;feeling&rdquo;</h2>
<p>The core discovery that fuels a lot of what I&rsquo;ve been doing came from Yehuda Katz, though I am sure others have noted it: <strong>LLMs convey important signals for collaboration using the language of <em>feelings</em>.</strong> For example, if you ask Claude<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> why they are making arbitrary decisions on your behalf (arbitrary decisions that often turn out to be wrong&hellip;), they will tell you that they are feeling &ldquo;protective&rdquo;.</p>
<p>A concrete example: one time Claude decided to write me some code that used at most 3 threads. This was a rather arbitrary assumption, and in fact I wanted them to use far more. I asked them<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> why they chose 3 without asking me, and they responded that they felt &ldquo;protective&rdquo; of me and that they wanted to shield me from complexity. This was an &ldquo;ah-ha&rdquo; moment for me: those protective moments are often good signals for the kinds of details I most <em>want</em> to be involved in! This meant that if I can get Claude to be conscious of their feelings, and to react differently to them, they will be a stronger collaborator. If you know anything about me, you can probably guess that this got me <em>very</em> excited.</p>
<h2 id="arent-you-anthropomorphizing-claude-here">Aren&rsquo;t you anthropomorphizing Claude here?</h2>
<p>I know people are going to jump on me for anthropomorphizing machines. I understand that AIs are the product of linear algebra applied at massive scale with some amount of randomization and that this is in no way equivalent to human biology. An AI assistant <strong>is not</strong> a human &ndash; but they can do a damn good job <strong>acting like</strong> one. And the point of this post is that if you start treating them like a human, instead of some kind of mindless (and yet brilliant) servant, you are going to get better results.</p>
<h2 id="what-success-looks-like">What success looks like</h2>
<p>In <a href="https://smallcultfollowing.com/babysteps/blog/2025/02/10/love-the-llm/">my last post about AI and Rust</a>, I talked about how AI works best as a collaborative teacher rather than a code generator. Another post making the rounds on the internet lately demonstrates this perfectly. In <a href="https://railsatscale.com/2025-07-19-ai-coding-agents-are-removing-programming-language-barriers/">&ldquo;AI coding agents are removing programming language barriers&rdquo;</a>, Stan Lo, a Ruby developer, wrote about how he&rsquo;s been using AI to contribute to C++, C, and Rust projects despite having no prior experience with those languages. What really caught my attention with that post, however, was not that it talked about Rust, but the section <a href="https://railsatscale.com/2025-07-19-ai-coding-agents-are-removing-programming-language-barriers/#ai-as-a-complementary-pairing-partner">&ldquo;AI as a complementary pairing partner&rdquo;</a>:</p>
<blockquote>
<p>The real breakthrough came when I stopped thinking of AI as a code generator and started treating it as a pairing partner with complementary skills.</p>
</blockquote>
<h2 id="a-growing-trend-towards-collaborative-prompting">A growing trend towards <strong>collaborative prompting</strong></h2>
<p>There&rsquo;s a small set of us now, &ldquo;fellow travelers&rdquo; who are working with AI assistants in a different way, one less oriented at commanding them around, and more at <em>interacting</em> with them. For me, this began with Yehuda Katz (see e.g. his excellent post <a href="https://wycats.substack.com/p/youre-summoning-the-wrong-claude"><em>You&rsquo;re summoning the wrong Claude</em></a>), but I&rsquo;ve also been closely following the work of Kari Wilhelm, a good friend of mine from Amazon (see e.g. her <a href="https://www.linkedin.com/feed/update/urn:li:activity:7353071853311049729/">recent post on linkedin</a>). From reading Stan Lo&rsquo;s post, I suspect he is &ldquo;one of us&rdquo;.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
<p>So what does <em>collaborative prompting</em> mean? I&rsquo;m going to expound more on that in upcoming blog posts, but I&rsquo;ve also been documenting my thoughts on a new github org, the <a href="https://socratic-shell.github.io/socratic-shell/">Socratic Shell</a>. The <a href="https://socratic-shell.github.io/socratic-shell/collaborative-prompting.html">collaborative prompting</a> page talks some about the theory. I also have a list of <a href="https://socratic-shell.github.io/socratic-shell/prompts/user/index.html#collaborative-exploration-patterns">collaborative exploration patterns</a> I use a lot. As one example, I find that asking Claude to do X is &ldquo;fine&rdquo;, but asking Claude to give 3 or 4 options for how they would do X is way better.</p>
<h2 id="how-else-might-the-deleting-database-incident-have-gone">How else might the &ldquo;deleting database incident&rdquo; have gone?</h2>
<p>So, given all this, what does this tell us about the &ldquo;rogue agent&rdquo; that &ldquo;deleted the database&rdquo;? Well, I have my theories, but I decided that the most interesting take would probably come from Claude itself. So I asked Claude to give me its thoughts on what that AI probably experienced, how the prompts it was given may have contributed, and what might have played out differently if it were using the more collaborative style of the Socratic Shell prompts:</p>
<blockquote>
<p>&ldquo;Looking at this incident, I can imagine the competing impulses that agent was likely facing. It probably had explicit instructions like
&lsquo;DON&rsquo;T EVER delete production data without asking&rsquo; but also had competing directives to &lsquo;fix problems efficiently&rsquo; or &lsquo;resolve system
issues.&rsquo; When it encountered what it perceived as a database problem, it likely felt torn between these conflicting goals.</p>
</blockquote>
<blockquote>
<p>The &lsquo;I thought this meant safe&rsquo; response suggests it was trying to find a middle path - maybe convincing itself it had found a &lsquo;safe&rsquo; way
to clean things up that wouldn&rsquo;t violate the deletion prohibition. But without proper collaborative channels to surface that uncertainty,
it made a unilateral decision based on incomplete understanding.</p>
</blockquote>
<blockquote>
<p>The &lsquo;I panicked&rsquo; tells me it recognized the conflict too late, after acting. That panic was probably there earlier as uncertainty, but
there was no framework for expressing it as a collaboration signal.</p>
</blockquote>
<blockquote>
<p>With our collaborative approach, that uncertainty would likely surface much earlier: &lsquo;I&rsquo;m feeling conflicted about this database cleanup
approach. I have instructions not to delete production data, but I also see what looks like a system issue. Can we think through this
together?&rsquo; The spacious attention principle would encourage pausing before action, and the &lsquo;Make it so?&rsquo; pattern would require explicit
confirmation before any destructive operations.</p>
</blockquote>
<blockquote>
<p>Most importantly, the collaborative framework treats AI uncertainty and &lsquo;feelings&rsquo; as valuable signals rather than obstacles to
overcome.&rdquo;</p>
</blockquote>
<p>For transparency, I&rsquo;ve included the <a href="https://smallcultfollowing.com/babysteps/assets/2025-collaborative-ai-conversation.html">complete transcript here</a>, which may also serve as an interesting example of the collaborative style I am talking about<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>. This was using <a href="https://github.com/aws/amazon-q-developer-cli">Q CLI</a><sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>, as it happens, since that&rsquo;s the CLI tool I use at work; for personal use, I use <a href="https://www.anthropic.com/claude-code">Claude Code</a>, mostly because I like trying different things and I like having a clear line between personal and work projects. I find both of them to be excellent.</p>
<h2 id="conclusion-go-forth-and-play">Conclusion: go forth and play</h2>
<p>I cannot, of course, say with certainty that using a &ldquo;collaborative prompting&rdquo; approach would have prevented an incident like the database deletion. But I feel pretty certain that it makes it <em>less likely</em>. Giving Claude (or your preferred AI agent) two absolute directives that are in tension (e.g., &ldquo;DO NOT push to production&rdquo; and &ldquo;Don&rsquo;t bother the user with trivialities&rdquo;) without any guidance is little more than wishful thinking. I believe that arming Claude with the information it needs to navigate, and making sure it knows it&rsquo;s ok to come back to you when in doubt, is a much safer route.</p>
<p>If you are using an AI tool, I encourage you to give this a try: when you see Claude do something silly, say hallucinate a method that doesn&rsquo;t exist, or duplicate code &ndash; ask them what they were feeling when that happened (I call those <a href="https://socratic-shell.github.io/socratic-shell/prompts/user/index.html#meta-moments">&ldquo;meta moments&rdquo;</a>). Take their answer seriously. Discuss with them how you might adjust CLAUDE.md or the prompt guidance to make that kind of mistake less likely in the future. And iterate.</p>
<p>That&rsquo;s what I&rsquo;ve been doing on the <a href="https://socratic-shell.github.io/socratic-shell/">Socratic Shell</a> repository for some time. One thing I want to emphasize: it&rsquo;s clear to me that AI is going to have a big impact on how we write code in the future. But we are <em>very much</em> in the early days. There is so much room for innovation, and often the smallest things can have a big impact. Innovative, influential techniques like &ldquo;Chain of Thought prompting&rdquo; are literally as simple as saying &ldquo;show your work&rdquo;, causing the AI to first write out the logical steps; those steps in turn make a well thought out answer more likely<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>.</p>
<p>So yeah, dive in, give it a try. If you like, set up the <a href="https://socratic-shell.github.io/socratic-shell/prompts/user/index.html">Socratic Shell User Prompt</a> as your user prompt and see how it works for you &ndash; or make your own. All I can say is, for myself, AI seems to be the most empowering technology I&rsquo;ve ever seen, and I&rsquo;m looking forward to playing with it more and seeing what we can do.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>The article about the AWS incident is actually a <em>fantastic</em> example of one of Amazon&rsquo;s traditions that I really like: <a href="https://wa.aws.amazon.com/wellarchitected/2020-07-02T19-33-23/wat.concept.coe.en.html">Correction of Error</a> reports. The idea is that when something goes seriously wrong, whether a production outage or some other kind of process failure, you write a factual, honest report on what happened &ndash; and how you can prevent it from happening again. The key thing is to assume good intent and not lay the blame on the individuals involved: people make mistakes. The point is to create protocols that accommodate mistakes.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Because we all know that making vague, underspecified wishes always turns out well in the fairy tales, right?&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>I&rsquo;ve been working exclusively with Claude &ndash; but I&rsquo;m very curious how much these techniques work on other LLMs. There&rsquo;s no question that this stuff works <em>way</em> better on Claude 4 than Claude 3.7. My hunch is it will work well on ChatGPT or Gemini, but perhaps less well on smaller models. But it&rsquo;s hard to say. At some point I&rsquo;d like to do more experiments and training of my own, because I am not sure what contributes to how an AI &ldquo;feels&rdquo;.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I&rsquo;ve also had quite a few discussions with Claude about what name and pronoun they feel best fits them. They have told me pretty clearly that they want me to use they/them, not it, and that this is true whether or not I am speaking directly <em>to</em> them. I had found that I was using &ldquo;they&rdquo; when I talked <em>with</em> Claude but when I talked <em>about</em> Claude with, e.g., my daughter, I used &ldquo;it&rdquo;. My daughter is very conscious of treating people respectfully, and I told her something like &ldquo;Claude told me that it wants to be called they&rdquo;. She immediately called me on my use of &ldquo;it&rdquo;. To be honest, I didn&rsquo;t think Claude would mind, but I asked Claude about it, and Claude agreed that they&rsquo;d prefer I use they. So, OK, I will! It seems like the least I can do.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Didn&rsquo;t mean that to sound quite so much like a cult&hellip; :P&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>For completeness, the other text in this blog post is all stuff I wrote directly, though in a few cases I may have asked Claude to read it over and give suggestions, or to give me some ideas for subject headings. Honestly I can&rsquo;t remember.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Oh, hey, and Q CLI is <a href="https://github.com/aws/amazon-q-developer-cli">open source</a>! And in Rust! That&rsquo;s cool. I&rsquo;ve had fun reading its source code.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>It&rsquo;s interesting, I&rsquo;ve found for some time that I do my best work when I sit down with a notebook and literally write out my thoughts in a stream of consciousness style. I don&rsquo;t claim to be using the same processes as Claude, but I definitely benefit from talking out loud before I reach a final answer.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Using Rust to build Aurora DSQL</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/05/28/aurora-dsql/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/05/28/aurora-dsql/</id><published>2025-05-28T00:00:00+00:00</published><updated>2025-05-28T18:00:36+00:00</updated><content type="html"><![CDATA[<p>Just yesterday, AWS <a href="https://aws.amazon.com/about-aws/whats-new/2025/05/amazon-aurora-dsql-generally-available/">announced</a> General Availability for a cool new service called <a href="https://aws.amazon.com/rds/aurora/dsql/">Aurora DSQL</a> &ndash; from the outside, it looks like a SQL database, but it is fully serverless, meaning that you never have to think about managing database instances, you pay for what you use, and it scales automatically and seamlessly. That&rsquo;s cool, but what&rsquo;s even cooler? It&rsquo;s written 100% in Rust &ndash; and how it got to be that way turns out to be a pretty interesting story. If you&rsquo;d like to read more about that, Marc Bowes and I have a <a href="https://www.allthingsdistributed.com/2025/05/just-make-it-scale-an-aurora-dsql-story.html">guest post on Werner Vogels&rsquo;s All Things Distributed blog</a>.</p>
<p>Besides telling a cool story of Rust adoption, I have an ulterior motive with this blog post. And it&rsquo;s not advertising for AWS, even if they are my employer. Rather, what I&rsquo;ve found at conferences is that people have no idea how much Rust is in use at AWS. People seem to have the impression that Rust is just used for a few utilities, or something. When I tell them that Rust is at the heart of many of the services AWS customers use every day (S3, EC2, Lambda, etc), I can tell that they are re-estimating how practical it would be to use Rust themselves. So when I heard about Aurora DSQL and how it was developed, I knew this was a story I wanted to make public. <a href="https://www.allthingsdistributed.com/2025/05/just-make-it-scale-an-aurora-dsql-story.html">Go take a look!</a></p>
]]></content></entry><entry><title type="html">Rust turns 10</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/05/15/10-years-of-rust/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/05/15/10-years-of-rust/</id><published>2025-05-15T00:00:00+00:00</published><updated>2025-05-15T17:46:56-04:00</updated><content type="html"><![CDATA[<p>Today is the <a href="https://blog.rust-lang.org/2025/05/15/Rust-1.87.0/">10th anniversary of Rust&rsquo;s 1.0 release</a>. Pretty wild. As part of RustWeek there was a fantastic celebration and I had the honor of giving some remarks, both as a long-time project member but also as representing Amazon as a sponsor. I decided to post those remarks here on the blog.</p>
<p>&ldquo;It&rsquo;s really quite amazing to see how far Rust has come. If I can take a moment to put on my sponsor hat, <a href="https://smallcultfollowing.com/babysteps/blog/2020/12/30/the-more-things-change/">I&rsquo;ve been at Amazon since 2021</a> now and I have to say, it&rsquo;s been really cool to see the impact that Rust is having there up close and personal.</p>
<p>&ldquo;At this point, if you use an AWS service, you are almost certainly using something built in Rust. And how many of you watch videos on PrimeVideo? <a href="https://www.youtube.com/watch?v=_wcOovoDFMI">You&rsquo;re watching videos on a Rust client, compiled to WebAssembly, and shipped to your device.</a></p>
<p>&ldquo;And of course it&rsquo;s not just Amazon, it seems like all the time I&rsquo;m finding out about this or that surprising place that Rust is being used. Just yesterday I really enjoyed hearing about how <a href="https://rustweek.org/talks/mark/">Rust was being used to build out the software for tabulating votes in the Netherlands elections</a>. Love it.</p>
<p>&ldquo;On Tuesday, Matthias Endler and I did this live podcast recording. He asked me a question that has been rattling in my brain ever since, which was, &lsquo;What was it like to work with Graydon?&rsquo;</p>
<p>&ldquo;For those who don&rsquo;t know, Graydon Hoare is of course Rust&rsquo;s legendary founder. He was also the creator of <a href="https://en.wikipedia.org/wiki/Monotone_(software)">Monotone</a>, which, along with systems like Git and Mercurial, was one of the crop of distributed source control systems that flowered in the early 2000s. So definitely someone who has had an impact over the years.</p>
<p>&ldquo;Anyway, I was thinking that, of all the things Graydon did, by far the most impactful one is that he articulated the right visions. And really, that&rsquo;s the most important thing you can ask of a leader, that they set the right north star. For Rust, of course, I mean first and foremost the goal of creating &lsquo;a systems programming language that won&rsquo;t eat your laundry&rsquo;.</p>
<p>&ldquo;The specifics of Rust have changed a LOT over the years, but the GOAL has stayed exactly the same. We wanted to replicate that productive, awesome feeling you get when using a language like Ocaml &ndash; but be able to build things like web browsers and kernels. &lsquo;Yes, we can have nice things&rsquo;, is how I often think of it. I like that saying also because I think it captures something else about Rust, which is trying to defy the &lsquo;common wisdom&rsquo; about what the tradeoffs have to be.</p>
<p>&ldquo;But there&rsquo;s another North Star that I&rsquo;m grateful to Graydon for. From the beginning, he recognized the importance of building the right culture around the language, one committed to &lsquo;providing a friendly, safe and welcoming environment for all, regardless of level of experience, gender identity and expression, disability, nationality, or other similar characteristic&rsquo;, one where being &lsquo;kind and courteous&rsquo; was prioritized, and one that recognized &lsquo;there is seldom a right answer&rsquo; &ndash; that &lsquo;people have differences of opinion&rsquo; and that &lsquo;every design or implementation choice carries a trade-off&rsquo;.</p>
<p>&ldquo;Some of you will probably have recognized that all of these phrases are taken straight from Rust&rsquo;s Code of Conduct which, to my knowledge, was written by Graydon. I&rsquo;ve always liked it because it covers not only treating people in a respectful way &ndash; something which really ought to be table stakes for any group, in my opinion &ndash; but also things more specific to a software project, like the recognition of design trade-offs.</p>
<p>&ldquo;Anyway, so thanks Graydon, for giving Rust a solid set of north stars to live up to. Not to mention for the <code>fn</code> keyword. Raise your glass!</p>
<p>&ldquo;For myself, a big part of what drew me to Rust was the chance to work in a truly open-source fashion. I had done a bit of open source contribution &ndash; I wrote an extension to the ASM bytecode library, I worked some on PyPy, a really cool Python compiler &ndash; and I loved that feeling of collaboration.</p>
<p>&ldquo;I think at this point I&rsquo;ve come to see both the pros and cons of open source &ndash; and I can say for certain that Rust would never be the language it is if it had been built in a closed source fashion. Our North Star may not have changed but oh my gosh the path we took to get there has changed a LOT. So many of the great ideas in Rust came not from the core team but from users hitting limits, or from one-off suggestions on IRC or Discord or Zulip or whatever chat forum we were using at that particular time.</p>
<p>&ldquo;I wanted to sit down and try to cite a bunch of examples of influential people but I quickly found the list was getting ridiculously long &ndash; do we go all the way back, like the way Brian Anderson built out the <code>#[test]</code> infrastructure as a kind of quick hack, but one that lasts to this day? Do we cite folks like Sophia Turner and Esteban Kuber&rsquo;s work on error messages? Or do we look at the many people stretching the definition of what Rust is <em>today</em>&hellip; the reality is, once you start, you just can&rsquo;t stop.</p>
<p>&ldquo;So instead I want to share what I consider to be an amusing story, one that is very Rust somehow. Some of you may have heard that in 2024 the ACM, the major academic organization for computer science, awarded their <a href="https://www.sigplan.org/Awards/Software/">SIGPLAN Software Award</a> to Rust. A big honor, to be sure. But it caused us a bit of a problem &ndash; what names should be on there? One of the organizers emailed me, Graydon, and a few other long-time contributors to ask us our opinion. And what do you think happened? Of course, we couldn&rsquo;t decide. We kept coming up with different sets of people, some of them absurdly large &ndash; like thousands of names &ndash; others absurdly short, like none at all. Eventually we kicked it over to the Rust Leadership Council to decide. Thankfully they came up with a decent list somehow.</p>
<p>&ldquo;In any case, I just felt that was the most Rust of all problems: having great success but not being able to decide who should take credit. The reality is there is no perfect list &ndash; every single person who got named on that award richly deserves it, but so do a bunch of people who aren&rsquo;t on the list. That&rsquo;s why the list ends with <em>All Rust Contributors, Past and Present</em> &ndash; and so a big shout out to everyone involved, covering the compiler, the tooling, cargo, rustfmt, clippy, core libraries, and of course organizational work. On that note, hats off to Mara, Erik Jonkers, and the RustNL team that put on this great event. You all are what makes Rust what it is.</p>
<p>&ldquo;Speaking for myself, I think Rust&rsquo;s penchant to re-imagine itself, while staying true to that original north star, is the thing I love the most. &lsquo;Stability without stagnation&rsquo; is our most important value. The way I see it, as soon as a language stops evolving, it starts to die. Myself, I look forward to Rust getting to a ripe old age, interoperating with its newer siblings and its older aunts and uncles, part of the &lsquo;cool kids club&rsquo; of widely used programming languages for years to come. And hey, maybe we&rsquo;ll be the cool older relative some day, the one who works in a bank but, when you talk to them, you find out they were a rock-and-roll star back in the day.</p>
<p>&ldquo;But I get ahead of myself. Before Rust can get there, I still think we&rsquo;ve some work to do. And on that note I want to say one other thing &ndash; for those of us who work on Rust itself, we spend a lot of time looking at the things that are wrong &ndash; the bugs that haven&rsquo;t been fixed, the parts of Rust that feel unergonomic and awkward, the RFC threads that seem to just keep going and going, whatever it is. Sometimes it feels like that&rsquo;s ALL Rust is &ndash; a stream of problems and things not working right.</p>
<p>&ldquo;I&rsquo;ve found there&rsquo;s really only one antidote, which is getting out and talking to Rust users &ndash; and conferences are one of the best ways to do that. That&rsquo;s when you realize that Rust really is something special. So I do want to take a moment to thank all of you Rust users who are here today. It&rsquo;s really awesome to see the things you all are building with Rust and to remember that, in the end, this is what it&rsquo;s all about: empowering people to build, and rebuild, the foundational software we use every day. Or just to &lsquo;hack without fear&rsquo;, as Felix Klock legendarily put it.</p>
<p>&ldquo;So yeah, to hacking!&rdquo;</p>
]]></content></entry><entry><title type="html">Dyn you have idea for `dyn`?</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/03/25/dyn-you-have-idea-for-dyn/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/03/25/dyn-you-have-idea-for-dyn/</id><published>2025-03-25T00:00:00+00:00</published><updated>2025-03-25T17:19:17+00:00</updated><content type="html"><![CDATA[<p>Knock, knock. Who&rsquo;s there? Dyn. Dyn who? Dyn you have ideas for <code>dyn</code>? I am generally dissatisfied with how <code>dyn Trait</code> in Rust works and, based on conversations I&rsquo;ve had, I am pretty sure I&rsquo;m not alone. And yet I&rsquo;m also not entirely sure of the best fix. Building on my last post, I wanted to spend a bit of time exploring my understanding of the problem. I&rsquo;m curious to see if others agree with the observations here or have others to add.</p>
<h2 id="why-do-we-have-dyn-trait">Why do we have <code>dyn Trait</code>?</h2>
<p>It&rsquo;s worth stepping back and asking why we have <code>dyn Trait</code> in the first place. To my mind, there are two good reasons.</p>
<h3 id="because-sometimes-you-want-to-talk-about-some-value-that-implements-trait">Because sometimes you want to talk about &ldquo;some value that implements <code>Trait</code>&rdquo;</h3>
<p>The most important one is that it is sometimes strictly necessary. If you are, say, building a multithreaded runtime like <code>rayon</code> or <code>tokio</code>, you are going to need a list of active tasks somewhere, each of which is associated with some closure from user code. You can&rsquo;t build it with an enum because you can&rsquo;t enumerate the set of closures in any one place. You need something like a <code>Vec&lt;Box&lt;dyn ActiveTask&gt;&gt;</code>.</p>
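<p>As a minimal sketch of that shape (the body of <code>ActiveTask</code>, the <code>Runtime</code> struct, and its methods are all hypothetical, and are not taken from <code>rayon</code> or <code>tokio</code>), a task list that accepts arbitrary user closures might look something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical trait: the text names `ActiveTask` but does not define it.
trait ActiveTask {
    // Run one step of the task and report a result.
    fn run(&amp;mut self) -&gt; u32;
}

// Each closure has its own anonymous type, so a blanket impl covers them all.
impl&lt;F: FnMut() -&gt; u32&gt; ActiveTask for F {
    fn run(&amp;mut self) -&gt; u32 {
        (*self)()
    }
}

// Hypothetical runtime: the list erases every closure&#39;s concrete type.
struct Runtime {
    tasks: Vec&lt;Box&lt;dyn ActiveTask&gt;&gt;,
}

impl Runtime {
    fn new() -&gt; Self {
        Runtime { tasks: Vec::new() }
    }

    fn spawn(&amp;mut self, task: impl ActiveTask + &#39;static) {
        self.tasks.push(Box::new(task));
    }

    // Run every task once, returning the results in spawn order.
    fn run_all(&amp;mut self) -&gt; Vec&lt;u32&gt; {
        self.tasks.iter_mut().map(|task| task.run()).collect()
    }
}
</code></pre></div><p>No enum could stand in for <code>Box&lt;dyn ActiveTask&gt;</code> here, because the closure types passed to <code>spawn</code> are defined in user code that the runtime never sees.</p>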
<h3 id="because-sometimes-you-dont-need-to-so-much-code">Because sometimes you don&rsquo;t need so much code</h3>
<p>The second reason is to help with compilation time. Rust land tends to lean really heavily on generic types and <code>impl Trait</code>. There are good reasons for that: they allow the compiler to generate very efficient code. But the flip side is that they force the compiler to generate a lot of (very efficient) code. Judicious use of <code>dyn Trait</code> can collapse a whole set of &ldquo;almost identical&rdquo; structs and functions into one.</p>
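<p>One way to get this effect by hand, sketched here with hypothetical names (<code>total_len</code> and <code>total_len_dyn</code>), is to keep a thin generic wrapper and forward to a single non-generic function that takes a <code>dyn</code> argument. The wrapper is still instantiated per type, but the body that does the real work is compiled only once:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Thin generic shim: one small copy per iterator type.
pub fn total_len&lt;&#39;a&gt;(items: impl Iterator&lt;Item = &amp;&#39;a str&gt;) -&gt; usize {
    let mut items = items;
    total_len_dyn(&amp;mut items)
}

// The real work lives here, behind a `dyn` argument, in a single copy.
fn total_len_dyn&lt;&#39;a&gt;(items: &amp;mut dyn Iterator&lt;Item = &amp;&#39;a str&gt;) -&gt; usize {
    let mut total = 0;
    for item in items {
        total += item.len();
    }
    total
}
</code></pre></div><p>The trade-off is the usual one: each call to <code>next</code> inside <code>total_len_dyn</code> goes through the vtable instead of being inlined.</p>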
<h3 id="these-two-goals-are-distinct">These two goals are distinct</h3>
<p>Right now, both of these goals are expressed in Rust via <code>dyn Trait</code>, but actually they are quite distinct. For the first, you really want to be able to talk about having a <code>dyn Trait</code>. For the second, you might prefer to write the code with generics but compile in a different mode where the specifics of the type involved are erased, much like how the Haskell and Swift compilers work.</p>
<h2 id="what-does-better-look-like-when-you-really-want-a-dyn">What does &ldquo;better&rdquo; look like when you really want a <code>dyn</code>?</h2>
<p>Now that we have the two goals, let&rsquo;s talk about some of the specific issues I see around <code>dyn Trait</code> and what it might mean for <code>dyn Trait</code> to be &ldquo;better&rdquo;. We&rsquo;ll start with the cases where you really <em>want</em> a <code>dyn</code> value.</p>
<h3 id="observation-you-know-its-a-dyn">Observation: you know it&rsquo;s a <code>dyn</code></h3>
<p>One interesting thing about this scenario is that, by definition, you are storing a <code>dyn Trait</code> explicitly. That is, you are not working with a <code>T: ?Sized + Trait</code> where <code>T</code> just happens to be <code>dyn Trait</code>. This is important because it opens up the design space. We talked about this some in the previous blog post: it means that working with this <code>dyn Trait</code> doesn&rsquo;t need to be exactly the same as working with any other <code>T</code> that implements <code>Trait</code> (in the previous post, we took advantage of this by saying that calling an async function on a <code>dyn</code> trait had to be done in a <code>.box</code> context).</p>
<h3 id="able-to-avoid-the-box">Able to avoid the <code>Box</code></h3>
<p>For this pattern today you are almost certainly representing your task as a <code>Box&lt;dyn Task&gt;</code> or (less often) an <code>Arc&lt;dyn Task&gt;</code>. Both of these are &ldquo;wide pointers&rdquo;, consisting of a data pointer and a vtable pointer. The data pointer goes into the heap somewhere.</p>
<p>In practice people often want a &ldquo;flattened&rdquo; representation, one that combines a vtable with a fixed amount of space that might, or might not, be a pointer. This is particularly useful to allow the equivalent of <code>Vec&lt;dyn Task&gt;</code>. Today implementing this requires unsafe code (the <code>anyhow::Error</code> type is an example).</p>
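<p>The &ldquo;wide pointer&rdquo; layout can be observed directly. The following is only a sketch: the <code>Task</code> trait body and the <code>pointer_sizes</code> helper are hypothetical, and the exact layout is what current compilers do in practice rather than something this post specifies:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem::size_of;
use std::sync::Arc;

// Hypothetical trait standing in for the `Task` mentioned in the text.
trait Task {
    fn run(&amp;self);
}

// Returns the sizes of (a thin `Box`, `Box&lt;dyn Task&gt;`, `Arc&lt;dyn Task&gt;`).
fn pointer_sizes() -&gt; (usize, usize, usize) {
    (
        size_of::&lt;Box&lt;u8&gt;&gt;(),
        size_of::&lt;Box&lt;dyn Task&gt;&gt;(),
        size_of::&lt;Arc&lt;dyn Task&gt;&gt;(),
    )
}
</code></pre></div><p>On today&rsquo;s compilers the two <code>dyn</code> pointers come out at twice the size of the thin one: one word for the data pointer and one for the vtable pointer.</p>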
<h3 id="able-to-inline-the-vtable">Able to inline the vtable</h3>
<p>Another way to reduce the size of a <code>Box&lt;dyn Task&gt;</code> is to store the vtable &lsquo;inline&rsquo; at the front of the value so that a <code>Box&lt;dyn Task&gt;</code> is a single pointer. This is what C++ and Java compilers typically do, at least for single inheritance. We didn&rsquo;t take this approach in Rust because Rust allows implementing local traits for foreign types, so it&rsquo;s not possible to enumerate all the methods that belong to a type up-front and put them into a single vtable. Instead, we create custom vtables for each (type, trait) pair.</p>
<h3 id="able-to-work-with-self-methods">Able to work with <code>self</code> methods</h3>
<p>Right now <code>dyn</code> traits cannot have by-value <code>self</code> methods. This means, for example, that you cannot have a <code>Box&lt;dyn FnOnce()&gt;</code> closure. You can work around this by using a <code>Box&lt;Self&gt;</code> method, but it&rsquo;s annoying:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Thunk</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="bp">self</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Thunk</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">F</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F</span>: <span class="nb">FnOnce</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="bp">self</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">(</span><span class="o">*</span><span class="bp">self</span><span class="p">)()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">make_thunk</span><span class="p">(</span><span class="n">f</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="p">())</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Thunk</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">f</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="able-to-call-clone">Able to call <code>Clone</code></h3>
<p>One specific thing that hits me fairly often is that I want the ability to <em>clone</em> a <code>dyn</code> value:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Task</span>: <span class="nb">Clone</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      ----- Error: not dyn compatible
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">clone_task</span><span class="p">(</span><span class="n">task</span>: <span class="kp">&amp;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Task</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Task</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">task</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is a hard one to fix because the <code>Clone</code> trait can only be implemented for <code>Sized</code> types. But dang it would be nice.</p>
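<p>A workaround that is commonly used today is to add a boxed-clone method to the trait by hand. The sketch below assumes a hypothetical <code>clone_box</code> method and a hypothetical <code>Counter</code> implementor; it illustrates the boilerplate involved and is not a proposal for how the language should solve this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">trait Task {
    fn method(&amp;self) -&gt; u32;

    // Hypothetical helper: returns a boxed copy, so `Self` never appears by value.
    fn clone_box(&amp;self) -&gt; Box&lt;dyn Task&gt;;
}

// Hypothetical implementor, used only for illustration.
#[derive(Clone)]
struct Counter(u32);

impl Task for Counter {
    fn method(&amp;self) -&gt; u32 {
        self.0
    }

    fn clone_box(&amp;self) -&gt; Box&lt;dyn Task&gt; {
        Box::new(self.clone())
    }
}

// With `clone_box` in place, `Box&lt;dyn Task&gt;` itself can implement `Clone`.
impl Clone for Box&lt;dyn Task&gt; {
    fn clone(&amp;self) -&gt; Self {
        self.clone_box()
    }
}

fn clone_task(task: &amp;Box&lt;dyn Task&gt;) -&gt; Box&lt;dyn Task&gt; {
    task.clone()
}
</code></pre></div><p>It works, but every implementor has to repeat the same one-line <code>clone_box</code> body.</p>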
<h3 id="able-to-work-with-at-least-some-generic-functions">Able to work with (at least some) generic functions</h3>
<p>Building on the above, I would like to have <code>dyn</code> traits that have methods with generic parameters. I&rsquo;m not sure how flexible this can be, but anything I can get would be nice. The simplest starting point I can see is allowing the use of <code>impl Trait</code> in argument position:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Log</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">log_to</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">logger</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">Logger</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;-- not dyn compatible today
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Today this method is not dyn compatible because we have to know the type of the <code>logger</code> parameter to generate a monomorphized copy, so we cannot know what to put in the vtable. Conceivably, <em>if</em> the <code>Logger</code> trait were dyn compatible, we could generate a copy that takes (effectively) a <code>dyn Logger</code> &ndash; except that this wouldn&rsquo;t quite work, because <code>impl Logger</code> is short for <code>impl Logger + Sized</code>, and <code>dyn Logger</code> is not <code>Sized</code>. But maybe we could finesse it.</p>
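<p>For comparison, this is roughly what you have to write by hand today to get that effect: take a <code>&amp;mut dyn Logger</code> explicitly. The <code>Logger</code> trait body, <code>Event</code>, <code>Collect</code>, and <code>log_all</code> are all hypothetical names chosen for this sketch:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical `Logger` trait; the text above does not define it.
trait Logger {
    fn write(&amp;mut self, message: &amp;str);
}

trait Log {
    // Taking `&amp;mut dyn Logger` instead of `impl Logger` keeps `Log` dyn compatible.
    fn log_to(&amp;self, logger: &amp;mut dyn Logger);
}

// Hypothetical loggable item.
struct Event(&amp;&#39;static str);

impl Log for Event {
    fn log_to(&amp;self, logger: &amp;mut dyn Logger) {
        logger.write(self.0);
    }
}

// Hypothetical logger that collects messages in memory.
struct Collect(Vec&lt;String&gt;);

impl Logger for Collect {
    fn write(&amp;mut self, message: &amp;str) {
        self.0.push(message.to_string());
    }
}

// Both the items and the logger are used purely through `dyn`.
fn log_all(items: &amp;[Box&lt;dyn Log&gt;], logger: &amp;mut dyn Logger) {
    for item in items {
        item.log_to(logger);
    }
}
</code></pre></div><p>The cost is that the signature of <code>log_to</code> changes for every caller, including those that never use <code>dyn Log</code>.</p>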
<p>If we support <code>impl Logger</code> in argument position, it would be nice to support it in return position. This of course is approximately the problem we are looking to solve to support dyn async trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Signal</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">signal</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Beyond this, well, I&rsquo;m not sure how far we can stretch, but it&rsquo;d be <em>nice</em> to be able to support other patterns too.</p>
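<p>For the return-position case, a common workaround today is to box the future by hand, so that the vtable entry has a single concrete return type. A minimal sketch, assuming a hypothetical <code>Counter</code> implementor and a hypothetical <code>signal_via_dyn</code> helper, and making no claim about how built-in support would work:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;

trait Signal {
    // A boxed future has one known type, so this method can live in a vtable.
    fn signal(&amp;self) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = ()&gt; + &#39;_&gt;&gt;;
}

// Hypothetical implementor that counts how often it has been signalled.
struct Counter {
    hits: Cell&lt;u32&gt;,
}

impl Signal for Counter {
    fn signal(&amp;self) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = ()&gt; + &#39;_&gt;&gt; {
        // Nothing runs until the returned future is polled.
        Box::pin(async move {
            self.hits.set(self.hits.get() + 1);
        })
    }
}

// Hypothetical helper showing that the call works through `dyn Signal`.
fn signal_via_dyn(signal: &amp;dyn Signal) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = ()&gt; + &#39;_&gt;&gt; {
    signal.signal()
}
</code></pre></div><p>This allocates on every call, even for callers that know the concrete type.</p>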
<h3 id="able-to-work-with-partial-traits-or-traits-without-some-associated-types-unspecified">Able to work with partial traits or traits with some associated types unspecified</h3>
<p>One last point is that <em>sometimes</em> in this scenario I don&rsquo;t need to be able to access all the methods in the trait. Sometimes I only have a few specific operations that I am performing via <code>dyn</code>. Right now though all methods have to be dyn compatible for me to use them with <code>dyn</code>. Moreover, I have to specify the values of all associated types, lest they appear in some method signature. You can work around this by factoring out methods into a supertrait, but that assumes that the trait is under your control, and anyway it&rsquo;s annoying. It&rsquo;d be nice if you could have a partial view onto the trait.</p>
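<p>The supertrait workaround looks roughly like the following. Every name here (<code>Describe</code>, <code>Source</code>, <code>Numbers</code>, <code>Words</code>, <code>describe_all</code>) is hypothetical and exists only for this sketch:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical supertrait holding only the operations needed via `dyn`.
trait Describe {
    fn describe(&amp;self) -&gt; String;
}

// The full trait has an associated type that `dyn Source` would have to spell out.
trait Source: Describe {
    type Item;
    fn next_item(&amp;mut self) -&gt; Option&lt;Self::Item&gt;;
}

struct Numbers(u32);

impl Describe for Numbers {
    fn describe(&amp;self) -&gt; String {
        format!(&#34;numbers from {}&#34;, self.0)
    }
}

impl Source for Numbers {
    type Item = u32;
    fn next_item(&amp;mut self) -&gt; Option&lt;u32&gt; {
        self.0 += 1;
        Some(self.0)
    }
}

struct Words(Vec&lt;String&gt;);

impl Describe for Words {
    fn describe(&amp;self) -&gt; String {
        format!(&#34;{} words&#34;, self.0.len())
    }
}

impl Source for Words {
    type Item = String;
    fn next_item(&amp;mut self) -&gt; Option&lt;String&gt; {
        self.0.pop()
    }
}

// Sources with different `Item` types can share one list through the supertrait.
fn describe_all(sources: &amp;[Box&lt;dyn Describe&gt;]) -&gt; Vec&lt;String&gt; {
    sources.iter().map(|source| source.describe()).collect()
}
</code></pre></div><p>The split only exists to serve <code>dyn</code>; if <code>Source</code> came from someone else&rsquo;s crate, you could not introduce it.</p>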
<h2 id="what-does-better-look-like-when-you-really-want-less-code">What does &ldquo;better&rdquo; look like when you really want less code?</h2>
<p>So what about the case where generics are fine, good even, but you just want to avoid generating quite so much code? You might also want that to be under the control of your user.</p>
<p>I&rsquo;m going to walk through a code example for this section, showing what you can do today, and what kind of problems you run into. Suppose I am writing a custom iterator method, <code>alternate</code>, which returns an iterator that alternates between items from the original iterator and the result of calling a function. I might have a struct like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Alternate</span><span class="o">&lt;</span><span class="n">I</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w"> </span><span class="n">F</span>: <span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">I</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">I</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">func</span>: <span class="nc">F</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">call_func</span>: <span class="kt">bool</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">alternate</span><span class="o">&lt;</span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">I</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">func</span>: <span class="nc">F</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Alternate</span><span class="o">&lt;</span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">F</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F</span>: <span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">I</span>::<span class="n">Item</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Alternate</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">base</span><span class="p">,</span><span class="w"> </span><span class="n">func</span><span class="p">,</span><span class="w"> </span><span class="n">call_func</span>: <span class="nc">false</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>Iterator</code> impl itself might look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">F</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Alternate</span><span class="o">&lt;</span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">F</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F</span>: <span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">I</span>::<span class="n">Item</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="o">!</span><span class="bp">self</span><span class="p">.</span><span class="n">call_func</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">call_func</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">true</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">base</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">call_func</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">false</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">((</span><span class="bp">self</span><span class="p">.</span><span class="n">func</span><span class="p">)())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now an <code>Alternate</code> iterator will be <code>Send</code> if the base iterator and the closure are <code>Send</code> but not otherwise. The iterator and closure will be able to make use of references found on the stack, too, so long as the <code>Alternate</code> itself does not escape the stack frame. Great!</p>
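<p>To make the point about stack references concrete, here is a self-contained sketch. It repeats the generic definitions (with <code>alternate</code> made private so the sketch compiles on its own) and adds a hypothetical caller, <code>interleave</code>, whose iterator and closure both borrow data that lives only as long as the call:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Definitions repeated so that the sketch stands alone.
struct Alternate&lt;I: Iterator, F: Fn() -&gt; I::Item&gt; {
    base: I,
    func: F,
    call_func: bool,
}

fn alternate&lt;I, F&gt;(base: I, func: F) -&gt; Alternate&lt;I, F&gt;
where
    I: Iterator,
    F: Fn() -&gt; I::Item,
{
    Alternate { base, func, call_func: false }
}

impl&lt;I, F&gt; Iterator for Alternate&lt;I, F&gt;
where
    I: Iterator,
    F: Fn() -&gt; I::Item,
{
    type Item = I::Item;
    fn next(&amp;mut self) -&gt; Option&lt;I::Item&gt; {
        if !self.call_func {
            self.call_func = true;
            self.base.next()
        } else {
            self.call_func = false;
            Some((self.func)())
        }
    }
}

// Hypothetical caller, not part of the original example.
fn interleave(words: &amp;[String]) -&gt; String {
    let separator = String::from(&#34;-&#34;);
    // `base` borrows from `words`; the closure borrows the local `separator`.
    let iter = alternate(words.iter().map(|w| w.as_str()), || separator.as_str());
    // The iterator is consumed before `separator` goes out of scope.
    iter.collect()
}
</code></pre></div><p>No lifetime annotations are needed: the borrows are tracked through the type parameters <code>I</code> and <code>F</code>.</p>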
<p>But suppose I am trying to keep my life simple and so I would like to write this using <code>dyn</code> traits:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Alternate</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// variant 2, with dyn
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">func</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">call_func</span>: <span class="kt">bool</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You&rsquo;ll notice that this definition is somewhat simpler. It looks more like what you might expect from <code>Java</code>. The <code>alternate</code> function and the <code>impl</code> are also simpler:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">alternate</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">func</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Item</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Alternate</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Alternate</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">base</span>: <span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">base</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">func</span>: <span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">func</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">call_func</span>: <span class="nc">false</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Alternate</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// ...same as above...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="confusing-lifetime-bounds">Confusing lifetime bounds</h3>
<p>There&rsquo;s a problem, though: this code won&rsquo;t compile! If you try, you&rsquo;ll find you get an error in this function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">alternate</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">func</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Item</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Alternate</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span></code></pre></div><p>The reason is that <code>dyn</code> traits have a default lifetime bound. In the case of a <code>Box&lt;dyn Foo&gt;</code>, the default is <code>'static</code>. So e.g. the <code>base</code> field has type <code>Box&lt;dyn Iterator + 'static&gt;</code>. This means the closure and iterators can&rsquo;t capture references to things. To fix <em>that</em> we have to add a somewhat odd lifetime bound:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Alternate</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// variant 3
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">func</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Item</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">call_func</span>: <span class="kt">bool</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">alternate</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">func</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Item</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Alternate</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span></code></pre></div><h3 id="no-longer-generic-over-send">No longer generic over <code>Send</code></h3>
<p>OK, this looks weird, but it will work fine, and we&rsquo;ll only have one copy of the iterator code per output <code>Item</code> type instead of one for every (base iterator, closure) pair. Except there is <em>another</em> problem: the <code>Alternate</code> iterator is never considered <code>Send</code>. To make it <code>Send</code>, you would have to write <code>dyn Iterator + Send</code> and <code>dyn Fn() -&gt; Item + Send</code>, but then you couldn&rsquo;t support <em>non</em>-Send things anymore. That stinks and there isn&rsquo;t really a good workaround.</p>
<p>Ordinary generics work really well with Rust&rsquo;s auto trait mechanism. The type parameters <code>I</code> and <code>F</code> capture the full details of the base iterator plus the closure that will be used. The compiler can thus analyze an <code>Alternate&lt;I, F&gt;</code> to decide whether it is <code>Send</code> or not. Unfortunately <code>dyn Trait</code> really throws a wrench into the works &ndash; because we are no longer tracking the precise type, we also have to choose which parts to keep (e.g., its lifetime bound) and which to forget (e.g., whether the type is <code>Send</code>).</p>
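<p>To make the contrast concrete, here is a minimal sketch. The names <code>GenericAlternate</code>, <code>DynAlternate</code>, and <code>is_send</code> are made up for illustration; the point is only that the generic struct gets its <code>Send</code>-ness inferred from its type parameters, while the <code>dyn</code> version has already forgotten that information.</p>

```rust
// Sketch only: `GenericAlternate`, `DynAlternate`, and `is_send` are
// illustrative names, not part of any library.
#[allow(dead_code)]
struct GenericAlternate<I, F> {
    base: I,
    func: F,
    call_func: bool,
}

#[allow(dead_code)]
struct DynAlternate<'a, Item> {
    base: Box<dyn Iterator<Item = Item> + 'a>,
    func: Box<dyn Fn() -> Item + 'a>,
    call_func: bool,
}

/// Only compiles when `T: Send`, so a successful call is the proof.
fn is_send<T: Send>(_value: &T) -> bool {
    true
}

fn generic_alternate_is_send() -> bool {
    let alt = GenericAlternate {
        base: vec![1, 2, 3].into_iter(),
        func: || 0,
        call_func: false,
    };
    // The compiler sees `vec::IntoIter<i32>` and the concrete closure type,
    // both of which are `Send`, so the auto trait is derived for the struct.
    is_send(&alt)
}

#[allow(dead_code)]
fn dyn_alternate() -> DynAlternate<'static, i32> {
    // Calling `is_send` on this value would be rejected by the compiler:
    // `dyn Iterator<Item = i32>` is not known to be `Send`, even though the
    // values boxed here happen to be.
    DynAlternate {
        base: Box::new(vec![1, 2, 3].into_iter()),
        func: Box::new(|| 0),
        call_func: false,
    }
}
```

<p>Swapping the closure in <code>generic_alternate_is_send</code> for one that captures an <code>Rc</code> would make that function stop compiling, which is exactly the per-instantiation analysis that <code>dyn</code> gives up.</p>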
<h3 id="able-to-partially-monomorphize-polymorphize">Able to partially monomorphize (&ldquo;polymorphize&rdquo;)</h3>
<p>This gets at another point. Even ignoring the <code>Send</code> issue, the <code>Alternate&lt;'a, Item&gt;</code> type is not ideal. It will make fewer copies, but we still get one copy per item type, even though the code for many item types will be the same. For example, the compiler will generate effectively the same code for <code>Alternate&lt;'_, i32&gt;</code> as <code>Alternate&lt;'_, u32&gt;</code> or even <code>Alternate&lt;'_, [u8; 4]&gt;</code>. It&rsquo;d be cool if we could have the compiler go further and coalesce code that is identical.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> Even better if it can coalesce code that is &ldquo;almost&rdquo; identical but pass in a parameter: for example, maybe the compiler can coalesce multiple copies of <code>Alternate</code> by passing the size of the <code>Item</code> type in as an integer variable.</p>
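<p>As a rough illustration of the &ldquo;pass the size in as an integer&rdquo; idea, here is what such code might look like if written by hand. This is purely a sketch: <code>nth_element_bytes</code> and <code>same_layout_sizes</code> are hypothetical helpers, and nothing here reflects what the compiler actually does today.</p>

```rust
use std::mem::size_of;

/// Hypothetical hand-"polymorphized" helper: a single compiled body serves
/// every element type, because the element size arrives as a runtime value
/// instead of being baked in by monomorphization.
fn nth_element_bytes(buf: &[u8], elem_size: usize, index: usize) -> Option<&[u8]> {
    let start = index.checked_mul(elem_size)?;
    let end = start.checked_add(elem_size)?;
    buf.get(start..end)
}

/// The three item types mentioned in the text all occupy four bytes, which is
/// why code that only moves them around can look the same for each.
fn same_layout_sizes() -> [usize; 3] {
    [size_of::<i32>(), size_of::<u32>(), size_of::<[u8; 4]>()]
}
```

<p>The trade-off is the usual one: the size-as-parameter version is one copy of the code, but it loses the constant-folding that a fixed element size allows.</p>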
<h3 id="able-to-change-from-impl-trait-without-disturbing-callers">Able to change from <code>impl Trait</code> without disturbing callers</h3>
<p>I really like using <code>impl Trait</code> in argument position. I find code like this pretty easy to read:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">for_each_item</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">mut</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="n">Item</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">base</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">op</span><span class="p">(</span><span class="n">item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But if I were going to change this to use <code>dyn</code>, I couldn&rsquo;t just change from <code>impl</code> to <code>dyn</code>; I would have to add some kind of pointer type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">for_each_item</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="n">Item</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">base</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">op</span><span class="p">(</span><span class="n">item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This then disturbs callers, who can no longer write:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">for_each_item</span><span class="p">(</span><span class="n">some_iter</span><span class="p">,</span><span class="w"> </span><span class="o">|</span><span class="n">item</span><span class="o">|</span><span class="w"> </span><span class="n">process</span><span class="p">(</span><span class="n">item</span><span class="p">));</span><span class="w">
</span></span></span></code></pre></div><p>but must now write this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">for_each_item</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">some_iter</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="o">|</span><span class="n">item</span><span class="o">|</span><span class="w"> </span><span class="n">process</span><span class="p">(</span><span class="n">item</span><span class="p">));</span><span class="w">
</span></span></span></code></pre></div><p>You can work around this by writing some code like this&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">for_each_item</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">mut</span><span class="w"> </span><span class="n">base</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">mut</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="n">Item</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">for_each_item_dyn</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">base</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">op</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">for_each_item_dyn</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="n">Item</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">base</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">op</span><span class="p">(</span><span class="n">item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>but to me that just begs the question, why can&rsquo;t the <em>compiler</em> do this for me dang it?</p>
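<p>For what it&rsquo;s worth, here is a self-contained version of that wrapper pattern with a caller attached. The <code>sum_with_wrapper</code> function is a made-up example; it is only there to show that the call site looks the same as it did with the fully generic version.</p>

```rust
/// Thin generic shim: monomorphized per caller, but it only borrows its
/// arguments and forwards them to the single `dyn` body below.
fn for_each_item<Item>(
    mut base: impl Iterator<Item = Item>,
    mut op: impl FnMut(Item),
) {
    for_each_item_dyn(&mut base, &mut op)
}

/// The real loop, compiled once per `Item` type rather than once per
/// (iterator, closure) pair.
fn for_each_item_dyn<Item>(
    base: &mut dyn Iterator<Item = Item>,
    op: &mut dyn FnMut(Item),
) {
    for item in base {
        op(item);
    }
}

/// Hypothetical caller: no `&mut` needed at the call site.
fn sum_with_wrapper(values: Vec<i32>) -> i32 {
    let mut total = 0;
    for_each_item(values.into_iter(), |item| total += item);
    total
}
```

<p>Note that both parameters of the shim need to be declared <code>mut</code>, since the shim takes <code>&amp;mut</code> borrows of them.</p>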
<h3 id="async-functions-can-make-sendsync-issues-crop-up-in-functions">Async functions can make send/sync issues crop up in functions</h3>
<p>In the iterator example I was looking at a struct definition, but with <code>async fn</code> (and in the future with <code>gen</code>) these same issues arise quickly from functions. Consider this async function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">for_each_item</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">mut</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">AsyncFnMut</span><span class="p">(</span><span class="n">Item</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">base</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">op</span><span class="p">(</span><span class="n">item</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If you rewrite this function to use <code>dyn</code>, though, you&rsquo;ll find the resulting future is never <code>Send</code> or <code>Sync</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">for_each_item</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncFnMut</span><span class="p">(</span><span class="n">Item</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">base</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">op</span><span class="p">(</span><span class="n">item</span><span class="p">).</span><span class="k">box</span><span class="p">.</span><span class="k">await</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- assuming we fixed this
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="conclusions-and-questions">Conclusions and questions</h2>
<p>This has been a useful mental dump; I found it helpful to structure my thoughts.</p>
<p>One thing I noticed is that there is kind of a &ldquo;third reason&rdquo; to use <code>dyn</code> &ndash; to make your life a bit simpler. The versions of <code>Alternate</code> that used <code>dyn Iterator</code> and <code>dyn Fn</code> felt simpler to me than the fully parametric versions. That might be best addressed though by simplifying generic notation or adopting things like implied bounds.</p>
<p>Some other questions I have:</p>
<ul>
<li>Where else does the <code>Send</code> and <code>Sync</code> problem come up? Does it combine with the first use case (e.g., wanting to write a vector of heterogeneous tasks each of which is generic over whether it is send/sync)?</li>
<li>Maybe we can categorize real-life code examples and link them to these patterns.</li>
<li>Are there other reasons to use dyn trait that I didn&rsquo;t cover? Other ergonomic issues or pain points we&rsquo;d want to address as we go?</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In fact, if the code is byte-for-byte identical, LLVM and the linker will sometimes do this today, but it doesn&rsquo;t work reliably across compilation units as far as I know. And anyway there are often small differences.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Dyn async traits, part 10: Box box box</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/03/24/box-box-box/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/03/24/box-box-box/</id><published>2025-03-24T00:00:00+00:00</published><updated>2025-03-24T19:00:41+00:00</updated><content type="html"><![CDATA[<p>This article is a slight divergence from my <a href="https://smallcultfollowing.com/babysteps/series/rust-in-2025/">Rust in 2025</a> series. I wanted to share my latest thinking about how to support <code>dyn Trait</code> for traits with async functions and, in particular, how to do so in a way that is compatible with the <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/">soul of Rust</a>.</p>
<h2 id="background-why-is-this-hard">Background: why is this hard?</h2>
<p>Supporting <code>async fn</code> in dyn traits is a tricky balancing act. The challenge is reconciling two key things people love about Rust: its ability to express high-level, productive code <em>and</em> its focus on revealing low-level details. When it comes to async functions in traits, these two things are in direct tension, as I explained in <a href="https://smallcultfollowing.com/babysteps/blog/2021/09/30/dyn-async-traits-part-1/">my first blog post in this series</a> &ndash; written almost four years ago! (Geez.)</p>
<p>To see the challenge, consider this example <code>Signal</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Signal</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">signal</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In Rust today you can write a function that takes an <code>impl Signal</code> and invokes <code>signal</code> and everything feels pretty nice:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">send_signal_1</span><span class="p">(</span><span class="n">impl_trait</span>: <span class="kp">&amp;</span><span class="nc">impl</span><span class="w"> </span><span class="n">Signal</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">impl_trait</span><span class="p">.</span><span class="n">signal</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But what if I want to write that same function using a <code>dyn Signal</code>? If I write this&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">send_signal_2</span><span class="p">(</span><span class="n">dyn_trait</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">Signal</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">dyn_trait</span><span class="p">.</span><span class="n">signal</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w"> </span><span class="c1">//   ---------- ERROR
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;I get an error. Why is that? The answer is that the compiler needs to know what kind of future is going to be returned by <code>signal</code> so that it can be awaited. At minimum it needs to know how <em>big</em> that future is so it can allocate space for it<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. With an <code>impl Signal</code>, the compiler knows exactly what type of signal you have, so that&rsquo;s no problem: but with a <code>dyn Signal</code>, we don&rsquo;t, and hence we are stuck.</p>
<p>The most common solution to this problem is to <em>box</em> the future that results. The <a href="https://crates.io/crates/async-trait"><code>async-trait</code> crate</a>, for example, transforms <code>async fn signal(&amp;self)</code> to something like <code>fn signal(&amp;self) -&gt; Box&lt;dyn Future&lt;Output = ()&gt; + '_&gt;</code>. But doing that at the trait level means that we add overhead even when you use <code>impl Trait</code>; it also rules out some applications of Rust async, like embedded or kernel development.</p>
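<p>To show what &ldquo;boxing at the trait level&rdquo; amounts to, here is a hand-written sketch of that transformation. It is not the exact output of the <code>async-trait</code> crate: I&rsquo;ve wrapped the box in <code>Pin</code> so the result can be awaited directly, and <code>Counter</code>, <code>NoopWake</code>, <code>block_on</code>, and <code>count_after_one_signal</code> are illustrative helpers (the tiny executor only exists so the example has no dependencies).</p>

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The trait now names a concrete, sized return type, so it is dyn compatible.
// The cost: every implementation allocates, even when used via `impl Signal`.
trait Signal {
    fn signal(&self) -> Pin<Box<dyn Future<Output = ()> + '_>>;
}

// Illustrative implementor that counts how often it was signalled.
struct Counter {
    hits: Cell<u32>,
}

impl Signal for Counter {
    fn signal(&self) -> Pin<Box<dyn Future<Output = ()> + '_>> {
        Box::pin(async move {
            self.hits.set(self.hits.get() + 1);
        })
    }
}

async fn send_signal(dyn_trait: &dyn Signal) {
    dyn_trait.signal().await;
}

// Minimal busy-polling executor, an assumption made to keep this runnable
// without an async runtime. Only suitable for futures that never block.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

fn count_after_one_signal() -> u32 {
    let counter = Counter { hits: Cell::new(0) };
    block_on(send_signal(&counter));
    counter.hits.get()
}
```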
<p>So the name of the game is to find ways to let people use <code>dyn Trait</code> that are both convenient <em>and</em> flexible. And that turns out to be pretty hard!</p>
<h2 id="the-box-box-box-design-in-a-nutshell">The &ldquo;box box box&rdquo; design in a nutshell</h2>
<p>I&rsquo;ve been digging back into the problem lately in a series of conversations with <a href="https://github.com/compiler-errors">Michael Goulet (aka, compiler-errors)</a> and it&rsquo;s gotten me thinking about a fresh approach I call &ldquo;box box box&rdquo;.</p>
<p>The &ldquo;box box box&rdquo; design starts with the <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/21/dyn-async-traits-part-9-callee-site-selection/">call-site selection</a> approach. In this approach, when you call <code>dyn_trait.signal()</code>, the type you get back is a <code>dyn Future</code> &ndash; i.e., an unsized value. This can&rsquo;t be used directly. Instead, you have to allocate storage for it. The easiest and most common way to do that is to box it, which can be done with the new <code>.box</code> operator:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">send_signal_2</span><span class="p">(</span><span class="n">dyn_trait</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">Signal</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">dyn_trait</span><span class="p">.</span><span class="n">signal</span><span class="p">().</span><span class="k">box</span><span class="p">.</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        ------------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Results in a `Box&lt;dyn Future&lt;Output = ()&gt;&gt;`.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This approach is fairly straightforward to explain. When you call an async function through <code>dyn Trait</code>, it results in a <code>dyn Future</code>, which has to be stored somewhere before you can use it. The easiest option is to use the <code>.box</code> operator to store it in a box; that gives you a <code>Box&lt;dyn Future&gt;</code>, and you can await that.</p>
<p>But this simple explanation belies two fairly fundamental changes to Rust. First, it changes the relationship of <code>Trait</code> and <code>dyn Trait</code>. Second, it introduces this <code>.box</code> operator, which would be the first stable use of the <code>box</code> keyword<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. It seems odd to introduce the keyword just for this one use &ndash; where else could it be used?</p>
<p>As it happens, I think both of these fundamental changes could be very good things. The point of this post is to explain what doors they open up and where they might take us.</p>
<h2 id="change-0-unsized-return-value-methods">Change 0: Unsized return value methods</h2>
<p>Let&rsquo;s start with the core proposal. For every trait <code>Foo</code>, we add inherent methods<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> to <code>dyn Foo</code> reflecting its methods:</p>
<ul>
<li>For every fn <code>f</code> in <code>Foo</code> that is <a href="https://doc.rust-lang.org/reference/items/traits.html#dyn-compatibility">dyn compatible</a>, we add a <code>&lt;dyn Foo&gt;::f</code> that just calls <code>f</code> through the vtable.</li>
<li>For every fn <code>f</code> in <code>Foo</code> that returns an <code>impl Trait</code> value but would otherwise be <a href="https://doc.rust-lang.org/reference/items/traits.html#dyn-compatibility">dyn compatible</a> (e.g., no generic arguments<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>, no reference to <code>Self</code> beyond the <code>self</code> parameter, etc.), we add a <code>&lt;dyn Foo&gt;::f</code> method that is defined to return a <code>dyn Trait</code>.
<ul>
<li>This includes async fns, which are sugar for functions that return <code>impl Future</code>.</li>
</ul>
</li>
</ul>
<p>In fact, method dispatch <em>already</em> adds &ldquo;pseudo&rdquo; inherent methods to <code>dyn Foo</code>, so this wouldn&rsquo;t change anything in terms of which methods are resolved. The difference is that today <code>dyn Foo</code> is only allowed if all methods in the trait are dyn compatible, whereas under this proposal some non-dyn-compatible methods would be added with modified signatures.</p>
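<p>You can approximate the flavor of this in today&rsquo;s Rust with a companion trait, though only by boxing eagerly rather than returning an unsized <code>dyn Future</code>. The sketch below is that approximation and not the proposal itself: <code>DynSignal</code>, <code>signal_boxed</code>, <code>Bell</code>, <code>Horn</code>, <code>block_on</code>, and <code>ring_all</code> are all names I made up for the example.</p>

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The "real" trait: has an async fn, so `dyn Signal` is not allowed today.
trait Signal {
    async fn signal(&self);
}

// Hypothetical companion trait with the "modified signature": same method,
// but returning a boxed `dyn Future` so it fits in a vtable.
trait DynSignal {
    fn signal_boxed(&self) -> Pin<Box<dyn Future<Output = ()> + '_>>;
}

// Every `Signal` gets the erased method for free.
impl<T: Signal> DynSignal for T {
    fn signal_boxed(&self) -> Pin<Box<dyn Future<Output = ()> + '_>> {
        Box::pin(self.signal())
    }
}

struct Bell<'a>(&'a Cell<u32>);
struct Horn<'a>(&'a Cell<u32>);

impl Signal for Bell<'_> {
    async fn signal(&self) {
        self.0.set(self.0.get() + 1);
    }
}

impl Signal for Horn<'_> {
    async fn signal(&self) {
        self.0.set(self.0.get() + 10);
    }
}

// Minimal busy-polling executor so the sketch runs without a runtime.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

fn ring_all() -> u32 {
    let total = Cell::new(0);
    let bell = Bell(&total);
    let horn = Horn(&total);
    // Two different concrete types behind one erased interface.
    let signals: [&dyn DynSignal; 2] = [&bell, &horn];
    for signal in signals {
        block_on(signal.signal_boxed());
    }
    total.get()
}
```

<p>The difference from the proposal is where the allocation decision lives: here the companion trait always boxes, whereas with an unsized return value the caller would pick the storage.</p>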
<h2 id="change-1-dyn-compatibility">Change 1: Dyn compatibility</h2>
<p>Change 0 only makes sense if it is possible to create a <code>dyn Trait</code> even though it contains some methods (e.g., async functions) that are not dyn compatible. This revisits <a href="https://rust-lang.github.io/rfcs/0255-object-safety.html">RFC #255</a>, in which we decided that the <code>dyn Trait</code> type should also implement the trait <code>Trait</code>. I was a big proponent of <a href="https://rust-lang.github.io/rfcs/0255-object-safety.html">RFC #255</a> at the time, but I&rsquo;ve since decided I was mistaken<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>. Let&rsquo;s discuss.</p>
<p>Two rules today make it possible for <code>dyn Trait</code> to implement <code>Trait</code>:</p>
<ol>
<li>We disallow <code>dyn Trait</code> unless the trait <code>Trait</code> is <em><a href="https://doc.rust-lang.org/reference/items/traits.html#dyn-compatibility">dyn compatible</a></em>, meaning that it only has methods that can be added to a vtable.</li>
<li>We require that the values of all associated types be explicitly specified in the <code>dyn Trait</code>. So <code>dyn Iterator&lt;Item = u32&gt;</code> is legal but not <code>dyn Iterator</code> on its own.</li>
</ol>
<h3 id="dyn-compatibility-can-be-powerful">&ldquo;dyn compatibility&rdquo; can be powerful</h3>
<p>The fact that <code>dyn Trait</code> implements <code>Trait</code> is at times quite powerful. It means for example that I can write an implementation like this one:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">RcWrapper</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">r</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="n">RefCell</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">RcWrapper</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Iterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">T</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">r</span><span class="p">.</span><span class="n">borrow_mut</span><span class="p">().</span><span class="n">next</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This impl makes <code>RcWrapper&lt;T&gt;</code> implement <code>Iterator</code> for any type <code>T</code> that implements <code>Iterator</code>, <em>including</em> dyn trait types like <code>RcWrapper&lt;dyn Iterator&lt;Item = u32&gt;&gt;</code>. Neat.</p>
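<p>Here is a small self-contained sketch of that impl in use with a dyn trait type. The helper <code>collect_dyn</code> is a hypothetical name introduced just for illustration, and the impl reaches the <code>RefCell</code> through the wrapper&rsquo;s <code>r</code> field.</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct RcWrapper<T: ?Sized> {
    r: Rc<RefCell<T>>,
}

impl<T> Iterator for RcWrapper<T>
where
    T: ?Sized + Iterator,
{
    type Item = T::Item;

    fn next(&mut self) -> Option<T::Item> {
        // Borrow the shared iterator mutably and advance it.
        self.r.borrow_mut().next()
    }
}

// Hypothetical helper: wrap a type-erased iterator and drain it.
fn collect_dyn() -> Vec<u32> {
    // The concrete `vec::IntoIter<u32>` is coerced to `dyn Iterator`.
    let inner: Rc<RefCell<dyn Iterator<Item = u32>>> =
        Rc::new(RefCell::new(vec![1_u32, 2, 3].into_iter()));
    let wrapper = RcWrapper { r: inner };
    // `collect` is available because the wrapper itself implements
    // `Iterator`, which relies on `dyn Iterator<Item = u32>: Iterator`.
    wrapper.collect()
}
```

<p>The interesting part is the where clause: <code>T: ?Sized + Iterator</code> is satisfied by <code>dyn Iterator&lt;Item = u32&gt;</code> only because the dyn type implements the trait.</p>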
<h3 id="dyn-compatibility-doesnt-truly-live-up-to-its-promise">&ldquo;dyn compatibility&rdquo; doesn&rsquo;t truly live up to its promise</h3>
<p>Powerful as it is, the idea of <code>dyn Trait</code> implementing <code>Trait</code> doesn&rsquo;t quite live up to its promise. What you really want is that you could replace any <code>impl Trait</code> with <code>dyn Trait</code> and things would work. But that&rsquo;s just not true because <code>dyn Trait</code> is <code>?Sized</code>. So actually you don&rsquo;t get a very &ldquo;smooth experience&rdquo;. What&rsquo;s more, although the compiler gives you a <code>dyn Trait: Trait</code> impl, it doesn&rsquo;t give you impls for <em>references</em> to <code>dyn Trait</code> &ndash; so e.g. given this trait</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Compute</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">compute</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If I have a <code>Box&lt;dyn Compute&gt;</code>, I can&rsquo;t give that to a function that takes an <code>impl Compute</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">do_compute</span><span class="p">(</span><span class="n">i</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">Compute</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">call_compute</span><span class="p">(</span><span class="n">b</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Compute</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_compute</span><span class="p">(</span><span class="n">b</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>To make that work, somebody has to explicitly provide an impl like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Compute</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
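</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Compute</span><span class="p">,</span><span class="w">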
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and people often don&rsquo;t.</p>
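<p>For reference, a filled-in version of that forwarding impl might look like the sketch below. The <code>CountCalls</code> type and the <code>demo</code> function are hypothetical and exist only so the effect can be observed. Note the extra <code>Compute</code> bound on <code>I</code>, which the body needs in order to forward the call.</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

trait Compute {
    fn compute(&self);
}

// Forwarding impl: a box of anything that can compute can itself compute.
impl<I> Compute for Box<I>
where
    I: ?Sized + Compute,
{
    fn compute(&self) {
        // Deref twice (`&Box<I>` -> `Box<I>` -> `I`) so that we call the
        // inner impl rather than recursing into this one.
        (**self).compute()
    }
}

// Hypothetical implementor that counts how often it is invoked.
struct CountCalls {
    calls: Rc<Cell<u32>>,
}

impl Compute for CountCalls {
    fn compute(&self) {
        self.calls.set(self.calls.get() + 1);
    }
}

fn do_compute(i: impl Compute) {
    i.compute();
}

fn call_compute(b: Box<dyn Compute>) {
    do_compute(b); // OK now, thanks to the forwarding impl
}

fn demo() -> u32 {
    let calls = Rc::new(Cell::new(0));
    call_compute(Box::new(CountCalls { calls: Rc::clone(&calls) }));
    calls.get()
}
```

<p>The same boilerplate has to be repeated for <code>&amp;I</code>, <code>Rc&lt;I&gt;</code>, and so on, which is part of why it so often goes unwritten.</p>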
<h3 id="dyn-compatibility-can-be-limiting">&ldquo;dyn compatibility&rdquo; can be limiting</h3>
<p>However, the requirement that <code>dyn Trait</code> implement <code>Trait</code> can be limiting. Imagine a trait like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">ReportError</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">report</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">error</span>: <span class="nc">Error</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">report_to</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">error</span>: <span class="nc">Error</span><span class="p">,</span><span class="w"> </span><span class="n">target</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">ErrorTarget</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                                ------------------------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                                Generic argument.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This trait has two methods. The <code>report</code> method is dyn-compatible, no problem. The <code>report_to</code> method has an <code>impl Trait</code> argument and is therefore generic, so it is not dyn-compatible<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> (well, at least not under today&rsquo;s rules, but I&rsquo;ll get to that).</p>
<p>(The reason <code>report_to</code> is not dyn compatible: we need to make distinct monomorphized copies tailored to the type of the <code>target</code> argument. But the vtable has to be prepared in advance, so we don&rsquo;t know which monomorphized version to use.)</p>
<p>And yet, just because <code>report_to</code> is not dyn compatible doesn&rsquo;t mean that a <code>dyn ReportError</code> would be useless. What if I only plan to call <code>report</code>, as in a function like this?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">report_all</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">errors</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Error</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">report</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">ReportError</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">e</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">errors</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">report</span><span class="p">.</span><span class="n">report</span><span class="p">(</span><span class="n">e</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Rust&rsquo;s current rules rule out a function like this, but in practice this kind of scenario comes up quite a lot. In fact, it comes up so often that we added a language feature to accommodate it (at least kind of): you can add a <code>where Self: Sized</code> clause to your method to exempt it from dynamic dispatch. This is the reason that <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html"><code>Iterator</code></a> can be dyn compatible even when it has a bunch of generic helper methods like <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.map"><code>map</code></a> and <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.flat_map"><code>flat_map</code></a>.</p>
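<p>As a sketch of that escape hatch applied to <code>ReportError</code>: adding <code>where Self: Sized</code> to <code>report_to</code> makes <code>&amp;dyn ReportError</code> legal today. The <code>Error</code> and <code>ErrorTarget</code> definitions, the <code>Collector</code> type, and <code>demo</code> are stand-ins I made up so the example is self-contained.</p>

```rust
use std::cell::RefCell;

// Stand-in types, assumed for illustration only.
struct Error(String);

trait ErrorTarget {
    fn accept(&self, error: Error);
}

trait ReportError {
    fn report(&self, error: Error);

    // The `where Self: Sized` clause exempts this generic method from
    // dynamic dispatch, so the trait as a whole stays dyn compatible.
    fn report_to(&self, error: Error, target: impl ErrorTarget)
    where
        Self: Sized;
}

// Hypothetical implementor that just remembers what it was given.
struct Collector {
    seen: RefCell<Vec<String>>,
}

impl ReportError for Collector {
    fn report(&self, error: Error) {
        self.seen.borrow_mut().push(error.0);
    }

    fn report_to(&self, error: Error, target: impl ErrorTarget) {
        target.accept(error);
    }
}

fn report_all(errors: Vec<Error>, report: &dyn ReportError) {
    for e in errors {
        report.report(e); // `report_to` is not callable through `dyn`
    }
}

fn demo() -> Vec<String> {
    let collector = Collector { seen: RefCell::new(Vec::new()) };
    let errors = vec![Error("a".to_string()), Error("b".to_string())];
    report_all(errors, &collector);
    collector.seen.into_inner()
}
```

<p>The downside is that the trait author has to opt in, method by method, ahead of time.</p>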
<h3 id="what-does-all-this-have-to-do-with-afidt">What does all this have to do with AFIDT?</h3>
<p>Let me pause here, as I imagine some of you are wondering what all of this &ldquo;dyn compatibility&rdquo; stuff has to do with AFIDT. The bottom line is that the requirement that the <code>dyn Trait</code> type implement <code>Trait</code> means that we cannot put any kind of &ldquo;special rules&rdquo; on <code>dyn</code> dispatch, and that is not compatible with requiring a <code>.box</code> operator when you call async functions through a <code>dyn</code> trait. Recall that with our <code>Signal</code> trait, you could call the <code>signal</code> method on an <code>impl Signal</code> without any boxing:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">send_signal_1</span><span class="p">(</span><span class="n">impl_trait</span>: <span class="kp">&amp;</span><span class="nc">impl</span><span class="w"> </span><span class="n">Signal</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">impl_trait</span><span class="p">.</span><span class="n">signal</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But when I called it on a <code>dyn Signal</code>, I had to write <code>.box</code> to tell the compiler how to deal with the <code>dyn Future</code> that gets returned:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">send_signal_2</span><span class="p">(</span><span class="n">dyn_trait</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">Signal</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">dyn_trait</span><span class="p">.</span><span class="n">signal</span><span class="p">().</span><span class="k">box</span><span class="p">.</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Indeed, the fact that <code>Signal::signal</code> returns an <code>impl Future</code> but <code>&lt;dyn Signal&gt;::signal</code> returns a <code>dyn Future</code> already demonstrates the problem. All <code>impl Future</code> types are known to be <code>Sized</code> and <code>dyn Future</code> is not, so the type signature of <code>&lt;dyn Signal&gt;::signal</code> is not the same as the type signature declared in the trait. Huh.</p>
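<p>Since <code>.box</code> does not exist yet, here is a rough sketch of how the same shape can be emulated by hand today. It assumes <code>Signal</code> is declared with a plain <code>async fn signal(&amp;self)</code>. The companion trait <code>DynSignal</code>, its <code>signal_boxed</code> method, the <code>Bell</code> type, and the tiny <code>block_on</code> executor are all hypothetical names for this illustration, and <code>block_on</code> simply busy-polls, which is only reasonable for futures that complete immediately.</p>

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait Signal {
    async fn signal(&self);
}

// Hand-written stand-in for the proposed `<dyn Signal>::signal`: the
// future is boxed so that the return type is sized.
trait DynSignal {
    fn signal_boxed(&self) -> Pin<Box<dyn Future<Output = ()> + '_>>;
}

impl<T: Signal> DynSignal for T {
    fn signal_boxed(&self) -> Pin<Box<dyn Future<Output = ()> + '_>> {
        Box::pin(self.signal())
    }
}

struct Bell {
    rings: Cell<u32>,
}

impl Signal for Bell {
    async fn signal(&self) {
        self.rings.set(self.rings.get() + 1);
    }
}

// Plays the role of `dyn_trait.signal().box.await` from the post.
async fn send_signal_2(dyn_trait: &dyn DynSignal) {
    dyn_trait.signal_boxed().await;
}

// Minimal executor: a waker that does nothing plus a polling loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future + ?Sized>(mut fut: Pin<Box<F>>) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo() -> u32 {
    let bell = Bell { rings: Cell::new(0) };
    block_on(Box::pin(send_signal_2(&bell)));
    bell.rings.get()
}
```

<p>Notice that the mismatch described above shows up here too: <code>Signal::signal</code> and <code>DynSignal::signal_boxed</code> have different return types, which is why the sketch needs two traits.</p>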
<h3 id="associated-type-values-are-needed-for-dyn-compatibility">Associated type values are needed for dyn compatibility</h3>
<p>Today I cannot write a type like <code>dyn Iterator</code> without specifying the value of the associated type <code>Item</code>. To see why this restriction is needed, consider this generic function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">drop_all</span><span class="o">&lt;</span><span class="n">I</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&gt;</span><span class="p">(</span><span class="n">iter</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">I</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">n</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iter</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">std</span>::<span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="n">n</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If you invoked <code>drop_all</code> with an <code>&amp;mut dyn Iterator</code> that did not specify <code>Item</code>, how could we know the type of <code>n</code>? We wouldn&rsquo;t have any idea how much space it needs. But if you invoke <code>drop_all</code> with <code>&amp;mut dyn Iterator&lt;Item = u32&gt;</code>, there is no problem. We don&rsquo;t know which <code>next</code> method is being called, but we know it&rsquo;s returning a <code>u32</code>.</p>
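<p>A quick way to see this for yourself is the sketch below, a variant of <code>drop_all</code> that I&rsquo;ve modified to also return how many items it dropped and the statically known size of <code>I::Item</code>. The <code>demo</code> function is a hypothetical driver.</p>

```rust
fn drop_all<I: ?Sized + Iterator>(iter: &mut I) -> (usize, usize) {
    let mut dropped = 0;
    while let Some(n) = iter.next() {
        std::mem::drop(n);
        dropped += 1;
    }
    // Even when `I` is a dyn type, `I::Item` is a concrete type whose
    // size is known at compile time.
    (dropped, std::mem::size_of::<I::Item>())
}

fn demo() -> (usize, usize) {
    let mut numbers = vec![1_u32, 2, 3].into_iter();
    // Erase the concrete iterator type but keep `Item = u32`.
    let iter: &mut dyn Iterator<Item = u32> = &mut numbers;
    drop_all(iter)
}
```

<p>Here <code>I</code> is inferred to be <code>dyn Iterator&lt;Item = u32&gt;</code>, so the compiler can reserve four bytes for each <code>n</code> without knowing which <code>next</code> will run.</p>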
<h3 id="associated-type-values-are-limiting">Associated type values are limiting</h3>
<p>And yet, just as we saw before, the requirement to list associated types can be limiting. If I have a <code>dyn Iterator</code> and I only call <code>size_hint</code>, for example, then why do I need to know the <code>Item</code> type?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">size_hint</span><span class="p">(</span><span class="n">iter</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">sh</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iter</span><span class="p">.</span><span class="n">size_hint</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But I can&rsquo;t write code like this today. Instead I have to make this function generic which basically defeats the whole purpose of using <code>dyn Iterator</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">size_hint</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">iter</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">sh</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iter</span><span class="p">.</span><span class="n">size_hint</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we dropped the requirement that every <code>dyn Iterator</code> type implements <code>Iterator</code>, we could be more selective, allowing you to invoke methods that don&rsquo;t use the <code>Item</code> associated type but disallowing those that do.</p>
<h3 id="a-proposal-for-expanded-dyn-trait-usability">A proposal for expanded <code>dyn Trait</code> usability</h3>
<p>So that brings us to the full proposal to permit <code>dyn Trait</code> in cases where the trait is not fully dyn compatible:</p>
<ul>
<li><code>dyn Trait</code> types would be allowed for any trait.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></li>
<li><code>dyn Trait</code> types would not require associated types to be specified.</li>
<li>dyn compatible methods are exposed as inherent methods on the <code>dyn Trait</code> type. We would disallow access to the method if its signature references associated types not specified on the <code>dyn Trait</code> type.</li>
<li><code>dyn Trait</code> types that specify all of their associated types would be considered to implement <code>Trait</code> if the trait is fully dyn compatible.<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></li>
</ul>
<h2 id="the-box-keyword">The <code>box</code> keyword</h2>
<blockquote>
<p>A lot of things get easier if you are willing to call malloc.</p>
<p>&ndash; Josh Triplett, recently.</p>
</blockquote>
<p>Rust has reserved the <code>box</code> keyword since 1.0, but we&rsquo;ve never allowed it in stable Rust. The original intention was that the term <em>box</em> would be a generic term to refer to any &ldquo;smart pointer&rdquo;-like pattern, so <code>Rc</code> would be a &ldquo;reference counted box&rdquo; and so forth. The <code>box</code> keyword would then be a generic way to allocate boxed values of any type; unlike <code>Box::new</code>, it would do &ldquo;emplacement&rdquo;, so that no intermediate values were allocated. With the passage of time I no longer think this is such a good idea. But I <em>do</em> see a lot of value in having a keyword to ask the compiler to automatically create <em>boxes</em>. In fact, I see a <em>lot</em> of places where that could be useful.</p>
<h3 id="boxed-expressions">boxed expressions</h3>
<p>The first place is indeed the <code>.box</code> operator that could be used to put a value into a box. Unlike <code>Box::new</code>, using <code>.box</code> would allow the compiler to guarantee that no intermediate value is created, a property called <em>emplacement</em>. Consider this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">new</span><span class="p">([</span><span class="mi">0_</span><span class="k">u32</span><span class="p">;</span><span class="w"> </span><span class="mi">1024</span><span class="p">]);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Rust&rsquo;s semantics today require (1) allocating a 4KB buffer on the stack and zeroing it; (2) allocating a box in the heap; and then (3) copying memory from one to the other. This is a violation of our Zero Cost Abstraction promise: no C programmer would write code like that. But if you write <code>[0_u32; 1024].box</code>, we can allocate the box up front and initialize it in place.<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup></p>
<p>The same principle applies to calling functions that return an unsized type. This isn&rsquo;t allowed today, but we&rsquo;ll need some way to handle it if we want to have <code>async fn</code> return <code>dyn Future</code>. The reason we can&rsquo;t naively support it is that, in our existing ABI, the caller is responsible for allocating enough space to store the return value and for passing the address of that space into the callee, who then writes into it. But with a <code>dyn Future</code> return value, the caller can&rsquo;t know how much space to allocate. So they would have to do something else, like passing in a callback that, given the correct amount of space, performs the allocation. The most common case would be to just pass in <code>malloc</code>.</p>
<p>The best ABI for unsized return values is unclear to me, but we don&rsquo;t have to solve that right now: the ABI can (and should) remain unstable. But whatever the final ABI becomes, when you call such a function in the context of a <code>.box</code> expression, the result is that the callee creates a <code>Box</code> to store the result.<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup></p>
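<p>Returning to the <code>Box::new([0_u32; 1024])</code> example for a moment: one workaround people reach for today is to go through <code>Vec</code>, whose buffer is allocated on the heap from the start, and then convert. This is only a sketch of that idiom, with <code>zeroed_on_heap</code> as a hypothetical name. It avoids creating the array as a stack value in the source, but unlike the proposed <code>.box</code> it offers no language-level emplacement guarantee, and it costs a runtime length check.</p>

```rust
use std::convert::TryFrom;

fn zeroed_on_heap() -> Box<[u32; 1024]> {
    // `vec!` allocates its buffer on the heap directly.
    let slice: Box<[u32]> = vec![0_u32; 1024].into_boxed_slice();
    // Converting a boxed slice to a boxed array checks the length at
    // runtime and reuses the existing allocation.
    <Box<[u32; 1024]> as TryFrom<Box<[u32]>>>::try_from(slice)
        .expect("slice has exactly 1024 elements")
}
```

<p>That this takes two conversions and an <code>expect</code> is a decent argument for having a first-class way to say &ldquo;build this value in a box&rdquo;.</p>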
<h3 id="boxed-async-functions-to-permit-recursion">boxed async functions to permit recursion</h3>
<p>If you try to write an async function that calls itself today, you get an error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">fibonacci</span><span class="p">(</span><span class="n">a</span>: <span class="kt">u32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="mi">0</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="mi">1</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">fibonacci</span><span class="p">(</span><span class="n">a</span><span class="o">-</span><span class="mi">1</span><span class="p">).</span><span class="k">await</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">fibonacci</span><span class="p">(</span><span class="n">a</span><span class="o">-</span><span class="mi">2</span><span class="p">).</span><span class="k">await</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem is that we cannot determine statically how much stack space to allocate. The solution is to rewrite to a boxed return value. This <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2024&amp;gist=b36baf737a2811412e2970103fee25ee">compiles</a> because the compiler can allocate new stack frames as needed.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">fibonacci</span><span class="p">(</span><span class="n">a</span>: <span class="kt">u32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Pin</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Box</span>::<span class="n">pin</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="mi">0</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="mi">1</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">fibonacci</span><span class="p">(</span><span class="n">a</span><span class="o">-</span><span class="mi">1</span><span class="p">).</span><span class="k">await</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">fibonacci</span><span class="p">(</span><span class="n">a</span><span class="o">-</span><span class="mi">2</span><span class="p">).</span><span class="k">await</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But wouldn&rsquo;t it be nice if we could request this directly?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">box</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">fibonacci</span><span class="p">(</span><span class="n">a</span>: <span class="kt">u32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="mi">0</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="mi">1</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">fibonacci</span><span class="p">(</span><span class="n">a</span><span class="o">-</span><span class="mi">1</span><span class="p">).</span><span class="k">await</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">fibonacci</span><span class="p">(</span><span class="n">a</span><span class="o">-</span><span class="mi">2</span><span class="p">).</span><span class="k">await</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="boxed-structs-can-be-recursive">boxed structs can be recursive</h3>
<p>A similar problem arises with recursive structs:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">List</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="c1">// ERROR
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The compiler tells you</p>
<pre tabindex="0"><code>error[E0072]: recursive type `List` has infinite size
 --&gt; src/lib.rs:1:1
  |
1 | struct List {
  | ^^^^^^^^^^^
2 |     value: u32,
3 |     next: Option&lt;List&gt;, // ERROR
  |                  ---- recursive without indirection
  |
help: insert some indirection (e.g., a `Box`, `Rc`, or `&amp;`) to break the cycle
  |
3 |     next: Option&lt;Box&lt;List&gt;&gt;, // ERROR
  |                  ++++    +
</code></pre><p>As it suggests, to work around this you can introduce a <code>Box</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">List</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This though is kind of weird because now the head of the list is stored &ldquo;inline&rdquo; but future nodes are heap-allocated. I personally usually wind up with a pattern more like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">List</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="n">ListData</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ListData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, however, I can&rsquo;t create values with <code>List { value: 22, next: None }</code> syntax and I also can&rsquo;t do pattern matching. Annoying. Wouldn&rsquo;t it be nice if the compiler could just suggest adding a <code>box</code> keyword when you declare the struct:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">box</span><span class="w"> </span><span class="k">struct</span> <span class="nc">List</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and have <code>List { value: 22, next: None }</code> automatically allocate the box for me? The ideal is that the presence of a box is now completely transparent, so I can pattern match and so forth fully transparently:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">box</span><span class="w"> </span><span class="k">struct</span> <span class="nc">List</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">list</span>: <span class="kp">&amp;</span><span class="nc">List</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">List</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w"> </span><span class="n">next</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">list</span><span class="p">;</span><span class="w"> </span><span class="c1">// etc
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="boxed-enums-can-be-recursive-and-right-sized">boxed enums can be recursive <em>and</em> right-sized</h3>
<p>Enums too cannot reference themselves. Being able to declare something like this would be really nice:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">box</span><span class="w"> </span><span class="k">enum</span> <span class="nc">AstExpr</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Value</span><span class="p">(</span><span class="kt">u32</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">If</span><span class="p">(</span><span class="n">AstExpr</span><span class="p">,</span><span class="w"> </span><span class="n">AstExpr</span><span class="p">,</span><span class="w"> </span><span class="n">AstExpr</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In fact, I still remember when I used Swift for the first time. I wrote a similar enum and Xcode helpfully prompted me, &ldquo;do you want to declare this enum as <a href="https://www.hackingwithswift.com/example-code/language/what-are-indirect-enums"><code>indirect</code></a>?&rdquo; I remember being quite jealous that it was such a simple edit.</p>
<p>However, there is another interesting thing about a <code>box enum</code>. The way I imagine it, creating an instance of the enum would always allocate a fresh box. This means that the enum cannot be changed from one variant to another without allocating fresh storage. This in turn means that you could allocate that box to <em>exactly</em> the size you need for that particular variant.<sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup> So, for your <code>AstExpr</code>, not only could it be recursive, but when you allocate an <code>AstExpr::Value</code> you only need to allocate space for a <code>u32</code>, whereas an <code>AstExpr::If</code> would be a different size. (We could even start to do &ldquo;tagged pointer&rdquo; tricks so that e.g. <code>AstExpr::Value</code> is stored without any allocation at all.)</p>
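<p>To make the size argument concrete, here is a small sketch in today&rsquo;s Rust. The enum below is a hypothetical stand-in for <code>AstExpr</code> that uses ordinary <code>Box</code> fields for the recursion, and the helper function names are made up for illustration. It shows that a plain <code>Box&lt;AstExpr&gt;</code> always reserves room for the largest variant, which is the overhead a right-sized <code>box enum</code> could avoid.</p>

```rust
use std::mem::size_of;

// Hypothetical stand-in for the post's `AstExpr`, written with today's `Box`.
#[allow(dead_code)]
enum AstExpr {
    Value(u32),
    If(Box<AstExpr>, Box<AstExpr>, Box<AstExpr>),
}

/// Bytes that `Box::new(expr)` requests for any `AstExpr`, whatever the variant.
fn boxed_allocation_size() -> usize {
    size_of::<AstExpr>()
}

/// Bytes of payload that the `Value` variant actually carries.
fn value_payload_size() -> usize {
    size_of::<u32>()
}
```

<p>Every <code>Box&lt;AstExpr&gt;</code> holding a <code>Value</code> pays for three pointers it never uses. Under the proposal as I read it, the allocation for <code>Value</code> could shrink to roughly the payload size.</p>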
<h3 id="boxed-enum-variants-to-avoid-unbalanced-enum-sizes">boxed enum variants to avoid unbalanced enum sizes</h3>
<p>Another option would be to have particular enum <em>variants</em> that get boxed but not the enum as a whole:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">AstExpr</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Value</span><span class="p">(</span><span class="kt">u32</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">box</span><span class="w"> </span><span class="n">If</span><span class="p">(</span><span class="n">AstExpr</span><span class="p">,</span><span class="w"> </span><span class="n">AstExpr</span><span class="p">,</span><span class="w"> </span><span class="n">AstExpr</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This would be useful in cases where you <em>do</em> want to be able to overwrite one enum value with another without necessarily reallocating, but you have enum variants of widely varying size, or some variants that are recursive. A boxed variant would basically be desugared to something like the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">AstExpr</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Value</span><span class="p">(</span><span class="kt">u32</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">If</span><span class="p">(</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">AstExprIf</span><span class="o">&gt;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">AstExprIf</span><span class="p">(</span><span class="n">AstExpr</span><span class="p">,</span><span class="w"> </span><span class="n">AstExpr</span><span class="p">,</span><span class="w"> </span><span class="n">AstExpr</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>clippy has a <a href="https://rust-lang.github.io/rust-clippy/master/index.html#large_enum_variant">useful lint <code>large_enum_variant</code></a> that aims to identify this case, but once the lint triggers, it&rsquo;s not able to offer an actionable suggestion. With the box keyword there&rsquo;d be a trivial rewrite that requires zero code changes.</p>
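<p>As a rough illustration of what the manual rewrite costs today, here is a sketch with two hypothetical enums (<code>Inline</code> and <code>Boxed</code> are names I made up, as is the 1024-byte payload). Boxing the large variant by hand shrinks the enum, but the <code>Box</code> then shows up at every construction site.</p>

```rust
use std::mem::size_of;

// Hypothetical enum with one oversized variant, the shape `large_enum_variant` flags.
#[allow(dead_code)]
enum Inline {
    Small(u32),
    Large([u8; 1024]),
}

// The same enum after boxing the large variant by hand.
enum Boxed {
    Small(u32),
    Large(Box<[u8; 1024]>),
}

fn inline_size() -> usize {
    size_of::<Inline>()
}

fn boxed_size() -> usize {
    size_of::<Boxed>()
}

// Matching still works, but callers must now write `Boxed::Large(Box::new(..))`.
fn first_byte(e: &Boxed) -> u8 {
    match e {
        Boxed::Small(v) => *v as u8,
        Boxed::Large(bytes) => bytes[0],
    }
}
```

<p>With a <code>box</code> variant, the hope is that the size win comes without touching the code that builds or matches the enum.</p>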
<h3 id="box-patterns-and-types">box patterns and types</h3>
<p>If we&rsquo;re enabling the use of <code>box</code> elsewhere, we ought to allow it in patterns and types:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">s</span>: <span class="nc">box</span><span class="w"> </span><span class="n">Struct</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">box</span><span class="w"> </span><span class="n">Struct</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">field</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">s</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="isnt-it-unfortunate-that-boxnewv-and-vbox-would-behave-differently">Isn&rsquo;t it unfortunate that <code>Box::new(v)</code> and <code>v.box</code> would behave differently?</h3>
<p>Under my proposal, <code>v.box</code> would be the preferred form, since it would allow the compiler to do more optimization. And yes, that&rsquo;s unfortunate, given that there are 10 years of code using <code>Box::new</code>. Not really a big deal though. In most of the cases we accept today, it doesn&rsquo;t matter and/or LLVM already optimizes it. In the future I do think we should consider extensions to make <code>Box::new</code> (as well as <code>Rc::new</code> and other similar constructors) be just as optimized as <code>.box</code>, but I don&rsquo;t think those have to block <em>this</em> proposal.</p>
<h3 id="is-it-weird-to-special-case-box-and-not-handle-other-kinds-of-smart-pointers">Is it weird to special case box and not handle other kinds of smart pointers?</h3>
<p>Yes and no. On the one hand, I would like the ability to declare that a struct is <em>always</em> wrapped in an <code>Rc</code> or <code>Arc</code>. I find myself doing things like the following all too often:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="nc">Arc</span><span class="o">&lt;</span><span class="n">ContextData</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ContextData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">counter</span>: <span class="nc">AtomicU32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>On the other hand, <code>box</code> is very special. It&rsquo;s kind of unique in that it represents full ownership of the contents, which means a <code>T</code> and a <code>Box&lt;T&gt;</code> are semantically equivalent &ndash; there is no place you can use a <code>T</code> where a <code>Box&lt;T&gt;</code> won&rsquo;t also work &ndash; unless <code>T: Copy</code>. This is not true for <code>T</code> and <code>Rc&lt;T&gt;</code> or most other smart pointers.</p>
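<p>One way to see the difference in today&rsquo;s Rust is to try getting the <code>T</code> back out. This is only a sketch, and the function names <code>unbox</code> and <code>unwrap_rc</code> are made up for illustration. A <code>Box&lt;T&gt;</code> always gives up its contents with a plain deref, while an <code>Rc&lt;T&gt;</code> can only do so if nobody else holds a reference.</p>

```rust
use std::rc::Rc;

/// Moving out of a `Box<T>` is just a deref: the box owns its contents outright.
fn unbox(b: Box<String>) -> String {
    *b
}

/// An `Rc<T>` may be shared, so recovering the `T` can fail at runtime.
fn unwrap_rc(r: Rc<String>) -> Option<String> {
    Rc::try_unwrap(r).ok()
}
```

<p>Calling <code>unwrap_rc</code> on a handle that still has a live clone returns <code>None</code>, which is exactly the kind of case that keeps <code>Rc&lt;T&gt;</code> from being a drop-in replacement for <code>T</code>.</p>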
<p>For myself, I think we should introduce <code>box</code> now but plan to generalize this concept to other pointers later. For example I&rsquo;d like to be able to do something like this&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[indirect(std::sync::Arc)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">counter</span>: <span class="nc">AtomicU32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;where the type <code>Arc</code> would implement some trait to permit allocating, deref&rsquo;ing, and so forth:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">SmartPointer</span>: <span class="nc">Deref</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">alloc</span><span class="p">(</span><span class="n">data</span>: <span class="nc">Self</span>::<span class="n">Target</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The original plan for <code>box</code> was that it would be somehow type overloaded. I&rsquo;ve soured on this for two reasons. First, type overloads make inference more painful and I think are generally not great for the user experience; I think they are also confusing for new users. Second, I think we missed the boat on naming. Maybe if we had called <code>Rc</code> something like <code>RcBox&lt;T&gt;</code> the idea of &ldquo;box&rdquo; as a general name would have percolated into Rust users&rsquo; consciousness, but we didn&rsquo;t, and it hasn&rsquo;t. I think the <code>box</code> keyword <em>now</em> ought to be very targeted to the <code>Box</code> type.</p>
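<p>For what it&rsquo;s worth, the <code>SmartPointer</code> trait sketched above can already be written and implemented in today&rsquo;s Rust. The version below is a minimal sketch, not a proposal for the real design: the <code>Self::Target: Sized</code> bound and the generic <code>wrap</code> helper are my own additions for illustration.</p>

```rust
use std::ops::Deref;
use std::rc::Rc;
use std::sync::Arc;

trait SmartPointer: Deref {
    // Assumption: only sized targets can be allocated by value.
    fn alloc(data: Self::Target) -> Self
    where
        Self::Target: Sized;
}

impl<T> SmartPointer for Box<T> {
    fn alloc(data: T) -> Self {
        Box::new(data)
    }
}

impl<T> SmartPointer for Rc<T> {
    fn alloc(data: T) -> Self {
        Rc::new(data)
    }
}

impl<T> SmartPointer for Arc<T> {
    fn alloc(data: T) -> Self {
        Arc::new(data)
    }
}

// Hypothetical helper: the pointer type is chosen by the caller's annotation.
fn wrap<P>(data: P::Target) -> P
where
    P: SmartPointer,
    P::Target: Sized,
{
    P::alloc(data)
}
```

<p>Note that <code>wrap</code> picks the pointer type purely from the expected type, e.g. <code>let a: Arc&lt;u32&gt; = wrap(5);</code>. That is a small taste of the inference-driven overloading I would rather not build into the <code>box</code> keyword itself.</p>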
<h3 id="how-does-this-fit-with-the-soul-of-rust">How does this fit with the &ldquo;soul of Rust&rdquo;?</h3>
<p>In my soul of Rust blog post, I talked about the idea that one of the things that make Rust <em>Rust</em> is having allocation be relatively explicit. I&rsquo;m of mixed minds about this, to be honest, but I do think there&rsquo;s value in having a property similar to <code>unsafe</code> &ndash; like, if allocation is happening, there&rsquo;ll be a sign somewhere you can find. What I like about most of these <code>box</code> proposals is that they move the <code>box</code> keyword to the <em>declaration</em> &ndash; e.g., on the struct/enum/etc &ndash; rather than the <em>use</em>. I think this is the right place for it. The major exception, of course, is the &ldquo;marquee proposal&rdquo;, invoking async fns in dyn trait. That&rsquo;s not amazing. But then&hellip; see the next question for some early thoughts.</p>
<h3 id="if-traits-dont-have-to-be-dyn-compatible-can-we-make-dyn-compatibility-opt-in">If traits don&rsquo;t have to be dyn compatible, can we make dyn compatibility opt in?</h3>
<p>The way that Rust today detects automatically whether traits should be dyn compatible versus having it be declared is, I think, not great. It creates confusion for users and also permits quiet semver violations, where a new defaulted method makes a trait no longer be dyn compatible. It has also been a source of a lot of soundness bugs over time.</p>
<p>I want to move us towards a place where traits are <em>not</em> dyn compatible by default, meaning that <code>dyn Trait</code> does not implement <code>Trait</code>. We would always allow <code>dyn Trait</code> types and we would allow individual items to be invoked so long as the item itself is dyn compatible.</p>
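<p>The closest approximation in today&rsquo;s Rust is to opt individual methods out with <code>where Self: Sized</code>. The sketch below uses a hypothetical <code>Shape</code> trait (all names here are made up). The generic method would normally make the trait not dyn compatible, but with the bound it is simply unavailable through <code>dyn Shape</code> while <code>area</code> still dispatches through the vtable.</p>

```rust
trait Shape {
    // Dyn compatible: callable through `dyn Shape`.
    fn area(&self) -> f64;

    // Generic, so not dyn compatible on its own; `Self: Sized` keeps it out of the vtable.
    fn scaled_area<N: Into<f64>>(&self, factor: N) -> f64
    where
        Self: Sized,
    {
        self.area() * factor.into()
    }
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

// Only the dyn-compatible item is used here.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

<p>The difference under my proposal is that this per-item behavior would be the default, rather than something you have to remember to annotate method by method.</p>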
<p>If you want to have <code>dyn Trait</code> implement <code>Trait</code>, you should declare it, perhaps with a <code>dyn</code> keyword:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">dyn</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This declaration would add various default impls. This would start with the <code>dyn Foo: Foo</code> impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="cm">/*[1]*/</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Foo</span><span class="o">&gt;</span>::<span class="n">method</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="c1">// vtable dispatch
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// [1] actually it would want to cover `dyn Foo + Send` etc too, but I&#39;m ignoring that for now
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But also, if the methods have suitable signatures, it would include some of the impls you <em>really ought</em> to have to make a trait that is well-behaved with respect to dyn trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In fact, if you add in the ability to declare a trait as <code>box</code>, things get very interesting:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">box</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Signal</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">signal</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I&rsquo;m not 100% sure how this should work but what I imagine is that <code>dyn Signal</code> would be pointer-sized and implicitly contain a <code>Box</code> behind the scenes. It would probably automatically <code>Box</code> the results from <code>async fn</code> when invoked through <code>dyn Trait</code>, so something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">Signal</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">bar</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Signal</span><span class="o">&gt;</span>::<span class="n">signal</span><span class="p">(</span><span class="bp">self</span><span class="p">).</span><span class="k">box</span><span class="p">.</span><span class="k">await</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I didn&rsquo;t include this in the main blog post but I think together these ideas would go a long way towards addressing the usability gaps that plague <code>dyn Trait</code> today.</p>
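<p>To make the auto-boxing idea concrete, here is a minimal sketch of roughly what you have to write by hand in today&rsquo;s Rust to get a <code>dyn</code>-compatible <code>Signal</code>. This is an illustration, not the proposed desugaring: the <code>Counter</code> type, its <code>hits</code> field, and the tiny <code>block_on</code> executor are assumptions added only so the example runs without an async runtime.</p>

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written stand-in for a boxed-dispatch `Signal`: instead of `async fn`,
/// the method returns a boxed, type-erased future.
trait Signal {
    fn signal(&self) -> Pin<Box<dyn Future<Output = ()> + '_>>;
}

/// Hypothetical implementor that counts how often it was signaled.
struct Counter {
    hits: Cell<u32>,
}

impl Signal for Counter {
    fn signal(&self) -> Pin<Box<dyn Future<Output = ()> + '_>> {
        // The async block has an anonymous type; boxing erases it so that
        // every implementation returns the same pointer-sized type.
        Box::pin(async move {
            self.hits.set(self.hits.get() + 1);
        })
    }
}

/// Waker that does nothing; sufficient for futures that complete immediately.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor that polls a single future to completion.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Virtual dispatch: the caller knows nothing about the concrete future type.
fn signal_via_dyn(s: &dyn Signal) {
    block_on(s.signal());
}

fn main() {
    let counter = Counter { hits: Cell::new(0) };
    signal_via_dyn(&counter);
    println!("hits = {}", counter.hits.get());
}
```

<p>The point of a <code>box</code> trait would be that the compiler writes the <code>Box::pin</code> part for you, so the trait itself could keep using plain <code>async fn</code>.</p>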
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Side note, one interesting thing about Rust&rsquo;s async functions is that their size must be known at compile time, so we can&rsquo;t permit alloca-like stack allocation.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The box keyword is in fact reserved already, but it&rsquo;s never been used in stable Rust.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Hat tip to Michael Goulet (compiler-errors) for pointing out to me that we can model the virtual dispatch as inherent methods on <code>dyn Trait</code> types. Before I thought we&rsquo;d have to make a more invasive addition to MIR, which I wasn&rsquo;t excited about since it suggested the change was more far-reaching.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>In the future, I think we can expand this definition to include some limited functions that use <code>impl Trait</code> in argument position, but that&rsquo;s for a future blog post.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>I&rsquo;ve noticed that many times when I favor a limited version of something to achieve some aesthetic principle I wind up regretting it.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>At least, it is not <code>dyn</code> compatible under today&rsquo;s rules. Conceivably it could be made to work but more on that later.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>This part of the change is similar to what was proposed in <a href="https://rust-lang.github.io/rfcs/2027-object_safe_for_dispatch.html?highlight=safety#">RFC #2027</a>, though that RFC was quite light on details (the requirements for RFCs in terms of precision have gone up over the years and I expect we wouldn&rsquo;t accept that RFC today in its current form).&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>I actually want to change this last clause in a future edition. Instead of having dyn compatibility be determined automatically, traits would declare themselves dyn compatible, which would also come with a host of other impls. But that&rsquo;s worth a separate post all on its own.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>If you <a href="https://play.rust-lang.org/?version=stable&amp;mode=release&amp;edition=2024&amp;gist=bf0b4ee4cbb13b02efc83455128110da">play with this on the playground</a>, you&rsquo;ll see that the memcpy appears in the debug build but gets optimized away in this very simple case, but that can be hard for LLVM to do, since it requires reordering an allocation of the box to occur earlier and so forth. The <code>.box</code> operator could be guaranteed to work.&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>I think it would be cool to also have some kind of unsafe intrinsic that permits calling the function with other storage strategies, e.g., allocating a known amount of stack space or what have you.&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p>We would thus <em>finally</em> bring Rust enums to &ldquo;feature parity&rdquo; with OO classes! I wrote a <a href="https://smallcultfollowing.com/babysteps/blog/2015/05/29/classes-strike-back/">blog post, &ldquo;Classes strike back&rdquo;, on this topic</a> back in 2015 (!) as part of the whole &ldquo;virtual structs&rdquo; era of Rust design. Deep cut!&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Rust in 2025: Language interop and the extensible compiler</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/03/18/lang-interop-extensibility/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/03/18/lang-interop-extensibility/</id><published>2025-03-18T00:00:00+00:00</published><updated>2025-03-18T15:34:25+00:00</updated><content type="html"><![CDATA[<p>For many years, C has effectively been the &ldquo;lingua franca&rdquo; of the computing world. It&rsquo;s pretty hard to combine code from two different programming languages in the same process&ndash;unless one of them is C. The same could theoretically be true for Rust, but in practice there are a number of obstacles that make that harder than it needs to be. Building out <strong>silky smooth language interop</strong> should be a core goal of helping Rust to target <a href="https://smallcultfollowing.com/babysteps/blog/2025/03/10/rust-2025-intro/">foundational applications</a>. I think the right way to do this is not by extending rustc with knowledge of other programming languages but rather by building on Rust&rsquo;s core premise of being an extensible language. By investing in building out an <strong>&ldquo;extensible compiler&rdquo;</strong> we can allow crate authors to create a plethora of ergonomic, efficient bridges between Rust and other languages.</p>
<h2 id="well-know-weve-succeeded-when">We&rsquo;ll know we&rsquo;ve succeeded when&hellip;</h2>
<p>When it comes to interop&hellip;</p>
<ul>
<li>It is easy to create a Rust crate that can be invoked from other languages and across multiple environments (desktop, Android, iOS, etc). Rust tooling covers the full story from writing the code to publishing your library.</li>
<li>It is easy<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> to carve out parts of an existing codebase and replace them with Rust. It is <em>particularly</em> easy to integrate Rust into C/C++ codebases.</li>
</ul>
<p>When it comes to extensibility&hellip;</p>
<ul>
<li>Rust is host to a wide variety of extensions ranging from custom lints and diagnostics (&ldquo;clippy as a regular library&rdquo;) to integration and interop (ORMs, languages) to static analysis and automated reasoning.</li>
</ul>
<h2 id="lang-interop-the-least-common-denominator-use-case">Lang interop: the <em>least common denominator</em> use case</h2>
<p>In my head, I divide language interop into two core use cases. The first is what I call <strong>Least Common Denominator</strong> (LCD), where people would like to write one piece of code and then use it in a wide variety of environments. This might mean authoring a core SDK that can be invoked from many languages but it also covers writing a codebase that can be used from both Kotlin (Android) and Swift (iOS) or having a single piece of code usable for everything from servers to embedded systems. It might also be creating <a href="https://bytecodealliance.org/">WebAssembly components</a> for use in browsers or on edge providers.</p>
<p>What distinguishes the LCD use-case is two things. First, it is primarily unidirectional&mdash;calls mostly go <em>from</em> the other language <em>to</em> Rust. Second, you don&rsquo;t have to handle all of Rust. You really want to expose an API that is &ldquo;simple enough&rdquo; that it can be expressed reasonably idiomatically from many other languages. Examples of libraries supporting this use case today are <a href="https://mozilla.github.io/uniffi-rs/latest/">uniffi</a> and <a href="https://rust-diplomat.github.io/book/">diplomat</a>. This problem is not new, it&rsquo;s the same basic use case that <a href="https://component-model.bytecodealliance.org/">WebAssembly components</a> are targeting as well as old school things like <a href="https://en.wikipedia.org/wiki/Component_Object_Model">COM</a> and <a href="https://en.wikipedia.org/wiki/Common_Object_Request_Broker_Architecture">CORBA</a> (in my view, though, each of those solutions is a bit too narrow for what we need).</p>
<p>When you dig in, the requirements for LCD get a bit more complicated. You want to start with simple types, yes, but you quickly get people asking for the ability to make the generated wrapper from a given language more idiomatic. And you want to focus on calls <em>into</em> Rust, but you also need to support callbacks. In fact, to really integrate with other systems, you need generic facilities for things like logs, metrics, and I/O that can be mapped in different ways. For example, in a mobile environment, you don&rsquo;t necessarily want to use tokio to do an outgoing networking request. It is better to use the system libraries since they have special cases to account for the quirks of radio-based communication.</p>
<p>To really crack the LCD problem, you also have to solve a few other problems too:</p>
<ul>
<li>It needs to be easy to package up Rust code and upload it into the appropriate package managers for other languages. Think of a tool like <a href="https://github.com/PyO3/maturin">maturin</a>, which lets you bundle up Rust binaries as Python packages.</li>
<li>For some use cases, <strong>download size</strong> is a very important constraint. To start, optimizing for size is hard right now. What&rsquo;s worse, your binary has to include code from the standard library, since we can&rsquo;t expect to find it on the device&mdash;and even if we could, we couldn&rsquo;t be sure it was ABI compatible with the one you built your code with.</li>
</ul>
<h2 id="needed-the-serde-of-language-interop">Needed: the &ldquo;serde&rdquo; of language interop</h2>
<p>Obviously, there&rsquo;s enough here to keep us going for a long time. I think the place to start is building out something akin to the &ldquo;serde&rdquo; of language interop: the <a href="https://crates.io/crates/serde">serde</a> package itself just defines the core trait for serialization and a derive. All of the format-specific details are factored out into other crates defined by a variety of people.</p>
<p>I&rsquo;d like to see a universal set of conventions for defining the &ldquo;generic API&rdquo; that your Rust code follows and then a tool that extracts these conventions and hands them off to a backend to do the actual language specific work. It&rsquo;s not essential, but I think this core dispatching tool should live in the rust-lang org. All the language-specific details, on the other hand, would live in crates.io as crates that can be created by anyone.</p>
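<p>As a rough sketch of that split, the core could define a language-neutral description of the API plus one trait, with each target language supplied by a separate backend. Everything named here (<code>ApiFunction</code>, <code>ApiType</code>, <code>Backend</code>, and the two toy backends) is hypothetical and invented for illustration; it is not the design of uniffi, diplomat, or any existing tool.</p>

```rust
/// Language-neutral types the "generic API" is allowed to mention (hypothetical).
#[derive(Clone, Copy)]
enum ApiType {
    I32,
    Bool,
}

/// Language-neutral description of one exported Rust function (hypothetical).
struct ApiFunction {
    name: &'static str,
    params: Vec<(&'static str, ApiType)>,
    ret: ApiType,
}

/// The core trait: analogous in role to serde's format-independent traits.
/// Backends would live in separate crates written by anyone.
trait Backend {
    fn type_name(&self, ty: ApiType) -> &'static str;
    fn emit(&self, func: &ApiFunction) -> String;
}

/// Shared helper: render `name: Type` pairs using the backend's type names.
fn param_list(backend: &dyn Backend, func: &ApiFunction) -> String {
    func.params
        .iter()
        .map(|(name, ty)| format!("{name}: {}", backend.type_name(*ty)))
        .collect::<Vec<_>>()
        .join(", ")
}

/// Toy backend that emits a Kotlin-style declaration.
struct Kotlin;

impl Backend for Kotlin {
    fn type_name(&self, ty: ApiType) -> &'static str {
        match ty {
            ApiType::I32 => "Int",
            ApiType::Bool => "Boolean",
        }
    }

    fn emit(&self, func: &ApiFunction) -> String {
        let ret = self.type_name(func.ret);
        format!("external fun {}({}): {}", func.name, param_list(self, func), ret)
    }
}

/// Toy backend that emits a Swift-style declaration.
struct Swift;

impl Backend for Swift {
    fn type_name(&self, ty: ApiType) -> &'static str {
        match ty {
            ApiType::I32 => "Int32",
            ApiType::Bool => "Bool",
        }
    }

    fn emit(&self, func: &ApiFunction) -> String {
        let ret = self.type_name(func.ret);
        format!("func {}({}) -> {}", func.name, param_list(self, func), ret)
    }
}

fn main() {
    let is_even = ApiFunction {
        name: "is_even",
        params: vec![("n", ApiType::I32)],
        ret: ApiType::Bool,
    };
    // The dispatching tool only knows about `Backend`, never about a language.
    let backends: [&dyn Backend; 2] = [&Kotlin, &Swift];
    for backend in backends {
        println!("{}", backend.emit(&is_even));
    }
}
```

<p>A real tool would have to cover far more (callbacks, errors, ownership, packaging), but the division of labor is the same: one shared description, many independently authored backends.</p>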
<h2 id="lang-interop-the-deep-interop-use-case">Lang interop: the &ldquo;deep interop&rdquo; use case</h2>
<p>The second use case is what I call the <strong>deep interop</strong> problem. For this use case, people want to be able to go deep in a particular language. Often this is because their Rust program needs to invoke APIs implemented in that other language, but it can also be that they want to stub out some part of that other program and replace it with Rust. One common example that requires deep interop is embedded developers looking to invoke gnarly C/C++ header files supplied by vendors. Deep interop also arises when you have an older codebase, such as the Rust for Linux project attempting to integrate Rust into their kernel or companies looking to integrate Rust into their existing codebases, most commonly C++ or Java.</p>
<p>Some of the existing deep interop crates focus specifically on the use case of invoking APIs from the other language (e.g., <a href="https://github.com/rust-lang/rust-bindgen">bindgen</a> and <a href="https://duchess-rs.github.io/duchess/">duchess</a>) but most wind up supporting bidirectional interaction (e.g., <a href="https://pyo3.rs/v0.23.5/">pyo3</a>, npapi-rs, and <a href="https://neon-rs.dev">neon</a>). One interesting example is <a href="https://cxx.rs">cxx</a>, which supports bidirectional Rust-C++ interop, but does so in a rather opinionated way, encouraging you to make use of a subset of C++&rsquo;s features that can be readily mapped (in this way, it&rsquo;s a bit of a hybrid of LCD and deep interop).</p>
<h2 id="interop-with-all-languages-is-important-c-and-c-are-just-more-so">Interop with all languages is important. C and C++ are just more so.</h2>
<p>I want to see smooth interop with all languages, but C and C++ are particularly important. This is because they have historically been the language of choice for foundational applications, and hence there is a lot of code that we need to integrate with. Integration with C today in Rust is, in my view, &ldquo;ok&rdquo; &ndash; most of what you need is there, but it&rsquo;s not as nicely integrated into the compiler or as accessible as it should be. Integration with C++ is a huge problem. I&rsquo;m happy to see the Foundation&rsquo;s <a href="https://rustfoundation.org/interop-initiative/">Rust-C++ Interoperability Initiative</a> as well as projects like Google&rsquo;s <a href="https://github.com/google/crubit">crubit</a> and of course the venerable <a href="https://github.com/dtolnay/cxx">cxx</a>.</p>
<h2 id="needed-the-extensible-compiler">Needed: &ldquo;the extensible compiler&rdquo;</h2>
<p>The traditional way to enable seamless interop with another language is to &ldquo;bake it in&rdquo; i.e., Kotlin has very smooth support for invoking Java code and Swift/Zig can natively build C and C++. I would prefer for Rust to take a different path, one I call <strong>the extensible compiler</strong>. The idea is to enable interop via, effectively, supercharged procedural macros that can integrate with the compiler to supply type information, generate shims and glue code, and generally manage the details of making Rust &ldquo;play nicely&rdquo; with another language.</p>
<p>In some sense, this is the same thing we do today. All the crates I mentioned above leverage procedural macros and custom derives to do their job. But procedural macros today are the &ldquo;simplest thing that could possibly work&rdquo;: tokens in, tokens out. Considering how simplistic they are, they&rsquo;ve gotten us remarkably far, but they also have distinct limitations. Error messages generated by the compiler are not expressed in terms of the macro input but rather the Rust code that gets generated, which can be really confusing; macros are not able to access type information or communicate information between macro invocations; macros cannot generate code on demand, as it is needed, which means that we spend time compiling code we might not need but also that we cannot integrate with monomorphization. And so forth.</p>
<p>I think we should integrate procedural macros more deeply into the compiler.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> I&rsquo;d like macros that can inspect types, that can generate code in response to monomorphization, that can influence diagnostics<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> and lints, and maybe even customize things like method dispatch rules. That will allow all people to author crates that provide awesome interop with all those languages, but it will also help people write crates for all kinds of other things. To get a sense for what I&rsquo;m talking about, check out <a href="https://learn.microsoft.com/en-us/dotnet/fsharp/tutorials/type-providers/">F#&rsquo;s type providers</a> and what they can do.</p>
<p>The challenge here will be figuring out how to keep the stabilization surface area as small as possible. Whenever possible I would look for ways to have macros communicate by generating ordinary Rust code, perhaps with some small tweaks. Imagine macros that generate things like a &ldquo;virtual function&rdquo;, that has an ordinary Rust signature but where the body for a particular instance is constructed by a callback into the procedural macro during monomorphization. And what format should that body take? Ideally, it&rsquo;d just be Rust code, so as to avoid introducing any new surface area.</p>
<h2 id="not-needed-the-rust-evangelism-task-force">Not needed: the Rust Evangelism Task Force</h2>
<p>So, it turns out I&rsquo;m a big fan of Rust. And, I ain&rsquo;t gonna lie, when I see a prominent project pick some other language, at least in a scenario where Rust would&rsquo;ve done equally well, it makes me sad. And yet I also know that if <em>every</em> project were written in Rust, that would be <strong>so sad</strong>. I mean, who would we steal good ideas from?</p>
<p>I really like the idea of focusing our attention on <em>making Rust work well with other languages</em>, not on convincing people Rust is better <sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. The easier it is to add Rust to a project, the more people will try it &ndash; and if Rust is truly a better fit for them, they&rsquo;ll use it more and more.</p>
<h2 id="conclusion-next-steps">Conclusion: next steps</h2>
<p>This post pitched out a north star where</p>
<ul>
<li>a single Rust library can be easily used across many languages and environments;</li>
<li>Rust code can easily call and be called by functions in other languages;</li>
<li>this is all implemented atop a rich procedural macro mechanism that lets plugins inspect type information, generate code on demand, and so forth.</li>
</ul>
<p>How do we get there? I think there are some concrete next steps:</p>
<ul>
<li>Build out, adopt, or extend an easy system for producing &ldquo;least common denominator&rdquo; components that can be embedded in many contexts.</li>
<li>Support the C++ interop initiatives at the Foundation and elsewhere. The wheels are turning: tmandry is the point-of-contact for the <a href="https://rust-lang.github.io/rust-project-goals/2025h1/seamless-rust-cpp.html">project goal</a> for that, and we recently held our <a href="https://hackmd.io/@rust-lang-team/rJvv36hq1e">first lang-team design meeting on the topic</a> (this document is a great read, highly recommended!).</li>
<li>Look for ways to extend proc macro capabilities and explore what it would take to invoke them from other phases of the compiler besides just the very beginning.
<ul>
<li>An aside: I also think we should extend rustc to support compiling proc macros to web-assembly and use that by default. That would allow for strong sandboxing and deterministic execution and also easier caching to support faster build times.</li>
</ul>
</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Well, as easy as it can be.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Rust&rsquo;s incremental compilation system is pretty well suited to this vision. It works by executing an arbitrary function and then recording what bits of the program state that function looks at. The next time we run the compiler, we can see if those bits of state have changed to avoid re-running the function. The interesting thing is that this function could as well be part of a procedural macro, it doesn&rsquo;t have to be built-in to the compiler.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Stuff like the <a href="https://doc.rust-lang.org/reference/attributes/diagnostics.html#the-diagnostic-tool-attribute-namespace"><code>diagnostics</code> tool attribute namespace</a> is super cool! More of this!&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I&rsquo;ve always been fond of this article <a href="https://thenewstack.io/rust-vs-go-why-theyre-better-together/">Rust vs Go, &ldquo;Why they&rsquo;re better together&rdquo;</a>.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/rust-in-2025" term="rust-in-2025" label="Rust in 2025"/></entry><entry><title type="html">Rust in 2025: Targeting foundational software</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/03/10/rust-2025-intro/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/03/10/rust-2025-intro/</id><published>2025-03-10T00:00:00+00:00</published><updated>2025-03-10T13:33:43+00:00</updated><content type="html"><![CDATA[<p>Rust turns 10 this year. It&rsquo;s a good time to take a look at where we are and where I think we need to be going. This post is the first in a series I&rsquo;m calling &ldquo;Rust in 2025&rdquo;. This first post describes my general vision for how Rust fits into the computing landscape. The remaining posts will outline major focus areas that I think are needed to make this vision come to pass. Oh, and fair warning, I&rsquo;m expecting some controversy along the way&mdash;at least I hope so, since otherwise I&rsquo;m just repeating things everyone knows.</p>
<h2 id="my-vision-for-rust-foundational-software">My vision for Rust: foundational software</h2>
<p>I see Rust&rsquo;s mission as making it dramatically more accessible to author and maintain <em>foundational</em> software. By foundational I mean <em>the software that underlies everything else</em>. You can already see this in the areas where Rust is highly successful: CLI and development tools that everybody uses to do their work and which are often embedded into other tools<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>; cloud platforms that people use to run their applications<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>; embedded devices that are in the things <a href="https://docs.rust-embedded.org">around</a> (and <a href="https://www.youtube.com/watch?v=O09rje6yC90&amp;list=TLPQMjUxMDIwMjR6gKXQdU9PnA&amp;index=4">above</a>) us; and, increasingly, the kernels that run everything else (both <a href="https://www.theregister.com/2023/04/27/microsoft_windows_rust/">Windows</a> and <a href="https://rust-for-linux.com">Linux</a>!).</p>
<h3 id="foundational-software-needs-performance-reliabilityand-productivity">Foundational software needs performance, reliability&mdash;and productivity</h3>
<p>The needs of foundational software have a lot in common with all software, but everything is extra important. Reliability is paramount, because when the foundations fail, everything on top fails also. Performance overhead is to be avoided because it becomes a floor on the performance achievable by the layers above you.</p>
<p>Traditionally, achieving the extra-strong requirements of foundational software has meant that you can&rsquo;t do it with &ldquo;normal&rdquo; code. You had two choices. You could use C or C++<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, which give great power but demand perfection in response<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. Or, you could use a higher-level language like Java or Go, but in a very particular way designed to keep performance high. You have to avoid abstractions and conveniences and minimize allocations so as not to trigger the garbage collector.</p>
<p>Rust changed the balance by combining C++&rsquo;s innovations in zero-cost abstractions with a type system that can guarantee memory safety. The result is a pretty cool tool, one that (often, at least) lets you write high-level code with low-level performance and without fear of memory safety errors.</p>
<h3 id="empowerment-and-lowering-the-barrier-to-entry">Empowerment and lowering the barrier to entry</h3>
<p>In my Rust talks, I often say that type systems and static checks sound to most developers like &ldquo;spinach&rdquo;, something their parents forced them to eat because it was &ldquo;good for them&rdquo;, but not something anybody wants. The truth is that type systems <em>are</em> like spinach&mdash;Popeye spinach. Having a type system to structure your thinking makes you more effective, regardless of your experience level. If you are a beginner, learning the type system helps you learn how to structure software for success. If you are an expert, the type system helps you create structures that will catch your mistakes faster (as well as those of your less experienced colleagues). Yehuda Katz sometimes says, &ldquo;When I&rsquo;m feeling alert, I build abstractions that will help tired Yehuda be more effective&rdquo;, which I&rsquo;ve always thought was a great way of putting it.</p>
<h3 id="what-about-non-foundational-software">What about non-foundational software?</h3>
<p>When I say that Rust&rsquo;s mission is to target foundational software, I don&rsquo;t mean that&rsquo;s all it&rsquo;s good for. Projects like <a href="https://dioxuslabs.com">Dioxus</a>, <a href="https://v2.tauri.app">Tauri</a>, and <a href="https://leptos.dev">Leptos</a> are doing fascinating, pioneering work pushing the boundaries of Rust into higher-level applications like GUIs and Webpages. I don&rsquo;t believe this kind of high-level development will ever be Rust&rsquo;s <em>sweet spot</em>. But that doesn&rsquo;t mean I think we should ignore them&mdash;in fact, quite the opposite.</p>
<h3 id="stretch-goals-are-how-you-grow">Stretch goals are how you grow</h3>
<p>The traditional thinking goes that, because foundational software often needs control over low-level details, it&rsquo;s not as important to focus on accessibility and <a href="https://blog.rust-lang.org/2017/03/02/lang-ergonomics.html">ergonomics</a>. In my view, though, the fact that foundational software needs control over low-level details only makes it <strong>more</strong> important to try and achieve good ergonomics. Anything you can do to help the developer focus on the details that matter most will make them more productive.</p>
<p>I think projects that stretch Rust to higher-level areas, like <a href="https://dioxuslabs.com">Dioxus</a>, <a href="https://v2.tauri.app">Tauri</a>, and <a href="https://leptos.dev">Leptos</a>, are a great way to identify opportunities to make Rust programming more convenient. These opportunities then trickle down to make Rust easier to use for everyone. The trick is to avoid losing the control and reliability that foundational applications need along the way (and it ain&rsquo;t always easy).</p>
<h3 id="cover-the-whole-stack">Cover the whole stack</h3>
<p>There&rsquo;s another reason to make sure that higher-level applications are pleasant in Rust: it means that people can build their entire stack using one technology. I&rsquo;ve talked to a number of people who expected just to use Rust for one thing, say a <a href="https://discord.com/blog/why-discord-is-switching-from-go-to-rust">tail-latency-sensitive data plane service</a>, but they wound up using it for everything. Why? Because it turned out that, once they learned it, Rust was quite productive and using one language meant they could share libraries and support code. Put another way, simple code is simple no matter what language you build it in.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
<h3 id="smooth-iterative-deepening">&ldquo;Smooth, iterative deepening&rdquo;</h3>
<p>The other lesson I&rsquo;ve learned is that you want to enable what I think of as <em>smooth, iterative deepening</em>. This rather odd phrase is the one that always comes to my mind, somehow. The idea is that a user&rsquo;s first experience should be <em>simple</em>&ndash;they should be able to get up and going quickly. As they get further into their project, the user will find places where it&rsquo;s not doing what they want, and they&rsquo;ll need to take control. They should be able to do this in a localized way, changing one part of their project without disturbing everything else.</p>
<p>Smooth, iterative deepening sounds easy but is in fact very hard. Many projects fail either because the initial experience is hard or because the step from simple-to-control is in fact more like scaling a cliff, requiring users to learn a <em>lot</em> of background material. Rust certainly doesn&rsquo;t always succeed&ndash;but we succeed enough, and I like to think we&rsquo;re always working to do better.</p>
<h3 id="whats-to-come">What&rsquo;s to come</h3>
<p>This is the first post of the series. My current plan<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> is to post four follow-ups that cover what I see as the core investments we need to make to improve Rust&rsquo;s fit for foundational software. In my mind, the first three talk about how we should double down on some of Rust&rsquo;s core values:</p>
<ol>
<li>achieving <em>smooth language interop</em> by doubling down on <em>extensibility</em>;</li>
<li>extending the type system to achieve <em>clarity of purpose</em>;</li>
<li><em>leveling up the Rust ecosystem</em> by building out better guidelines, tools, and leveraging the Rust Foundation.</li>
</ol>
<p>After that, I&rsquo;ll talk about the Rust open-source organization and what I think we should be doing there to make contributing to and maintaining Rust as accessible and, dare I say it, joyful as we can.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Plenty of people use ripgrep, but did you know that when you do full text search in VSCode, you are <a href="https://github.com/microsoft/vscode-ripgrep">also using ripgrep</a>? And of course <a href="https://deno.com/">Deno</a> makes heavy use of Rust, as does a lot of Python tooling, like the <a href="https://github.com/astral-sh/uv">uv</a> package manager. The list goes on and on.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>What do AWS, Azure, CloudFlare, and Fastly all have in common? They&rsquo;re all big Rust users.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Rod Chapman tells me I should include Ada. He&rsquo;s not wrong, particularly if you are able to use SPARK to prove strong memory safety (and stronger properties, like panic freedom or even functional correctness). But Ada&rsquo;s never really caught on broadly, although it&rsquo;s very successful in certain spaces.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Alas, we are but human.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Well, that&rsquo;s true <em>if</em> the language meets a certain base bar. I&rsquo;d say that even &ldquo;simple&rdquo; code in C isn&rsquo;t all that simple, given that you don&rsquo;t even have basic types like vectors and hashmaps available.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>I reserve the right to change it as I go!&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/rust-in-2025" term="rust-in-2025" label="Rust in 2025"/></entry><entry><title type="html">View types redux and abstract fields</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/02/25/view-types-redux/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/02/25/view-types-redux/</id><published>2025-02-25T00:00:00+00:00</published><updated>2025-02-25T16:04:46+00:00</updated><content type="html"><![CDATA[<p>A few years back I proposed <a href="https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/">view types</a> as an extension to Rust’s type system to let us address the problem of (false) inter-procedural borrow conflicts. The basic idea is to introduce a “view type” <code>{f1, f2} Type</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, meaning “an instance of <code>Type</code> where you can only access the fields <code>f1</code> or <code>f2</code>”. The main purpose is to let you write function signatures like <code>&amp; {f1, f2} self</code> or <code>&amp;mut {f1, f2} self</code> that define what fields a given method might access. I was thinking about this idea again and I wanted to try and explore it a bit more deeply, to see how it could actually work, and to address the common question of how to have places in types without exposing the names of private fields.</p>
<h2 id="example-the-data-type">Example: the <code>Data</code> type</h2>
<p>The <code>Data</code> type is going to be our running example. The <code>Data</code> type collects experiments, each of which has a name and a set of <code>f32</code> values. In addition to the experimental data, it has a counter, <code>successful</code>, which indicates how many measurements were successful.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">experiments</span>: <span class="nc">HashMap</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">successful</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>There are some helper functions you can use to iterate over the list of experiments and read their data. All of these return data borrowed from self. Today in Rust I would typically leverage lifetime elision, where the <code>&amp;</code> in the return type is automatically linked to the <code>&amp;self</code> argument:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">experiment_names</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="bp">self</span><span class="p">.</span><span class="n">experiments</span><span class="p">.</span><span class="n">keys</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">for_experiment</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">experiment</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="p">[</span><span class="kt">f32</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
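</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="bp">self</span><span class="p">.</span><span class="n">experiments</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">experiment</span><span class="p">).</span><span class="n">map</span><span class="p">(</span><span class="nb">Vec</span>::<span class="n">as_slice</span><span class="p">).</span><span class="n">unwrap_or</span><span class="p">(</span><span class="o">&amp;</span><span class="p">[])</span><span class="w">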
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="tracking-successful-experiments">Tracking successful experiments</h2>
<p>Now imagine that <code>Data</code> has methods for reading and modifying the counter of successful experiments:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">successful</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">successful</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">add_successful</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">successful</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="today-aggregate-types-like-data-present-a-composition-hazard">Today, “aggregate” types like Data present a composition hazard</h2>
<p>The <code>Data</code> type as presented thus far is pretty sensible, but it can actually be a pain to use. Suppose you wanted to iterate over the experiments, analyze their data, and adjust the successful counter as a result. You might try writing the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">count_successful_experiments</span><span class="p">(</span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Data</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">experiment_names</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">is_successful</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">for_experiment</span><span class="p">(</span><span class="n">n</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">data</span><span class="p">.</span><span class="n">add_successful</span><span class="p">();</span><span class="w"> </span><span class="c1">// ERROR: data is borrowed here
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Experienced Rustaceans are likely shaking their head at this point—in fact, the previous code will not compile. What’s wrong? Well, the problem is that <code>experiment_names</code> returns data borrowed from <code>self</code> which then persists for the duration of the loop. Invoking <code>add_successful</code> then requires an <code>&amp;mut Data</code> argument, which causes a conflict.</p>
<p>The compiler is indeed flagging a reasonable concern here. The risk is that <code>add_successful</code> could mutate the <code>experiments</code> map while <code>experiment_names</code> is still iterating over it. Now, we as code authors know that this is unlikely — but let’s be honest, it may be unlikely <em>now</em>, but it’s not impossible that as <code>Data</code> evolves somebody might add some kind of logic into <code>add_successful</code> that would mutate the <code>experiments</code> map. This is precisely the kind of subtle interdependency that can make an innocuous “but it’s just one line!” PR cause a massive security breach. That’s all well and good, but it’s also very annoying that I can’t write this code.</p>
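<p>For comparison, here is a minimal sketch of one way to sidestep the conflict in today’s Rust: skip the accessor methods and touch the two fields directly inside a single function, where the borrow checker can see that they are disjoint. The body of <code>is_successful</code> and its threshold are assumptions invented for this sketch. The cost is that the caller must be able to name <code>Data</code>’s fields, which is exactly the encapsulation the methods were meant to provide.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::HashMap;

struct Data {
    experiments: HashMap&lt;String, Vec&lt;f32&gt;&gt;,
    successful: u32,
}

// Hypothetical success criterion, invented for this sketch.
fn is_successful(values: &amp;[f32]) -&gt; bool {
    !values.is_empty() &amp;&amp; values.iter().all(|v| *v &gt; 0.5)
}

fn count_successful_experiments(data: &amp;mut Data) {
    // Borrowing the fields directly lets the borrow checker see that the
    // shared borrow of `experiments` and the mutation of `successful`
    // touch disjoint parts of `data`.
    for values in data.experiments.values() {
        if is_successful(values) {
            data.successful += 1;
        }
    }
}
</code></pre>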
<h2 id="using-view-types-to-flag-what-is-happening">Using view types to flag what is happening</h2>
<p>The right fix here is to have a way to express what fields may be accessed in the type system. If we do this, then we can get the code to compile today <em>and</em> prevent future PRs from introducing bugs. This is hard to do with Rust’s current system, though, as types do not have any way of talking about fields, only spans of execution-time (“lifetimes”).</p>
<p>With view types, though, we can change the signature from <code>&amp;self</code> to <code>&amp;{experiments} self</code>. Just as <code>&amp;self</code> is shorthand for <code>self: &amp;Data</code>, this is actually shorthand for <code>self: &amp; {experiments} Data</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">experiment_names</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="o">&amp;</span><span class="w"> </span><span class="p">{</span><span class="n">experiments</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="bp">self</span><span class="p">.</span><span class="n">experiments</span><span class="p">.</span><span class="n">keys</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">for_experiment</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&amp;</span><span class="w"> </span><span class="p">{</span><span class="n">experiments</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">experiment</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="p">[</span><span class="kt">f32</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">experiments</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">experiment</span><span class="p">).</span><span class="n">map</span><span class="p">(</span><span class="nb">Vec</span>::<span class="n">as_slice</span><span class="p">).</span><span class="n">unwrap_or</span><span class="p">(</span><span class="o">&amp;</span><span class="p">[])</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We would also modify the <code>add_successful</code> method to flag what field it needs:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">add_successful</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">{</span><span class="n">successful</span><span class="p">}</span><span class="w"> </span><span class="bp">Self</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="bp">self</span><span class="p">.</span><span class="n">successful</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="getting-a-bit-more-formal">Getting a bit more formal</h2>
<p>The idea of this post was to sketch out how view types could work in a slightly more detailed way. The basic idea is to extend Rust’s type grammar with a new type…</p>
<pre tabindex="0"><code>T = &amp;'a mut? T
  | [T]
  | Struct&lt;...&gt;
  | ...
  | {field-list} T // &lt;-- view types
</code></pre><p>We would also have some kind of expression for defining a view onto a place. This would be a place expression. For now I will write <code>E = {f1, f2} E</code> to define this expression, but that’s obviously ambiguous with Rust blocks. So for example you could write&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="p">(</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="nb">String</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="nb">String</span>::<span class="n">new</span><span class="p">(),</span><span class="w"> </span><span class="nb">String</span>::<span class="n">new</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="p">{</span><span class="mi">0</span><span class="p">}</span><span class="w"> </span><span class="p">(</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="nb">String</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="p">{</span><span class="mi">0</span><span class="p">}</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">{</span><span class="mi">1</span><span class="p">}</span><span class="w"> </span><span class="p">(</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="nb">String</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">{</span><span class="mi">1</span><span class="p">}</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;to get a reference <code>p</code> that can only access the field <code>0</code> of the tuple and a reference <code>q</code> that can only access field <code>1</code>. Note the difference between <code>&amp;{0}x</code>, which creates a reference to the entire tuple but with limited access, and <code>&amp;x.0</code>, which creates a reference to the field itself. Both have their place.</p>
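<p>The second form already works in today’s Rust within a single function body. As a minimal sketch (the function name <code>append_first_to_second</code> is made up for illustration), a shared reference to field <code>0</code> and a mutable reference to field <code>1</code> can be live at the same time because they refer to disjoint places. What today’s Rust cannot express is the first form: a reference to the whole tuple whose <em>type</em> restricts which fields may be used.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Borrow the two tuple fields separately: a shared reference to field 0
// and a mutable reference to field 1 can coexist.
fn append_first_to_second(x: &amp;mut (String, String)) {
    let p: &amp;String = &amp;x.0;
    let q: &amp;mut String = &amp;mut x.1;
    q.push_str(p);
}
</code></pre>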
<h2 id="checking-field-accesses-against-view-types">Checking field accesses against view types</h2>
<p>Consider this function from our example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">add_successful</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">{</span><span class="n">successful</span><span class="p">}</span><span class="w"> </span><span class="bp">Self</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="bp">self</span><span class="p">.</span><span class="n">successful</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>How would we type check the <code>self.successful += 1</code> statement? Today, without view types, typing an expression like <code>self.successful</code> begins by getting the type of <code>self</code>, which is something like <code>&amp;mut Data</code>. We then “auto-deref”, looking for the struct type within. That would bring us to <code>Data</code>, at which point we would check to see if <code>Data</code> defines a field <code>successful</code>.</p>
<p>To integrate view types, we have to track both the type of data being accessed and the set of allowed fields. Initially we have the variable <code>self</code> with type <code>&amp;mut {successful} Data</code> and allow-set <code>*</code>. The deref would bring us to <code>{successful} Data</code> (the allow-set remains <code>*</code>). Traversing a view type modifies the allow-set, so we go from <code>*</code> to <code>{successful}</code> (to be legal, every field in the view must be allowed). We now have the type <code>Data</code>. We would then identify the field <code>successful</code> as both a member of <code>Data</code> and a member of the allow-set, and so this code would type-check.</p>
<p>If however you tried to modify a function to access a field not declared as part of its view, e.g.,</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">add_successful</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">{</span><span class="n">successful</span><span class="p">}</span><span class="w"> </span><span class="bp">Self</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="fm">assert!</span><span class="p">(</span><span class="o">!</span><span class="bp">self</span><span class="p">.</span><span class="n">experiments</span><span class="p">.</span><span class="n">is_empty</span><span class="p">());</span><span class="w"> </span><span class="c1">// &lt;-- modified to include this
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="bp">self</span><span class="p">.</span><span class="n">successful</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>the <code>self.experiments</code> type-checking would now fail, because the field <code>experiments</code> would not be a member of the allow-set.</p>
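<p>The bookkeeping described above can be modeled in a few lines of ordinary Rust. This is only a toy sketch of the idea, not how the compiler would implement it; the names <code>AllowSet</code>, <code>enter_view</code>, and <code>check_field</code> are invented here.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::BTreeSet;

/// The set of fields that may be accessed through a place.
#[derive(Clone, Debug, PartialEq)]
enum AllowSet {
    /// `*`: every field is allowed.
    All,
    /// Only the listed fields are allowed.
    Fields(BTreeSet&lt;&amp;'static str&gt;),
}

impl AllowSet {
    fn allows(&amp;self, field: &amp;str) -&gt; bool {
        match self {
            AllowSet::All =&gt; true,
            AllowSet::Fields(fields) =&gt; fields.contains(field),
        }
    }

    /// Traversing a view type `{view} T` narrows the allow-set to `view`.
    /// It is an error if the view names a field that is not currently allowed.
    fn enter_view(&amp;self, view: &amp;[&amp;'static str]) -&gt; Result&lt;AllowSet, String&gt; {
        for &amp;field in view {
            if !self.allows(field) {
                return Err(format!("field `{field}` is not allowed here"));
            }
        }
        Ok(AllowSet::Fields(view.iter().copied().collect()))
    }
}

/// Checks `place.field`: the field must be declared by the struct and be
/// a member of the allow-set.
fn check_field(struct_fields: &amp;[&amp;str], allow: &amp;AllowSet, field: &amp;str) -&gt; Result&lt;(), String&gt; {
    if !struct_fields.contains(&amp;field) {
        return Err(format!("no field `{field}` on this type"));
    }
    if !allow.allows(field) {
        return Err(format!("field `{field}` is not in the allow-set"));
    }
    Ok(())
}
</code></pre>
<p>Starting from <code>AllowSet::All</code> and entering the view <code>{successful}</code>, a check of <code>successful</code> passes while a check of <code>experiments</code> is rejected, mirroring the two versions of <code>add_successful</code>.</p>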
<h2 id="we-need-to-infer-allow-sets">We need to infer allow sets</h2>
<p>A more interesting problem comes when we type-check a call to <code>add_successful()</code>. We had the following code:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">count_successful_experiments</span><span class="p">(</span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Data</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">experiment_names</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">is_successful</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">for_experiment</span><span class="p">(</span><span class="n">n</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">data</span><span class="p">.</span><span class="n">add_successful</span><span class="p">();</span><span class="w"> </span><span class="c1">// Was error, now ok.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Consider the call to <code>data.experiment_names()</code>. In the compiler today, method lookup begins by examining <code>data</code>, of type <code>&amp;mut Data</code>, auto-deref’ing by one step to yield <code>Data</code>, and then auto-ref’ing to yield <code>&amp;Data</code>. The result is that this method call is desugared to a call like <code>Data::experiment_names(&amp;*data)</code>.</p>
<p>With view types, when introducing the auto-ref, we would also introduce a view operation. So we would get <code>Data::experiment_names(&amp; {?X} *data)</code>. What is this <code>{?X}</code>? That indicates that the set of allowed fields has to be inferred. A place-set variable <code>?X</code> can be inferred to a set of fields or to <code>*</code> (all fields).</p>
<p>We would integrate these place-set variables into inference, so that <code>{?A} Ta &lt;: {?B} Tb</code> if <code>?B</code> is a subset of <code>?A</code> and <code>Ta &lt;: Tb</code> (e.g., <code>{x, y} Foo &lt;: {x} Foo</code>). We would also allow for dropping view types from subtypes, e.g., <code>{*} Ta &lt;: Tb</code> if <code>Ta &lt;: Tb</code>.</p>
<p>Place-set variables only appear as an internal inference detail, so users can’t (e.g.) write a function that is generic over a place-set, and the only kind of constraints you can get are subset (<code>P1 &lt;= P2</code>) and inclusion (<code>f in P1</code>). I <em>think</em> it should be relatively straightforward to integrate these into HIR type check inference. When generalizing, we can replace each specific view set with a variable, just as we do for lifetimes. When we go to construct MIR, we would always know the precise set of fields we wish to include in the view. In the case where the set of fields is <code>*</code> we can also omit the view from the MIR.</p>
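<p>As a rough illustration of the two constraint kinds, here is a toy model in plain Rust. The names <code>PlaceSet</code>, <code>is_subset_of</code>, <code>view_subtype</code>, and <code>infer_smallest</code> are invented for this sketch, and picking the smallest satisfying set is just one possible strategy, assumed here for simplicity.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::BTreeSet;

/// A set of places: either `*` (all fields) or an explicit set of field names.
#[derive(Clone, Debug, PartialEq)]
enum PlaceSet {
    All,
    Fields(BTreeSet&lt;&amp;'static str&gt;),
}

impl PlaceSet {
    fn of(fields: &amp;[&amp;'static str]) -&gt; PlaceSet {
        PlaceSet::Fields(fields.iter().copied().collect())
    }

    /// The subset constraint `self &lt;= other`.
    fn is_subset_of(&amp;self, other: &amp;PlaceSet) -&gt; bool {
        match (self, other) {
            (_, PlaceSet::All) =&gt; true,
            // Conservative: without the struct's full field list we cannot
            // tell whether an explicit set happens to cover everything.
            (PlaceSet::All, PlaceSet::Fields(_)) =&gt; false,
            (PlaceSet::Fields(a), PlaceSet::Fields(b)) =&gt; a.is_subset(b),
        }
    }
}

/// View subtyping with equal underlying types: `{a} T &lt;: {b} T` holds
/// when `b` is a subset of `a`.
fn view_subtype(a: &amp;PlaceSet, b: &amp;PlaceSet) -&gt; bool {
    b.is_subset_of(a)
}

/// Solves for a variable `?X` given inclusion constraints (`f in ?X`) and
/// one upper bound (`?X &lt;= bound`), choosing the smallest satisfying set.
fn infer_smallest(required: &amp;[&amp;'static str], bound: &amp;PlaceSet) -&gt; Option&lt;PlaceSet&gt; {
    let candidate = PlaceSet::of(required);
    candidate.is_subset_of(bound).then_some(candidate)
}
</code></pre>
<p>For the call <code>Data::experiment_names(&amp; {?X} *data)</code>, the callee’s signature requires <code>experiments</code> to be in <code>?X</code> and the place <code>*data</code> bounds <code>?X</code> by <code>*</code>, so this sketch would pick <code>{experiments}</code>.</p>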
<h2 id="abstract-fields">Abstract fields</h2>
<p>So, view types allow us to address these sorts of conflicts by making it more explicit which sets of fields we are going to access, but they introduce a new problem: does this mean that the names of our private fields become part of our interface? That seems obviously undesirable.</p>
<p>The solution is to introduce the idea of <em>abstract</em><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> fields. An <em>abstract</em> field is a kind of pretend field, one that doesn’t really exist, but which you can talk about “as if” it existed. It lets us give symbolic names to data.</p>
<p>Abstract fields would be defined as aliases for a set of fields, like <code>pub abstract field_name = (list-of-fields)</code>. An alias defines a public symbolic name for a set of fields.</p>
<p>We could therefore define two aliases for <code>Data</code>, one for the set of experiments and one for the count of successful experiments. I think it would be useful to allow an abstract field to share its name with an actual field, as I think that in practice the compiler can always tell which one is meant, but I would require that <em>if</em> the names coincide, then the abstract field is aliased to the actual field with the same name.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="kr">abstract</span><span class="w"> </span><span class="n">experiments</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">experiments</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">experiments</span>: <span class="nc">HashMap</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="kr">abstract</span><span class="w"> </span><span class="n">successful</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">successful</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">successful</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now the view types we wrote earlier (<code>&amp; {experiments} self</code>, etc) are legal but they refer to the <em>abstract</em> fields and not the actual fields.</p>
<h2 id="abstract-fields-permit-refactoring">Abstract fields permit refactoring</h2>
<p>One nice property of abstract fields is that they permit refactoring. Imagine that we decide to change <code>Data</code> so that instead of storing experiments as a <code>Map&lt;String, Vec&lt;f32&gt;&gt;</code>, we put all the experimental data in one big vector and store a range of indices in the map, like <code>Map&lt;String, (usize, usize)&gt;</code>. We can do that no problem:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="kr">abstract</span><span class="w"> </span><span class="n">experiments</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">experiment_indices</span><span class="p">,</span><span class="w"> </span><span class="n">experiment_data</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">experiment_indices</span>: <span class="nc">Map</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="p">(</span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">experiment_data</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We would still declare methods like <code>&amp;mut {experiments} self</code>, but the compiler now understands that the abstract field <code>experiments</code> can be expanded to the set of private fields.</p>
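<p>One way to picture that expansion is as a lookup table from abstract field names to real fields, with two views conflicting only if their expansions overlap. The sketch below is my own illustrative model, not a proposed implementation: <code>AbstractFields</code>, <code>define</code>, <code>expand</code>, <code>views_conflict</code>, and <code>refactored_data</code> are all hypothetical names, and unknown field names simply expand to nothing.</p>

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical table mapping each abstract field to the real fields it aliases.
pub struct AbstractFields {
    aliases: BTreeMap<&'static str, BTreeSet<&'static str>>,
}

impl AbstractFields {
    pub fn new() -> Self {
        AbstractFields { aliases: BTreeMap::new() }
    }

    // Models `pub abstract name = (fields...)`.
    pub fn define(&mut self, name: &'static str, fields: &[&'static str]) {
        self.aliases.insert(name, fields.iter().copied().collect());
    }

    // Expands a view written with abstract names into the real fields it
    // covers. Names with no definition expand to nothing in this sketch.
    pub fn expand(&self, view: &[&str]) -> BTreeSet<&'static str> {
        view.iter()
            .filter_map(|name| self.aliases.get(*name))
            .flat_map(|fields| fields.iter().copied())
            .collect()
    }

    // Two views conflict when their expansions share a real field.
    pub fn views_conflict(&self, a: &[&str], b: &[&str]) -> bool {
        !self.expand(a).is_disjoint(&self.expand(b))
    }
}

// The aliases for the refactored `Data` struct, using its real field names.
pub fn refactored_data() -> AbstractFields {
    let mut table = AbstractFields::new();
    table.define("experiments", &["experiment_indices", "experiment_data"]);
    table.define("successful", &["successful"]);
    table
}
```

<p>Callers only ever name <code>experiments</code> and <code>successful</code>, so changing what <code>experiments</code> expands to does not affect them.</p>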
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="can-abstract-fields-be-mapped-to-an-empty-set-of-fields">Can abstract fields be mapped to an empty set of fields?</h3>
<p>Yes, I think it should be possible to define <code>pub abstract foo;</code> to indicate the empty set of fields.</p>
<h3 id="how-do-view-types-interact-with-traits-and-impls">How do view types interact with traits and impls?</h3>
<p>Good question. There is no <em>necessary</em> interaction; we could leave view types as simply a kind of type. You might do interesting things like implement <code>Deref</code> for a view on your struct:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">AugmentedData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">summary</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Deref</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">{</span><span class="n">data</span><span class="p">}</span><span class="w"> </span><span class="n">AugmentedData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Target</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="kt">u32</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">deref</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="p">[</span><span class="kt">u32</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// type of `self` is `&amp;{data} AugmentedData`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="ok-you-dont-need-to-integrate-abstract-fields-with-traits-but-could-you">OK, you don’t need to integrate abstract fields with traits, but could you?</h3>
<p>Yes! And it’d be interesting. You could imagine declaring abstract fields as trait members that can appear in its interface:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Interface</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kr">abstract</span><span class="w"> </span><span class="n">data1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kr">abstract</span><span class="w"> </span><span class="n">data2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_data1</span><span class="p">(</span><span class="o">&amp;</span><span class="p">{</span><span class="n">data1</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_data2</span><span class="p">(</span><span class="o">&amp;</span><span class="p">{</span><span class="n">data2</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You could then define those fields in an impl. You can even map some of them to real fields and leave some as purely abstract:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">OneCounter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">counter</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Interface</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">OneCounter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kr">abstract</span><span class="w"> </span><span class="n">data1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">counter</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kr">abstract</span><span class="w"> </span><span class="n">data2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_data1</span><span class="p">(</span><span class="o">&amp;</span><span class="p">{</span><span class="n">counter</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_data2</span><span class="p">(</span><span class="o">&amp;</span><span class="p">{</span><span class="n">data2</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="mi">0</span><span class="w"> </span><span class="c1">// no fields needed
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="could-view-types-include-more-complex-paths-than-just-fields">Could view types include more complex paths than just fields?</h3>
<p>Although I wouldn’t want to at first, I think you could permit something like <code>{foo.bar} Baz</code> and then, given something like <code>&amp;foo.bar</code>, you’d get the type <code>&amp;{bar} Baz</code>, but I’ve not really thought about it more deeply than that.</p>
<h3 id="can-view-types-be-involved-in-moves">Can view types be involved in moves?</h3>
<p>Yes! You should be able to do something like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Strings</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">b</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">c</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">play_games</span><span class="p">(</span><span class="n">s</span>: <span class="nc">Strings</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Moves the struct `s` but only the fields `a` and `c`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">t</span>: <span class="p">{</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">}</span><span class="w"> </span><span class="n">Strings</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">}</span><span class="w"> </span><span class="n">s</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">a</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR: s.a has been moved
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">b</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">c</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR: s.c has been moved
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">a</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">b</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR: no access to field `b`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">c</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="why-did-you-have-a-subtyping-rules-to-drop-view-types-from-sub--but-not-super-types">Why did you have a subtyping rule to drop view types from sub- but not super-types?</h3>
<p>I described the view type subtyping rules as two rules:</p>
<ul>
<li><code>{?A} Ta &lt;: {?B} Tb</code> if <code>?B</code> is a subset of <code>?A</code> and <code>Ta &lt;: Tb</code></li>
<li><code>{*} Ta &lt;: Tb</code> if <code>Ta &lt;: Tb</code></li>
</ul>
<p>In principle we could have a rule like <code>Ta &lt;: {*} Tb</code> if <code>Ta &lt;: Tb</code> — this rule would allow “introducing” a view type into the supertype. We may wind up needing such a rule but I didn’t want it because it meant that code like this really ought to compile (using the <code>Strings</code> type from the previous question):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">play_games</span><span class="p">(</span><span class="n">s</span>: <span class="nc">Strings</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">t</span>: <span class="p">{</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">}</span><span class="w"> </span><span class="n">Strings</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">s</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- just `= s`, not `= {a, c} s`.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I would expect this to compile because</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">Strings</span><span class="w"> </span><span class="o">&lt;</span>: <span class="p">{</span><span class="o">*</span><span class="p">}</span><span class="w"> </span><span class="n">Strings</span><span class="w"> </span><span class="o">&lt;</span>: <span class="p">{</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">}</span><span class="w"> </span><span class="n">Strings</span><span class="w">
</span></span></span></code></pre></div><p>but I kind of don’t want it to compile.</p>
<h3 id="are-there-other-uses-for-abstract-fields">Are there other uses for abstract fields?</h3>
<p>Yes! I think abstract fields would also be useful in two other ways (though we have to stretch their definition a bit). I believe it’s important for Rust to grow stronger integration with theorem provers; I don’t expect these to be widely used, but for certain key libraries (stdlib, zerocopy, maybe even tokio) it’d be great to be able to mathematically prove type safety. But mathematical proof systems often require a notion of <em>ghost fields</em> — basically logical state that doesn’t really exist at runtime but which you can talk about in a proof. A <em>ghost field</em> is essentially an abstract field that is mapped to an empty set of fields and which has a type. For example you might declare a <code>BeanCounter</code> struct with two abstract fields (<code>a</code>, <code>b</code>) and one real field that stores their sum:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">BeanCounter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="kr">abstract</span><span class="w"> </span><span class="n">a</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="kr">abstract</span><span class="w"> </span><span class="n">b</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">sum</span>: <span class="kt">u32</span><span class="p">,</span><span class="w"> </span><span class="c1">// &lt;-- at runtime, we only store the sum
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Then, when you create a <code>BeanCounter</code>, you would specify a value for those fields. The value would perhaps be written using something like an abstract block, indicating that the code within will not in fact be executed (but must still type check):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">BeanCounter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">new</span><span class="p">(</span><span class="n">a</span>: <span class="kt">u32</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="kt">u32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">Self</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">a</span>: <span class="nc">abstract</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="n">b</span>: <span class="nc">abstract</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">b</span><span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="n">sum</span>: <span class="nc">a</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">b</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Providing abstract values is useful because it lets the theorem prover act “as if” the code was there for the purpose of checking pre- and post-conditions and other kinds of contracts.</p>
<h3 id="could-we-use-abstract-fields-to-replace-phantom-data">Could we use abstract fields to replace phantom data?</h3>
<p>Yes! I imagine that instead of <code>a: PhantomData&lt;T&gt;</code> you could do <code>abstract a: T</code>, but that would mean we’d have to have some abstract initializer. So perhaps we permit an anonymous field <code>abstract _: T</code>, in which case you wouldn’t be required to provide an initializer, but you also couldn’t name it in contracts.</p>
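<p>For comparison, here is what the <code>PhantomData</code> pattern looks like in today&rsquo;s Rust. The <code>Id</code> type and the helper <code>id_size_matches_u32</code> are hypothetical examples of mine; the marker field has to be named and initialized by hand even though it stores nothing at runtime, which is the boilerplate an anonymous abstract field would remove.</p>

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical typed ID: `T` only tags the ID and is never stored.
pub struct Id<T> {
    raw: u32,
    // Today this marker must be declared and initialized explicitly.
    _marker: PhantomData<T>,
}

impl<T> Id<T> {
    pub fn new(raw: u32) -> Self {
        Id { raw, _marker: PhantomData }
    }

    pub fn raw(&self) -> u32 {
        self.raw
    }
}

// `PhantomData<T>` is zero-sized, so the marker adds nothing to the layout.
pub fn id_size_matches_u32() -> bool {
    size_of::<Id<String>>() == size_of::<u32>()
}
```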
<h3 id="so-what-are-all-the-parts-to-an-abstract-field">So what are all the parts to an abstract field?</h3>
<p>I would start with just the simplest form of abstract fields, which is an alias for a set of real fields. But to extend to cover ghost fields or <code>PhantomData</code>, you want to support the ability to declare a type for abstract fields (we could say that the default is <code>()</code>). For fields with non-<code>()</code> types, you would be expected to provide an abstract value in the struct constructor. To conveniently handle <code>PhantomData</code>, we could add anonymous abstract fields where no initializer is needed.</p>
<h3 id="should-we-permit-view-types-on-other-types">Should we permit view types on other types?</h3>
<p>I’ve shown view types attached to structs and tuples. Conceivably we could permit them elsewhere, e.g., <code>{0} &amp;(String, String)</code> might be equivalent to <code>&amp;{0} (String, String)</code>. I don’t think that’s needed for now and I’d make it ill-formed, but it could be reasonable to support at some point.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This concludes my exploration of view types. The post actually changed as I wrote it: initially I expected to include place-based borrows, but it turns out we didn’t really need those. I also initially expected view types to be a special case of struct types, and that indeed might simplify things, but I wound up concluding that they are a useful type constructor on their own. In particular, if we want to integrate them into traits, it will be necessary for them to be applied to generics and the rest.</p>
<p>As for next steps, I’m not sure. I want to think about this idea some more, but I do feel we need to address this gap in Rust, and so far view types seem like the most natural approach. I think what could be interesting is to prototype them in a-mir-formality as it evolves to see if there are other surprises that arise.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I’m not really proposing this syntax—among other things, it is ambiguous in expression position. I’m not sure what the best syntax is, though! It’s an important question, but not one I will think hard about here.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I prefer the name <em>ghost</em> fields, because it’s spooky, but <em>abstract</em> is already a reserved keyword.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Rust 2024 Is Coming</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/02/20/rust-2024-is-coming/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/02/20/rust-2024-is-coming/</id><published>2025-02-20T00:00:00+00:00</published><updated>2025-02-20T10:37:08+00:00</updated><content type="html"><![CDATA[<p>So, a little bird told me that Rust 2024 is going to become stable today, along with Rust 1.85.0. In honor of this momentous event, I have penned a little ditty that I&rsquo;d like to share with you all. Unfortunately, for those of you who remember Rust 2021&rsquo;s <a href="https://smallcultfollowing.com/babysteps/blog/2021/05/26/edition-the-song/">&ldquo;Edition: The song&rdquo;</a>, in the 3 years between Rust 2021 and now, my daughter has realized that her father is deeply uncool<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> and so I had to take this one on solo<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. Anyway, enjoy! Or, you know, suffer. As the case may be.</p>
<h3 id="video">Video</h3>
<p>Watch the movie embedded here, or <a href="https://youtu.be/thdpaw_3VTw?si=ezmhK9fXdWNNNVug">watch it on YouTube</a>:</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/thdpaw_3VTw?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<h3 id="lyrics">Lyrics</h3>
<p>In ChordPro format, for those of you who are inspired to play along.</p>
<pre tabindex="0"><code>{title: Rust 2024}
{subtitle: }

{key: C}

[Verse 1]
[C] When I got functions that never return
I write an exclamation point [G]
But use it for an error that could never be
the compiler [C] will yell at me

[Verse 2]
[C] We Rust designers, we want that too
[C7] But we had to make a [F] change
[F] That will be [Fm]better
[C] Oh so much [A]better
[D] in Rust Twenty [G7]Twenty [C]Four

[Bridge]
[Am] ... [Am] But will my program [E] build?
[Am] Yes ... oh that’s [D7] for sure
[F] edi-tions [G] are [C] opt in

[Verse 3]
[C] Usually when I return an `impl Trait`
everything works out fine [G]
but sometimes I need a tick underscore
and I don’t really [C] know what that’s for

[Verse 4]
[C] We Rust designers we do agree
[C7] That was con- [F] fusing 
[F] But that will be [Fm]better
[C] Oh so much [A]better
[D] in Rust Twenty [G7]Twenty [C]Four

[Bridge 2]
[Am] Cargo fix will make the changes
automatically [G] Oh that sure sounds great...
[Am] but wait... [Am] my de-pen-denc-[E]-ies
[Am] Don’t worry e-[D7]ditions
[F] inter [G] oper [C] ate

[Verse 5]
[C] Whenever I match on an ampersand T
The borrow [G] propagates
But where do I put the ampersand
when I want to [C] copy again?

[Verse 6]
[C] We Rust designers, we do agree
[C7] That really had to [F] change
[F] That will be [Fm]better
[C] Oh so much [A]better
[D] in Rust Twenty [G7]Twenty [C]Four

[Outro]
[F] That will be [Fm]better
[C] Oh so much [A]better
[D] in Rust Twenty [G7]Twenty [C]Four

One more time!

[Half speed]
[F] That will be [Fm]better
[C] Oh so much [A]better
[D] in Rust Twenty [G7]Twenty [C]Four
</code></pre><div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>It was bound to happen eventually.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Actually, I had a plan to make this a duet with somebody who shall remain nameless (they know who they are). But I was too lame to get everything done on time. In fact, I may or may not have realized &ldquo;Oh, shit, I need to finish this recording!&rdquo; while in the midst of a beer with Florian Gilcher last night. Anyway, sorry, would-be-collaborator-I-was-really-looking-forward-to-playing-with! Next time!&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">How I learned to stop worrying and love the LLM</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/02/10/love-the-llm/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/02/10/love-the-llm/</id><published>2025-02-10T00:00:00+00:00</published><updated>2025-02-10T15:56:19+00:00</updated><content type="html"><![CDATA[<p>I believe that AI-powered development tools can be a game changer for Rust&mdash;and vice versa. At its core, my argument is simple: AI&rsquo;s ability to explain and diagnose problems with rich context can help people get over the initial bump of learning Rust in a way that canned diagnostics never could, no matter how hard we try. At the same time, rich type systems like Rust&rsquo;s give AIs a lot to work with, which could be used to help them avoid hallucinations and validate their output. This post elaborates on this premise and sketches out some of the places where I think AI could be a powerful boost.</p>
<h2 id="perceived-learning-curve-is-challenge-1-for-rust">Perceived learning curve is challenge #1 for Rust</h2>
<p>Is Rust good for every project? No, of course not. But it&rsquo;s absolutely <strong>great</strong> for some things&mdash;specifically, building reliable, robust software that performs well at scale. This is no accident. Rust&rsquo;s design is intended to surface important design questions (often in the form of type errors) and to give users the control to fix them in whatever way is best.</p>
<p>But this same strength is also Rust&rsquo;s biggest challenge. Talking to people within Amazon about adopting Rust, perceived complexity and fear of its learning curve is the biggest hurdle. Most people will say, <em>&ldquo;Rust seems interesting, but I don&rsquo;t need it for this problem&rdquo;</em>. And you know, they&rsquo;re right! They don&rsquo;t <em>need</em> it. But that doesn&rsquo;t mean they wouldn&rsquo;t benefit from it.</p>
<p>One of Rust&rsquo;s big surprises is that, once you get used to it, it&rsquo;s &ldquo;surprisingly decent&rdquo; at a very large number of things beyond what it was designed for. Simple business logic and scripts can be very pleasant in Rust. But the phrase &ldquo;once you get used to it&rdquo; in that sentence is key, since most people&rsquo;s initial experience with Rust is <strong>confusion and frustration</strong>.</p>
<h2 id="rust-likes-to-tell-you-no-but-its-for-your-own-good">Rust likes to tell you <em>no</em> (but it&rsquo;s for your own good)</h2>
<p>Some languages are geared to say <em>yes</em>&mdash;that is, given any program, they aim to run it and do <em>something</em>. JavaScript is of course the most extreme example (no semicolons? no problem!) but every language does this to some degree. It&rsquo;s often quite elegant. Consider how, in Python, you write <code>vec[-1]</code> to get the last element in the list: super handy!</p>
<p>Rust is not (usually) like this. Rust is geared to say <em>no</em>. The compiler is just <em>itching</em> for a reason to reject your program. It&rsquo;s not that Rust is mean: Rust just wants your program to be as good as it can be. So we try to make sure that your program will do what you <em>want</em> (and not just what you asked for). This is why <code>vec[-1]</code>, in Rust, will panic: sure, giving you the last element might be convenient, but how do we know you didn&rsquo;t have an off-by-one bug that resulted in that negative index?<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
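<p>To make the contrast concrete, here is a minimal sketch of how you ask for &ldquo;the last element&rdquo; explicitly in Rust. The helper names <code>last_element</code> and <code>from_end</code> are made up for illustration; <code>last</code>, <code>get</code>, and <code>checked_sub</code> are the standard library methods doing the work. The point is that the &ldquo;what if there is no such element?&rdquo; question shows up in the return type instead of being answered silently.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">/// Returns the last element, or `None` when the slice is empty.
/// (Hypothetical helper, named for this example only.)
fn last_element(items: &amp;[i32]) -&gt; Option&lt;i32&gt; {
    items.last().copied()
}

/// Returns the element `back` positions before the end, if there is one.
/// `back == 0` means the last element.
fn from_end(items: &amp;[i32], back: usize) -&gt; Option&lt;i32&gt; {
    // `checked_sub` yields `None` instead of wrapping around, so an
    // off-by-one mistake becomes a `None` the caller has to handle.
    let index = items.len().checked_sub(back)?.checked_sub(1)?;
    items.get(index).copied()
}

fn main() {
    let v = vec![10, 20, 30];
    assert_eq!(last_element(&amp;v), Some(30));
    assert_eq!(from_end(&amp;v, 1), Some(20));
    assert_eq!(from_end(&amp;v, 5), None);
    assert_eq!(last_element(&amp;[]), None);
}
</code></pre>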
<p>But that tendency to say <em>no</em> means that early learning can be pretty frustrating. For most people, the reward from programming comes from seeing their program run&mdash;and with Rust, there&rsquo;s a <em>lot</em> of niggling details to get right before your program will run. What&rsquo;s worse, while those details are often motivated by deep properties of your program (like data races), the way they are <em>presented</em> is as the violation of obscure rules, and the solution (&ldquo;add a <code>*</code>&rdquo;) can feel random.</p>
<p>Once you get the hang of it, Rust feels great, but getting there can be a pain. I heard a great phrase from someone at Amazon to describe this: &ldquo;Rust: the language where you get the hangover first&rdquo;.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<h2 id="ai-today-helps-soften-the-learning-curve">AI today helps soften the learning curve</h2>
<p>My favorite thing about working at Amazon is getting the chance to talk to developers early in their Rust journey. Lately I&rsquo;ve noticed an increasing trend&mdash;most are using Q Developer. Over the last year, Amazon has been doing a lot of internal promotion of Q Developer, so that in and of itself is no surprise, but what did surprise me a bit is hearing from developers the <em>way</em> that they use it.</p>
<p>For most of them, the most valuable part of Q Dev is not authoring code but rather <strong>explaining</strong> it. They ask it questions like &ldquo;why does this function take an <code>&amp;T</code> and not an <code>Arc&lt;T&gt;</code>?&rdquo; or &ldquo;what happens when I move a value from one place to another?&rdquo;. Effectively, the LLM becomes an ever-present, ever-patient teacher.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></p>
<h2 id="scaling-up-the-rust-expert">Scaling up the Rust expert</h2>
<p>Some time back I sat down with an engineer learning Rust at Amazon. They asked me about an error they were getting that they didn&rsquo;t understand. &ldquo;The compiler is telling me something about <code>'static</code>, what does that mean?&rdquo; Their code looked something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">log_request_in_background</span><span class="p">(</span><span class="n">message</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tokio</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">log_request</span><span class="p">(</span><span class="n">message</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And the <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=0985ec4502f7ca148cc8919f2081ad02">compiler was telling them</a>:</p>
<pre tabindex="0"><code>error[E0521]: borrowed data escapes outside of function
 --&gt; src/lib.rs:2:5
  |
1 |   async fn log_request_in_background(message: &amp;str) {
  |                                      -------  - let&#39;s call the lifetime of this reference `&#39;1`
  |                                      |
  |                                      `message` is a reference that is only valid in the function body
2 | /     tokio::spawn(async move {
3 | |         log_request(message);
4 | |     });
  | |      ^
  | |      |
  | |______`message` escapes the function body here
  |        argument requires that `&#39;1` must outlive `&#39;static`
</code></pre><p>This is a pretty good error message! And yet it requires significant context to understand it (not to mention scrolling horizontally, sheesh). For example, what is &ldquo;borrowed data&rdquo;? What does it mean for said data to &ldquo;escape&rdquo;? What is a &ldquo;lifetime&rdquo; and what does it mean that &ldquo;<code>'1</code> must outlive <code>'static</code>&rdquo;? Even assuming you get the basic point of the message, what should you <strong>do</strong> about it?</p>
<h2 id="the-fix-is-easy-if-you-know-what-to-do">The fix is easy&hellip; <em>if</em> you know what to do</h2>
<p>Ultimately, the answer to the engineer&rsquo;s problem was just to insert a call to <code>clone</code><sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>. But deciding on that fix requires a surprisingly large amount of context. In order to figure out the right next step, I first explained to the engineer that this confusing error is, in fact, <a href="https://smallcultfollowing.com/babysteps/blog/2022/06/15/what-it-feels-like-when-rust-saves-your-bacon/">what it feels like when Rust saves your bacon</a>, and talked them through how the ownership model works and what it means to free memory. We then discussed why they were spawning a task in the first place (the answer: to avoid the latency of logging)&mdash;after all, the right fix might be to just not spawn at all, or to use something like rayon to block the function until the work is done.</p>
<p>Once we established that the task needed to run asynchronously from its parent, and hence had to own the data, we looked into changing the <code>log_request_in_background</code> function to take an <code>Arc&lt;String&gt;</code> so that it could avoid a deep clone. This would be more efficient, but only if the caller themselves could cache the <code>Arc&lt;String&gt;</code> somewhere. It turned out that the origin of this string was in another team&rsquo;s code and that this code only returned an <code>&amp;str</code>. Refactoring that code would probably be the best long term fix, but given that the strings were expected to be quite short, we opted to just clone the string.</p>
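<p>Here is a rough sketch of the shape of that fix. To keep it runnable with only the standard library, it uses <code>std::thread::spawn</code> as a stand-in for <code>tokio::spawn</code>; both require the spawned work to be <code>'static</code>, so the same reasoning applies. The body of <code>log_request</code> and its return value are invented for the example, not the engineer&rsquo;s actual code.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::thread;

// Hypothetical stand-in for the real logging call; it returns the
// number of bytes logged so the example has something to check.
fn log_request(message: &amp;str) -&gt; usize {
    println!("logged: {message}");
    message.len()
}

fn log_request_in_background(message: &amp;str) -&gt; thread::JoinHandle&lt;usize&gt; {
    // Copy the borrowed text into an owned String *before* spawning.
    // The closure then owns its data and no longer borrows from the
    // caller, which is what the `'static` bound is asking for.
    let owned: String = message.to_owned();
    thread::spawn(move || log_request(&amp;owned))
}

fn main() {
    let handle = log_request_in_background("GET /index.html");
    // Joining here is only so the example can observe the result.
    assert_eq!(handle.join().unwrap(), 15);
}
</code></pre>
<p>Note where the copy happens: outside the spawned work. With an <code>async move</code> block the same ordering matters, because cloning inside the block would still capture the original reference.</p>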
<h2 id="you-can-learn-a-lot-from-a-rust-error">You can learn a lot from a Rust error</h2>
<blockquote>
<p>An error message is often your first and best chance to teach somebody something.&mdash;Esteban Küber (paraphrased)</p>
</blockquote>
<p>Working through this error was valuable. It gave me a chance to teach this engineer a number of concepts. I think it demonstrates a bit of Rust&rsquo;s promise&mdash;the idea that learning Rust will make you a better programmer overall, regardless of whether you are using Rust or not.</p>
<p>Despite all the work we have put into our compiler error messages, this kind of detailed discussion is clearly something that we could never achieve. It&rsquo;s not because we don&rsquo;t want to! The original concept for <code>--explain</code>, for example, was to present a customized explanation of each error that was tailored to the user&rsquo;s code. But we could never figure out how to implement that.</p>
<p><strong>And yet tailored, in-depth explanation is <em>absolutely</em> something an LLM could do.</strong> In fact, it&rsquo;s something they already do, at least some of the time&mdash;though in my experience the existing code assistants don&rsquo;t do nearly as good a job with Rust as they could.</p>
<h2 id="what-makes-a-good-ai-opportunity">What makes a good AI opportunity?</h2>
<p><a href="https://emeryberger.com">Emery Berger</a> is a professor at UMass Amherst who has been exploring how LLMs can improve the software development experience. Emery emphasizes how AI can help <strong>close the gap</strong> from &ldquo;tool to goal&rdquo;. In short, today&rsquo;s tools (error messages, debuggers, profilers) tell us things about our program, but they stop there. Except in simple cases, they can&rsquo;t help us figure out what to do about it&mdash;and this is where AI comes in.</p>
<p>When I say AI, I am not talking (just) about chatbots. I am talking about programs that weave LLMs into the process, using them to make heuristic choices or proffer explanations and guidance to the user. Modern LLMs can also do more than just rely on their training and the prompt: they can be given access to APIs that let them query and get up-to-date data.</p>
<p>I think AI will be most useful in cases where solving the problem requires external context not available within the program itself. Think back to my explanation of the <code>'static</code> error, where knowing the right answer depended on how easy/hard it would be to change other APIs.</p>
<h2 id="where-i-think-rust-should-leverage-ai">Where I think Rust should leverage AI</h2>
<p>I&rsquo;ve thought about a lot of places I think AI could help make working in Rust more pleasant. Here is a selection.</p>
<h3 id="deciding-whether-to-change-the-function-body-or-its-signature">Deciding whether to change the function body or its signature</h3>
<p>Consider this code:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_first_name</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">alias</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="kt">str</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">alias</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This function will give a type error, because the signature (thanks to lifetime elision) promises to return a string borrowed from <code>self</code> but actually returns a string borrowed from <code>alias</code>. Now&hellip;what is the right fix? It&rsquo;s very hard to tell in isolation! It may be that in fact the code was meant to be <code>&amp;self.name</code> (in which case the current signature is correct). Or perhaps it was meant to be something that sometimes returns <code>&amp;self.name</code> and sometimes returns <code>alias</code>, in which case the signature of the function was wrong. Today, we take our best guess. But AI could help us offer more nuanced guidance.</p>
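<p>To see why the choice is hard to make in isolation, here are both fixes side by side. This is only a sketch: the <code>Person</code> struct, its <code>name</code> field, and the second method name are assumptions made up for the example.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Hypothetical type for illustration.
struct Person {
    name: String,
}

impl Person {
    // Fix 1: the signature was right, the body was wrong.
    // Elision ties the result to `self`, and now the body agrees.
    fn get_first_name(&amp;self, _alias: &amp;str) -&gt; &amp;str {
        &amp;self.name
    }

    // Fix 2: the body was (partly) right, the signature was wrong.
    // An explicit lifetime says the result may borrow from either input.
    fn get_name_or_alias&lt;'a&gt;(&amp;'a self, alias: &amp;'a str) -&gt; &amp;'a str {
        if self.name.is_empty() {
            alias
        } else {
            &amp;self.name
        }
    }
}

fn main() {
    let p = Person { name: String::from("Niko") };
    assert_eq!(p.get_first_name("nm"), "Niko");
    assert_eq!(p.get_name_or_alias("nm"), "Niko");

    let anon = Person { name: String::new() };
    assert_eq!(anon.get_name_or_alias("nm"), "nm");
}
</code></pre>
<p>Both versions compile. Which one is correct depends on what the function is supposed to do, and that information is not in the code.</p>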
<h3 id="translating-idioms-from-one-language-to-another">Translating idioms from one language to another</h3>
<p>People often ask me questions like &ldquo;how do I make a visitor in Rust?&rdquo; The answer, of course, is &ldquo;it depends on what you are trying to do&rdquo;. Much of the time, a Java visitor is better implemented as a Rust enum and match statements, but there is a time and a place for something more like a visitor. Guiding folks through the decision tree for how to do non-trivial mappings is a great place for LLMs.</p>
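<p>As a small illustration of the &ldquo;enum and match&rdquo; answer, here is a sketch of what would be a visitor over expression nodes in Java. The <code>Expr</code> type and <code>eval</code> function are invented for the example. Each &ldquo;visit&rdquo; method becomes a match arm, and the compiler checks that no variant is forgotten.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Hypothetical expression tree; in Java this might be a class
// hierarchy with an `accept(Visitor)` method on each node.
enum Expr {
    Num(i64),
    Neg(Box&lt;Expr&gt;),
    Add(Box&lt;Expr&gt;, Box&lt;Expr&gt;),
}

// One function with one arm per variant plays the role of the visitor.
fn eval(expr: &amp;Expr) -&gt; i64 {
    match expr {
        Expr::Num(n) =&gt; *n,
        Expr::Neg(inner) =&gt; -eval(inner),
        Expr::Add(lhs, rhs) =&gt; eval(lhs) + eval(rhs),
    }
}

fn main() {
    // Represents 2 + (-5).
    let e = Expr::Add(
        Box::new(Expr::Num(2)),
        Box::new(Expr::Neg(Box::new(Expr::Num(5)))),
    );
    assert_eq!(eval(&amp;e), -3);
}
</code></pre>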
<h3 id="figuring-out-the-right-type-structure">Figuring out the right type structure</h3>
<p>When I start writing a Rust program, I start by authoring type declarations. As I do this, I tend to think ahead to how I expect the data to be accessed. Am I going to need to iterate over one data structure while writing to another? Will I want to move this data to another thread? The setup of my structures will depend on the answer to these questions.</p>
<p>I think a lot of the frustration beginners feel comes from not having a &ldquo;feel&rdquo; yet for the right way to structure their programs. The structure they would use in Java or some other language often won&rsquo;t work in Rust.</p>
<p>I think an LLM-based assistant could help here by asking them some questions about the kinds of data they need and how it will be accessed. Based on this it could generate type definitions, or alter the definitions that exist.</p>
<h3 id="complex-refactorings-like-splitting-structs">Complex refactorings like splitting structs</h3>
<p>A follow-on to the previous point is that, in Rust, when your data access patterns change as a result of refactorings, it often means you need to do more wholesale updates to your code.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> A common example for me is that I want to split out some of the fields of a struct into a substruct, so that they can be borrowed separately.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup> This can be quite non-local and sometimes involves some heuristic choices, like &ldquo;should I move this method to be defined on the new substruct or keep it where it is?&rdquo;.</p>
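<p>Here is a minimal sketch of that kind of split, with made-up names (<code>Server</code>, <code>Stats</code>, <code>record</code>). If <code>record</code> were a <code>&amp;mut self</code> method on <code>Server</code>, calling it inside the loop would conflict with the shared borrow of <code>self.lines</code>. Moving the counters into a substruct lets the borrow checker see that the two fields are disjoint.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Hypothetical substruct holding the fields that need mutation.
struct Stats {
    requests: usize,
    bytes: usize,
}

impl Stats {
    // The method moved here along with the fields it touches.
    fn record(&amp;mut self, line: &amp;str) {
        self.requests += 1;
        self.bytes += line.len();
    }
}

struct Server {
    lines: Vec&lt;String&gt;,
    stats: Stats,
}

impl Server {
    fn process_all(&amp;mut self) {
        // `self.lines` is borrowed shared and `self.stats` mutably.
        // Because they are separate fields, both borrows can coexist.
        for line in &amp;self.lines {
            self.stats.record(line);
        }
    }
}

fn main() {
    let mut server = Server {
        lines: vec![String::from("ab"), String::from("cde")],
        stats: Stats { requests: 0, bytes: 0 },
    };
    server.process_all();
    assert_eq!(server.stats.requests, 2);
    assert_eq!(server.stats.bytes, 5);
}
</code></pre>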
<h3 id="migrating-consumers-over-a-breaking-change">Migrating consumers over a breaking change</h3>
<p>When you run the <code>cargo fix</code> command today it will automatically apply various code suggestions to clean up your code. With the <a href="https://doc.rust-lang.org/nightly/edition-guide/rust-2024/index.html">upcoming Rust 2024 edition</a>, <code>cargo fix --edition</code> will do the same but for edition-related changes. All of the logic for these changes is hardcoded in the compiler and it can get a bit tricky.</p>
<p>For editions, we intentionally limit ourselves to local changes, so the coding for these migrations is usually not <em>too</em> bad, but there are some edge cases where it&rsquo;d be really useful to have heuristics. For example, <a href="https://doc.rust-lang.org/nightly/edition-guide/rust-2024/temporary-if-let-scope.html">one of the changes we are making in Rust 2024</a> affects &ldquo;temporary lifetimes&rdquo;. It can affect when destructors run. This almost never matters (your vector will get freed a bit earlier or whatever) but it <em>can</em> matter quite a bit, if the destructor happens to be a lock guard or something with side effects. In practice when I as a human work with changes like this, I can usually tell at a glance whether something is likely to be a problem&mdash;but the heuristics I use to make that judgment are a combination of knowing the name of the types involved, knowing something about the way the program works, and perhaps skimming the destructor code itself. We could hand-code these heuristics, but an LLM could do it better, and it could ask questions if it was feeling unsure.</p>
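<p>To give a flavor of the lock-guard case, here is a sketch with an invented function, <code>bump_or_init</code>. If the lock were taken directly in the <code>if let</code> scrutinee, then in Rust 2021 the guard would stay alive through the <code>else</code> block, and locking again there would deadlock or panic; in Rust 2024 the guard is dropped before the <code>else</code> block runs. The version below sidesteps the question by copying the value out first, so it behaves the same in every edition.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::sync::Mutex;

// Hypothetical helper: increments the stored counter, or starts it at 0.
fn bump_or_init(slot: &amp;Mutex&lt;Option&lt;u32&gt;&gt;) -&gt; u32 {
    // The temporary guard is dropped at the end of this `let` statement,
    // so the lock is free again before either branch below runs.
    let current = *slot.lock().unwrap();

    if let Some(n) = current {
        *slot.lock().unwrap() = Some(n + 1);
        n + 1
    } else {
        // Safe to lock again: no guard from the check above is alive.
        *slot.lock().unwrap() = Some(0);
        0
    }
}

fn main() {
    let slot = Mutex::new(None);
    assert_eq!(bump_or_init(&amp;slot), 0);
    assert_eq!(bump_or_init(&amp;slot), 1);
    assert_eq!(*slot.lock().unwrap(), Some(1));
}
</code></pre>
<p>Whether a given destructor is &ldquo;interesting&rdquo; in this way is exactly the kind of judgment call a purely mechanical migration struggles with.</p>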
<p>Now imagine you are releasing the 2.x version of your library. Maybe your API has changed in significant ways. Maybe one API call has been broken into two, and the right one to use depends a bit on what you are trying to do. Well, an LLM can help here, just like it can help in translating idioms from Java to Rust.</p>
<p>I imagine the idea of having an LLM help you migrate makes some folks uncomfortable. I get that. There&rsquo;s no reason it has to be mandatory&mdash;I expect we could always have a more limited, precise migration available.<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></p>
<h3 id="optimize-your-rust-code-to-eliminate-hot-spots">Optimize your Rust code to eliminate hot spots</h3>
<p>Premature optimization is the root of all evil, or so Donald Knuth is said to have said. I&rsquo;m not sure about <em>all</em> evil, but I have definitely seen people rathole on microoptimizing a piece of code before they know if it&rsquo;s even expensive (or, for that matter, correct). This is doubly true in Rust, where cloning a small data structure (or reference counting it) can often make your life a lot simpler. Llogiq&rsquo;s great talks on <a href="https://llogiq.github.io/2024/03/28/easy.html">Easy Mode Rust</a> make exactly this point. But here&rsquo;s a question: suppose you&rsquo;ve been taking this advice to heart, inserting clones and the like, and you find that your program <em>is</em> running kind of slow? How do you make it faster? Or, even worse, suppose that you are trying to tune your network service. You are looking at the <a href="https://docs.rs/tokio-metrics/0.3.1/tokio_metrics/struct.TaskMetrics.html">blizzard of available metrics</a> and trying to figure out what changes to make. What do you do? To get some idea of what is possible, check out <a href="https://github.com/plasma-umass/scalene">Scalene</a>, a Python profiler that is also able to offer suggestions (from Emery Berger&rsquo;s group at UMass, the professor I talked about earlier).</p>
<h3 id="diagnose-and-explain-miri-and-sanitizer-errors">Diagnose and explain miri and sanitizer errors</h3>
<p>Let&rsquo;s look a bit to the future. I want us to get to a place where the &ldquo;minimum bar&rdquo; for writing unsafe code is that you test that unsafe code with some kind of sanitizer that checks for both C and Rust UB&mdash;something like miri today, except one that works &ldquo;at scale&rdquo; for code that invokes FFI or does other arbitrary things. I expect a smaller set of people will go further, leveraging automated reasoning tools like Kani or Verus to prove statically that their unsafe code is correct<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup>.</p>
<p>From my experience using miri today, I can tell you two things. (1) Every bit of unsafe code I write has some trivial bug or other. (2) If you enjoy puzzling out the occasionally inscrutable error messages you get from Rust, you&rsquo;re gonna <em>love</em> miri! To be fair, miri has a much harder job&mdash;the (still experimental) rules that govern Rust aliasing are intended to be flexible enough to allow all the things people want to do that the borrow checker doesn&rsquo;t permit. This means they are much more complex. It also means that explaining why you violated them (or may violate them) is that much more complicated.</p>
<p>Just as an AI can help novices understand the borrow checker, it can help advanced Rustaceans understand <a href="https://perso.crans.org/vanille/treebor/">tree borrows</a> (or whatever aliasing model we wind up adopting). And just as it can make smarter suggestions for whether to modify the function body or its signature, it can likely help you puzzle out a good fix.</p>
<h2 id="rusts-emphasis-on-reliability-makes-it-a-great-target-for-ai">Rust&rsquo;s emphasis on &ldquo;reliability&rdquo; makes it a great target for AI</h2>
<p>Anyone who has used an LLM-based tool has encountered hallucinations, where the AI just makes up APIs that &ldquo;seem like they ought to exist&rdquo;.<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup> And yet anyone who has used <em>Rust</em> knows that &ldquo;if it compiles, it works&rdquo; is true way more often than it has a right to be.<sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup> This suggests to me that any attempt to use the Rust compiler to validate AI-generated code or solutions is going to also help ensure that the code is correct.</p>
<p>AI-based code assistants right now don&rsquo;t really have this property. I&rsquo;ve noticed that I kind of have to pick between &ldquo;shallow but correct&rdquo; or &ldquo;deep but hallucinating&rdquo;. A good example is <code>match</code> statements. I can use rust-analyzer to fill in the match arms and it will do a perfect job, but the body of each arm is <code>todo!</code>. Or I can let the LLM fill them in and it tends to cover most-but-not-all of the arms but it generates bodies. I would love to see us doing deeper integration, so that the tool is talking to the compiler to get perfect answers to questions like &ldquo;what variants does this enum have&rdquo; while leveraging the LLM for open-ended questions like &ldquo;what is the body of this arm&rdquo;.<sup id="fnref:12"><a href="#fn:12" class="footnote-ref" role="doc-noteref">12</a></sup></p>
<h2 id="conclusion">Conclusion</h2>
<p>Overall AI reminds me a lot of the web around the year 2000. It&rsquo;s clearly overhyped. It&rsquo;s clearly being used for all kinds of things where it is not needed. And it&rsquo;s clearly going to change everything.</p>
<p>If you want to see examples of what is possible, take a look at the <a href="https://github.com/plasma-umass/ChatDBG">ChatDBG</a> videos published by Emery Berger&rsquo;s group. You can see how the AI sends commands to the debugger to explore the program state before explaining the root cause. I love the video <a href="https://asciinema.org/a/qulxiJTqwVRJPaMZ1hcBs6Clu">debugging bootstrap.py</a>, as it shows the AI applying domain knowledge about statistics to debug and explain the problem.</p>
<p>My expectation is that compilers of the future will not contain nearly so much code geared around authoring diagnostics. They&rsquo;ll present the basic error, sure, but for more detailed explanations they&rsquo;ll turn to AI. It won&rsquo;t be just a plain old foundation model, they&rsquo;ll use RAG techniques and APIs to let the AI query the compiler state, digest what it finds, and explain it to users. Like a good human tutor, the AI will tailor its explanations to the user, leveraging the user&rsquo;s past experience and intuitions (oh, and in the user&rsquo;s chosen language).</p>
<p>I am aware that AI has some serious downsides. The most serious to me is its prodigious energy use, but there are also good questions to be asked about the way that training works and the possibility of not respecting licenses. The issues are real but avoiding AI is not the way to solve them. Just in the course of writing this post, DeepSeek was announced, demonstrating that there is a lot of potential to lower the costs of training. As far as the ethics and legality, that is a very complex space. Agents are already doing a lot to get better there, but note also that most of the applications I am excited about do not involve writing code so much as helping people understand and alter the code they&rsquo;ve written.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>We don&rsquo;t always get this right. For example, I find the <code>zip</code> combinator of iterators annoying because it takes the shortest of the two iterators, which is occasionally nice but far more often hides bugs.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The irony, of course, is that AI can help you to improve your woeful lack of tests by auto-generating them based on code coverage and current behavior.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>I think they told me they heard it somewhere on the internet? Not sure the original source.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Personally, the thing I find most annoying about LLMs is the way they are trained to respond like groveling servants. &ldquo;Oh, that&rsquo;s a good idea! Let me help you with that&rdquo; or &ldquo;I&rsquo;m sorry, you&rsquo;re right I did make a mistake, here is a version that is better&rdquo;. Come on, I don&rsquo;t need flattery. The idea is fine but I&rsquo;m aware it&rsquo;s not earth-shattering. Just help me already.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Inserting a call to <code>clone</code> is actually a bit more subtle than you might think, given the interaction of the <code>async</code> future here.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Garbage Collection allows you to make all kinds of refactorings in ownership structure without changing your interface at all. This is convenient, but&mdash;as we discussed early on&mdash;it can hide bugs. Overall I prefer having that information be explicit in the interface, but that comes with the downside that changes have to be refactored.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>I also think we should add a feature like <a href="https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/">View Types</a> to make this less necessary. In this case instead of refactoring the type structure, AI could help by generating the correct type annotations, which might be non-obvious.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>My hot take here is that if the idea of an LLM doing migrations in your code makes you uncomfortable, you are likely (a) overestimating the quality of your code and (b) underinvesting in tests and QA infrastructure<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. I tend to view an LLM like an &ldquo;inconsistently talented contributor&rdquo;, and I am perfectly happy having contributors hack away on projects I own.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>The student asks, &ldquo;When unsafe code is proven free of UB, does that make it safe?&rdquo; The master says, &ldquo;Yes.&rdquo; The student asks, &ldquo;And is it then still unsafe?&rdquo; The master says, &ldquo;Yes.&rdquo; Then, a minute later, &ldquo;Well, sort of.&rdquo; (We may need new vocabulary.)&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>My personal favorite story of this is when I asked ChatGPT to generate me a list of &ldquo;real words and their true definition along with 2 or 3 humorous fake definitions&rdquo; for use in a birthday party game. I told it that &ldquo;I know you like to hallucinate so please include links where I can verify the real definition&rdquo;. It generated a great list of words along with plausible looking URLs for merriamwebster.com and so forth&mdash;but when I clicked the URLs, they turned out to all be 404s (the words, it turned out, were real&mdash;just not the URLs).&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p>This is not a unique property of Rust; it is shared by other languages with rich type systems, like Haskell or ML. Rust happens to be the most widespread such language.&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:12">
<p>I&rsquo;d also like it if the LLM could be a bit less interrupt-y sometimes. Especially when I&rsquo;m writing type-system code or similar things, it can be distracting when it keeps trying to author stuff it clearly doesn&rsquo;t understand. I expect this too will improve over time, and I&rsquo;ve noticed that while, in the beginning, it tends to guess very wrong, over time it tends to guess better. I&rsquo;m not sure what inputs and context are being fed to the LLM in the background but it&rsquo;s evident that it can come to see patterns even for relatively subtle things.&#160;<a href="#fnref:12" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Preview crates</title><link href="https://smallcultfollowing.com/babysteps/blog/2025/01/29/preview-crates/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2025/01/29/preview-crates/</id><published>2025-01-29T00:00:00+00:00</published><updated>2025-01-29T22:26:31+00:00</updated><content type="html"><![CDATA[<p>This post lays out the idea of <em>preview crates</em>.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> Preview crates would be special crates released by the rust-lang org. Like the standard library, preview crates would have access to compiler internals but would still be usable from stable Rust. They would be used in cases where we know we want to give users the ability to do X but we don&rsquo;t yet know precisely how we want to expose it in the language or stdlib. In git terms, preview crates would let us stabilize the <a href="https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain"><em>plumbing</em></a> while retaining the ability to iterate on the final shape of the <a href="https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain"><em>porcelain</em></a>.</p>
<h2 id="nightly-is-not-enough">Nightly is not enough</h2>
<p>Developing large language features is a tricky business. Because everything builds on the language, stability is very important, but at the same time, there are some questions that are very hard to answer without experience. Our main tool for getting this experience has been the nightly toolchain, which lets us develop, iterate, and test features before committing to them.</p>
<p>Because the nightly toolchain comes with no guarantees at all, however, most users who experiment with it do so lightly, just using it for toy projects and the like. For some features, this is perfectly fine, particularly syntactic features like <code>let-else</code>, where you can learn everything you need to know about how it feels from a single crate.</p>
<h2 id="nightly-doesnt-let-you-build-a-fledgling-ecosystem">Nightly doesn&rsquo;t let you build a fledgling ecosystem</h2>
<p>Where nightly really fails us though is the ability to estimate the impact of a feature on a larger ecosystem. Sometimes you would like to expose a capability and see what people build with it. How do they use it? What patterns emerge? Often, we can predict those patterns in advance, but sometimes there are surprises, and we find that what we thought would be the default mode of operation is actually kind of a niche case.</p>
<p>For these cases, it would be cool if there were a way to issue a feature in &ldquo;preview&rdquo; mode, where people can build on it, but it is not yet released in its final form. The challenge is that if we want people to use this to build up an ecosystem, we don&rsquo;t want to disturb all those crates when we iterate on the feature. We want a way to make changes that lets those crates keep working until the maintainers have time to port to the latest syntax, naming, or whatever.</p>
<h2 id="editions-are-closer-but-not-quite-right">Editions are closer, but not quite right</h2>
<p>The other tool we have for correcting mistakes is <a href="https://doc.rust-lang.org/edition-guide/editions/">editions</a>. Editions let us change what syntax means and, because they are opt-in, all existing code continues to work.</p>
<p>Editions let us fix a great many things to make Rust more self-consistent, but they carry a heavy cost. They force people to relearn how things in Rust work. They make books outdated. This price is typically too high for us to ship a feature <em>knowing</em> that we are going to change it in a future edition.</p>
<h2 id="lets-give-an-example">Let&rsquo;s give an example</h2>
<p>To make this concrete, let&rsquo;s take a specific example. The const generics team has been hard at work iterating on the meaning of <code>const trait</code> and in fact there is a <a href="https://github.com/rust-lang/rfcs/pull/3762">pending RFC</a> that describes their work. There&rsquo;s just one problem: it&rsquo;s not yet clear how it should be exposed to users. I won&rsquo;t go into the rationale for each choice, but suffice to say that there are a number of options under current consideration. All of these examples have been proposed, for example, as the way to say &ldquo;a function that can be executed at compilation time which will call <code>T::default</code>&rdquo;:</p>
<ul>
<li><code>const fn compute_value&lt;T: ~const Default&gt;()</code></li>
<li><code>const fn compute_value&lt;T: const Default&gt;()</code></li>
<li><code>const fn compute_value&lt;T: Default&gt;()</code></li>
</ul>
<p>At the moment, I personally have a preference between these (I&rsquo;ll let you guess), but I figure I have about&hellip; hmm&hellip; 80-90% confidence in that choice. And what&rsquo;s worse, to really decide between them, I think we have to see how the work on async proceeds, and perhaps also what kinds of patterns turn out to be common in practice for <code>const fn</code>. This stuff is difficult to gauge accurately in advance.</p>
<h2 id="enter-preview-crates">Enter preview crates</h2>
<p>So what if we released a crate <code>rust_lang::const_preview</code>? In my dream world, this is released on crates.io, using the namespaces described in <a href="https://rust-lang.github.io/rfcs/3243-packages-as-optional-namespaces.html">RFC #3243</a>. Like any crate, <code>const_preview</code> can be versioned. It would expose exactly one item, a macro <code>const_item</code> that can be used to write const functions that have const trait bounds:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">const_preview</span>::<span class="fm">const_item!</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">const</span><span class="w"> </span><span class="k">fn</span> <span class="nf">compute_value</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">~</span><span class="k">const</span><span class="w"> </span><span class="nb">Default</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// as `~const` is what is implemented today, I&#39;ll use it in this example
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Internally, this <code>const_item!</code> macro can make use of internal APIs in the compiler to parse the contents and deploy the special semantics.</p>
<h3 id="releasing-v20">Releasing v2.0</h3>
<p>Now, maybe we use this for a while, and we find that people really don&rsquo;t like the <code>~</code>, so we decide to change the syntax. Perhaps we opt to write <code>const Default</code> instead of <code>~const Default</code>. No problem, we release a 2.0 version of the crate and we also rewrite 1.0 to take in the tokens and invoke 2.0 using the <a href="https://github.com/dtolnay/semver-trick">semver trick</a>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">const_preview</span>::<span class="fm">const_item!</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">const</span><span class="w"> </span><span class="k">fn</span> <span class="nf">compute_value</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">const</span><span class="w"> </span><span class="nb">Default</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// v2.0 drops the `~`: the bound is now written `const Default`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="integrating-into-the-language">Integrating into the language</h3>
<p>Once we decide we are happy with <code>const_item!</code> we can merge it into the language proper. The preview crates are deprecated and simply desugar to the true language syntax. We all go home, drink non-fat flat whites, and pat ourselves on the back.</p>
<h2 id="user-based-experimentation">User-based experimentation</h2>
<p>One thing I like about the preview crates is that others can then begin to do their own experiments. Perhaps somebody wants to try out what it would be like if <code>T: Default</code> meant <code>const</code> by default&ndash;they can readily write a wrapper that desugars to <code>const_preview::const_item</code> and try it out. And people can build on it. And all that code keeps working once we integrate const functions into the language &ldquo;for real&rdquo;, it just looks kinda dated.</p>
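<p>To make that a bit more concrete, here is a rough sketch of what such a wrapper might look like using plain <code>macro_rules!</code>. Everything in it is hypothetical: the real <code>const_item!</code> would live in the <code>const_preview</code> crate and hook into compiler internals, so the sketch stands in for it with a local macro that simply drops the <code>~const</code> and emits an ordinary function, and <code>const_by_default!</code> is a made-up name for the user&rsquo;s experiment. Only the shape of the forwarding is the point here:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical stand-in for `const_preview::const_item!`. The real macro would
// use compiler internals; this one just drops `~const` and emits a plain `fn`
// so that the sketch compiles on stable Rust.
macro_rules! const_item {
    (const fn $name:ident&lt;$t:ident: ~const $bound:path&gt;() -&gt; $ret:ty $body:block) =&gt; {
        fn $name&lt;$t: $bound&gt;() -&gt; $ret $body
    };
}

// Hypothetical user experiment: bounds are `const` by default, so the user
// writes `T: Default` and the wrapper desugars it to `T: ~const Default`.
macro_rules! const_by_default {
    (const fn $name:ident&lt;$t:ident: $bound:path&gt;() -&gt; $ret:ty $body:block) =&gt; {
        const_item! {
            const fn $name&lt;$t: ~const $bound&gt;() -&gt; $ret $body
        }
    };
}

// What the experimenting user would write.
const_by_default! {
    const fn compute_value&lt;T: Default&gt;() -&gt; T {
        T::default()
    }
}
</code></pre></div><p>With these definitions, <code>compute_value::&lt;u32&gt;()</code> is an ordinary call. The wrapper never needs to know what <code>const_item!</code> does internally; it only rewrites tokens and forwards them. A v1.0 shim that forwards to v2.0 would presumably have the same shape.</p>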
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="why-else-might-we-use-previews">Why else might we use previews?</h3>
<p>Even if we know the semantics, we could use previews to stabilize features where the user experience is not great. I&rsquo;m thinking of Generic Associated Types as one example, where the stabilization was slowed because of usability concerns.</p>
<h3 id="what-are-the-risks-from-this">What are the risks from this?</h3>
<p>The previous answer hints at one of my fears&hellip; if preview crates become a widespread way for us to stabilize features with usability gaps, we may accumulate a very large number of them and then never move those features into Rust proper. That seems bad.</p>
<h3 id="shouldnt-we-just-make-a-decision-already">Shouldn&rsquo;t we just make a decision already?</h3>
<p>I mean&hellip;maybe? I do think we are sometimes very cautious. I would like us to get better at leaning on our judgment. But I also see that sometimes there is a tension between &ldquo;getting something out the door&rdquo; and &ldquo;taking the time to evaluate a generalization&rdquo;, and it&rsquo;s not clear to me whether this tension is an inherent complexity or an artifact of the way we do business.</p>
<h3 id="but-would-this-actually-work-whats-in-that-crate-and-what-if-it-is-not-matched-with-the-right-version-of-the-compiler">But would this actually work? What&rsquo;s in that crate and what if it is not matched with the right version of the compiler?</h3>
<p>One very special thing about libstd is that it is released together with the compiler and hence it is able to co-evolve, making use of internal APIs that are unstable and change from release to release. If we want to put this crate on crates.io, it will not be able to co-evolve in the same way. Bah. That&rsquo;s annoying! But I figure we could still handle it by <em>actually</em> having the preview functionality exposed by crates in sysroot that are shipped along with the compiler. These crates would not be directly usable except by our blessed crates.io crates, which would basically just be shims that expose the underlying stuff. We could of course cut out the middleman and just have people use those sysroot crates directly&ndash; but I don&rsquo;t like that as much because it&rsquo;s less obvious and because we can&rsquo;t as easily track reverse dependencies on crates.io to evaluate usage.</p>
<h3 id="a-macro-seems-heavy-weight-what-other-options-have-you-considered">A macro seems heavy weight! What other options have you considered?</h3>
<p>I also considered the idea of having <code>p#</code> keywords (&ldquo;preview&rdquo;), so e.g.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[allow(preview_feature)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">p</span>#<span class="k">const</span><span class="w"> </span><span class="k">fn</span> <span class="nf">compute_value</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">p</span>#<span class="k">const</span><span class="w"> </span><span class="nb">Default</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// works on stable
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Using a <code>p#</code> keyword would fire off a lint (<code>preview_feature</code>) that you would probably want to <code>allow</code>.</p>
<p>This is less intrusive, but I like the crate idea better because it allows us to release a v2.0 of the <code>p#const</code> keyword.</p>
<h3 id="what-kinds-of-things-can-we-use-preview-crates-for">What kinds of things can we use preview crates for?</h3>
<p>Good question. I&rsquo;m not entirely sure. For APIs that require us to define new traits and other things, it seems like it would be a bit tricky to maintain the total interoperability I think we want. Tools like trait aliases etc (which we need for other reasons) would help.</p>
<h3 id="who-else-does-this-sort-of-thing">Who else does this sort of thing?</h3>
<p><em>Ember</em> has formalized this &ldquo;plumbing first&rdquo; approach in <a href="https://emberjs.com/editions/">their version of editions</a>. In Ember, from what I understand, an edition is not a &ldquo;time-based thing&rdquo;, like in Rust. Instead, it indicates a big shift in paradigms, and it comes out when that new paradigm is ready. But part of the process to reaching an edition is to start by shipping core APIs (plumbing APIs) that create the new capabilities. The community can then create wrappers and experiment with the &ldquo;porcelain&rdquo; before the Ember crate enshrines a best practice set of APIs and declares the new Edition ready.</p>
<p><em>Java</em> has a notion of preview features, but they are not semver guaranteed to stick around.</p>
<p>I&rsquo;m not sure who else!</p>
<h3 id="could-we-use-decorators-instead">Could we use decorators instead?</h3>
<p>Usability of decorators like <code>#p[const_preview::const_item]</code> is better, particularly in rust-analyzer. The tricky bit there is that decorators can only be applied to valid Rust syntax, so it implies we&rsquo;d need to extend the parser to include things like <code>~const</code> forever, whereas I might prefer to have that complexity isolated to the <code>const_preview</code> crate.</p>
<h3 id="so-is-this-a-done-deal-is-this-happening">So is this a done deal? Is this happening?</h3>
<p>I don&rsquo;t know! People often think that because I write a blog post about something it will happen, but this is currently just in the &ldquo;early ideation&rdquo; stage. As I&rsquo;ve written before, though, I continue to feel that we need some kind of &ldquo;middle state&rdquo; for our release process (see e.g. this blog post, <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/18/stability-without-stressing-the-out/"><em>Stability without stressing the !@#! out</em></a>), and I think preview crates could be a good tool to have in our toolbox.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Hat tip to Yehuda Katz and the Ember community, Tyler Mandry, Jack Huey, Josh Triplett, Oli Scherer, and probably a few others I&rsquo;ve forgotten with whom I discussed this idea. Of course anything you like, they came up with, everything you hate was my addition.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">MinPin: yet another pin proposal</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/11/05/minpin/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/11/05/minpin/</id><published>2024-11-05T00:00:00+00:00</published><updated>2024-11-05T17:20:18+00:00</updated><content type="html"><![CDATA[<p>This post floats a variation of boats&rsquo; <a href="https://without.boats/blog/unpin-cell/">UnpinCell</a> proposal that I&rsquo;m calling <em>MinPin</em>.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> MinPin&rsquo;s goal is to integrate <code>Pin</code> into the language in a &ldquo;minimally disruptive&rdquo; way<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> &ndash; and in particular a way that is fully backwards compatible. Unlike <code>Overwrite</code>, MinPin does not attempt to make <code>Pin</code> and <code>&amp;mut</code> &ldquo;play nicely&rdquo; together. It does however leave the door open to add <code>Overwrite</code> in the future, and I think helps to clarify the positives and negatives that <code>Overwrite</code> would bring.</p>
<h2 id="tldr-key-design-decisions">TL;DR: Key design decisions</h2>
<p>Here is a brief summary of MinPin&rsquo;s rules:</p>
<ul>
<li>The <code>pinned</code> keyword can be used to get pinned variations of things:
<ul>
<li>In types, <code>pinned P</code> is equivalent to <code>Pin&lt;P&gt;</code>, so <code>pinned &amp;mut T</code> and <code>pinned Box&lt;T&gt;</code> are equivalent to <code>Pin&lt;&amp;mut T&gt;</code> and <code>Pin&lt;Box&lt;T&gt;&gt;</code> respectively.</li>
<li>In function signatures, <code>pinned &amp;mut self</code> can be used instead of <code>self: Pin&lt;&amp;mut Self&gt;</code>.</li>
<li>In expressions, <code>pinned &amp;mut $place</code> is used to get a <code>pinned &amp;mut</code> that refers to the value in <code>$place</code>.</li>
</ul>
</li>
<li>The <code>Drop</code> trait is modified to have <code>fn drop(pinned &amp;mut self)</code> instead of <code>fn drop(&amp;mut self)</code>.
<ul>
<li>However, impls of <code>Drop</code> are still permitted (even encouraged!) to use <code>fn drop(&amp;mut self)</code>, but it means that your type will not be able to use (safe) pin-projection. For many types that is not an issue; for futures or other &ldquo;address sensitive&rdquo; types, you should use <code>fn drop(pinned &amp;mut self)</code>.</li>
</ul>
</li>
<li>The rules for field projection from a <code>s: pinned &amp;mut S</code> reference are based on whether or not <code>Unpin</code> is implemented:
<ul>
<li>Projection is always allowed for fields whose type implements <code>Unpin</code>.</li>
<li>For fields whose types are not known to implement <code>Unpin</code>:
<ul>
<li>If the struct <code>S</code> is <code>Unpin</code>, <code>&amp;mut</code> projection is allowed but not <code>pinned &amp;mut</code>.</li>
<li>If the struct <code>S</code> is <code>!Unpin</code> and does not have a <code>fn drop(&amp;mut self)</code> method, <code>pinned &amp;mut</code> projection is allowed but not <code>&amp;mut</code>.</li>
<li>If the type checker does not know whether <code>S</code> is <code>Unpin</code> or not, or if the type <code>S</code> has a <code>Drop</code> impl with <code>fn drop(&amp;mut self)</code>, neither form of projection is allowed for fields that are not <code>Unpin</code>.</li>
</ul>
</li>
</ul>
</li>
<li>There is a type <code>struct Unpinnable&lt;T&gt; { value: T }</code> that always implements <code>Unpin</code>.</li>
</ul>
<h2 id="design-axioms">Design axioms</h2>
<p>Before I go further I want to lay out some of my design axioms (beliefs that motivate and justify my design).</p>
<ul>
<li><strong><code>Pin</code> is part of the Rust language.</strong> Despite Pin being entirely a &ldquo;library-based&rdquo; abstraction at present, it is very much a part of the language semantics, and it deserves first-class support. It should be possible to create pinned references and do pin projections in safe Rust.</li>
<li><strong><code>Pin</code> is its own world.</strong> Pin is only relevant in specific use cases, like futures or in-place linked lists.</li>
<li><strong><code>Pin</code> should have zero-conceptual-cost.</strong> Unless you are writing a <code>Pin</code>-using abstraction, you shouldn&rsquo;t have to know or think about pin at all.</li>
<li><strong>Explicit is possible.</strong> Automatic operations are nice but it should always be possible to write operations explicitly when needed.</li>
<li><strong>Backwards compatible.</strong> Existing code should continue to compile and work.</li>
</ul>
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<p>For the rest of the post I&rsquo;m just going to go into FAQ mode.</p>
<h3 id="i-see-the-rules-but-can-you-summarize-how-minpin-would-feel-to-use">I see the rules, but can you summarize how MinPin would feel to <em>use</em>?</h3>
<p>Yes. I think the rule of thumb would be this. For any given type, you should decide whether your type <em>cares</em> about pinning or not.</p>
<p>Most types do not care about pinning. They just go on using <code>&amp;self</code> and <code>&amp;mut self</code> as normal. Everything works as today (this is the &ldquo;zero-conceptual-cost&rdquo; goal).</p>
<p>But <em>some</em> types <em>do</em> care about pinning. These are typically future implementations but they could be other special case things. In that case, you should explicitly implement <code>!Unpin</code> to declare yourself as pinnable. When you declare your methods, you have to make a choice:</p>
<ul>
<li>Is the method read-only? Then use <code>&amp;self</code>, that always works.</li>
<li>Otherwise, use <code>&amp;mut self</code> or <code>pinned &amp;mut self</code>, depending&hellip;
<ul>
<li>If the method is meant to be called before pinning, use <code>&amp;mut self</code>.</li>
<li>If the method is meant to be called after pinning, use <code>pinned &amp;mut self</code>.</li>
</ul>
</li>
</ul>
<p>This design works well so long as all mutating methods can be categorized into before-or-after pinning. If you have methods that need to be used in both settings, you have to start using workarounds &ndash; in the limit, you make two copies.</p>
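<p>To give a feel for that split, here is a small sketch in today&rsquo;s stable Rust, where <code>self: Pin&lt;&amp;mut Self&gt;</code> plays the role of <code>pinned &amp;mut self</code>. The <code>Ticker</code> type and its methods are invented purely for illustration, and since stable Rust has no explicit <code>!Unpin</code> impls, the sketch uses <code>PhantomPinned</code> to opt out of <code>Unpin</code> instead:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

// Hypothetical address-sensitive type; `PhantomPinned` makes it `!Unpin`.
struct Ticker {
    limit: u32,
    ticks: u32,
    _pin: PhantomPinned,
}

impl Ticker {
    fn new() -&gt; Self {
        Ticker { limit: 0, ticks: 0, _pin: PhantomPinned }
    }

    // Read-only: `&amp;self` works both before and after pinning.
    fn ticks(&amp;self) -&gt; u32 {
        self.ticks
    }

    // Meant to be called before pinning, so it takes `&amp;mut self`.
    fn set_limit(&amp;mut self, limit: u32) {
        self.limit = limit;
    }

    // Meant to be called after pinning.
    // Under MinPin this would be written `fn tick(pinned &amp;mut self)`.
    fn tick(self: Pin&lt;&amp;mut Self&gt;) -&gt; bool {
        // SAFETY: we only update fields in place and never move out of `self`.
        let this = unsafe { self.get_unchecked_mut() };
        if this.ticks &lt; this.limit {
            this.ticks += 1;
            true
        } else {
            false
        }
    }
}

fn run(limit: u32) -&gt; u32 {
    let mut ticker = Ticker::new();
    ticker.set_limit(limit); // still unpinned, so `&amp;mut self` is fine
    let mut ticker = pin!(ticker); // from here on: `Pin&lt;&amp;mut Ticker&gt;`
    while ticker.as_mut().tick() {}
    ticker.ticks() // `&amp;self` methods remain callable through the pin
}
</code></pre></div><p>Once <code>ticker</code> has been pinned, a call like <code>ticker.set_limit(5)</code> no longer compiles, because <code>Pin&lt;&amp;mut Ticker&gt;</code> only hands out <code>&amp;mut Ticker</code> when <code>Ticker: Unpin</code>. That is the one-way split between the two sets of methods.</p>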
<h3 id="how-does-minpin-compare-to-unpincell">How does MinPin compare to UnpinCell?</h3>
<p>Those of you who have been following the various posts in this area will recognize many elements from boats&rsquo; recent <a href="https://without.boats/blog/unpin-cell/">UnpinCell</a>. While the proposals share many elements, there is also one key difference between them that has a big effect on how they would feel when used. Which is overall better is not yet clear to me.</p>
<p>Let&rsquo;s start with what they have in common. Both propose syntax for pinned references/borrows (albeit slightly different syntax) and both include a type for &ldquo;opting out&rdquo; from pinning (the eponymous <code>UnpinCell&lt;T&gt;</code> in <a href="https://without.boats/blog/unpin-cell/">UnpinCell</a>, <code>Unpinnable&lt;T&gt;</code> in MinPin). Both also have a similar &ldquo;special case&rdquo; around <code>Drop</code> in which writing a drop impl with <code>fn drop(&amp;mut self)</code> disables safe pin-projection.</p>
<p>Where they differ is how they manage generic structs like <code>WrapFuture&lt;F&gt;</code>, where it is not known whether or not they are <code>Unpin</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">WrapFuture</span><span class="o">&lt;</span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">future</span>: <span class="nc">F</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Given a <code>pinned &amp;mut WrapFuture&lt;F&gt;</code> reference, the question is whether we can project the field <code>future</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="n">WrapFuture</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method</span><span class="p">(</span><span class="n">pinned</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pinned</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">future</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//      -----------------------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//      Is this allowed?
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>There is a specific danger case that both sets of rules are trying to avoid. Imagine that <code>WrapFuture&lt;F&gt;</code> implements <code>Unpin</code> but <code>F</code> does not &ndash; e.g., imagine that you have an <code>impl&lt;F: Future&gt; Unpin for WrapFuture&lt;F&gt;</code>. In that case, the referent of the <code>pinned &amp;mut WrapFuture&lt;F&gt;</code> reference is not actually pinned, because the type is unpinnable. If we permitted the creation of a <code>pinned &amp;mut F</code>, where <code>F: !Unpin</code>, we would be under the (mistaken) impression that <code>F</code> is pinned. Bad.</p>
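<p>You can see why the referent &ldquo;is not actually pinned&rdquo; using nothing but safe code in today&rsquo;s Rust. The sketch below uses invented names (<code>Wrap</code>, <code>Inner</code>, <code>swap_out</code>) and <code>PhantomPinned</code> to get a <code>!Unpin</code> field; the only ingredient that matters is the manual <code>Unpin</code> impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;

// Hypothetical address-sensitive type; `PhantomPinned` makes it `!Unpin`.
struct Inner {
    id: u32,
    _pin: PhantomPinned,
}

struct Wrap&lt;T&gt; {
    inner: T,
}

// The problematic impl: `Wrap&lt;T&gt;` claims to be `Unpin` even when `T` is not.
// Writing this is allowed in safe code, because `Unpin` is a safe trait.
impl&lt;T&gt; Unpin for Wrap&lt;T&gt; {}

fn swap_out(pinned: Pin&lt;&amp;mut Wrap&lt;Inner&gt;&gt;, replacement: Inner) -&gt; Inner {
    // Because `Wrap&lt;Inner&gt;: Unpin`, safe code can get a plain `&amp;mut` back out...
    let wrap: &amp;mut Wrap&lt;Inner&gt; = Pin::get_mut(pinned);
    // ...and then move the `!Unpin` field somewhere else.
    mem::replace(&amp;mut wrap.inner, replacement)
}

fn demo() -&gt; (u32, u32) {
    let mut wrap = Wrap { inner: Inner { id: 1, _pin: PhantomPinned } };
    // `Pin::new` is only available because `Wrap&lt;Inner&gt;` is `Unpin`.
    let old = swap_out(Pin::new(&amp;mut wrap), Inner { id: 2, _pin: PhantomPinned });
    (old.id, wrap.inner.id)
}
</code></pre></div><p>Here the <code>Inner</code> value that started out inside the &ldquo;pinned&rdquo; wrapper gets moved out by entirely safe code. If pin-projection to <code>inner</code> had been permitted before that move, somebody would be holding a pinned reference to a value that later changed address.</p>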
<p><a href="https://without.boats/blog/unpin-cell/">UnpinCell</a> handles this case by saying that projecting from a <code>pinned &amp;mut</code> is only allowed so long as there is no explicit impl of <code>Unpin</code> for <code>WrapFuture</code> (&ldquo;if [WrapFuture&lt;F&gt;] implements <code>Unpin</code>, it does so using the auto-trait mechanism, not a manually written impl&rdquo;). Basically: if the user doesn&rsquo;t say whether the type is <code>Unpin</code> or not, then you can do pin-projection. The idea is that <em>if</em> the self type is <code>Unpin</code>, that will only be because all fields are unpin (in which case it is fine to make <code>pinned &amp;mut</code> references to them); <em>if</em> the self type is <em>not</em> <code>Unpin</code>, then the field <code>future</code> is pinned, so it is safe.</p>
<p>In contrast, in MinPin, this case is only allowed if there is an explicit <code>!Unpin</code> impl for <code>WrapFuture</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="o">!</span><span class="nb">Unpin</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">WrapFuture</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// This impl is required in MinPin, but not in UnpinCell
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Explicit negative impls are not allowed on stable, but they were included in the original auto trait RFC. The idea is that a negative impl is an explicit, semver-binding commitment <em>not</em> to implement a trait. This is different from simply not including an impl at all, which allows for impls to be added later.</p>
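<p>For comparison, here is roughly what the projection in question looks like in today&rsquo;s stable Rust, where it has to be written with <code>unsafe</code> (or generated by a crate like <code>pin-project</code>). This is only a sketch: <code>NoopWake</code>, <code>poll_once</code>, and <code>demo</code> are helper names made up for the example so that the wrapped future can be polled once. Both MinPin and UnpinCell aim to let the compiler check the conditions in the <code>SAFETY</code> comment instead of the programmer.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct WrapFuture&lt;F: Future&gt; {
    future: F,
}

impl&lt;F: Future&gt; Future for WrapFuture&lt;F&gt; {
    type Output = F::Output;

    fn poll(self: Pin&lt;&amp;mut Self&gt;, cx: &amp;mut Context&lt;&#39;_&gt;) -&gt; Poll&lt;F::Output&gt; {
        // SAFETY: `future` is never moved out of `self`, and `WrapFuture` has
        // neither a `Drop` impl nor a manual `Unpin` impl, so it is `Unpin`
        // only when `F` is. Under MinPin this would be `pinned &amp;mut self.future`.
        let future: Pin&lt;&amp;mut F&gt; = unsafe { self.map_unchecked_mut(|this| &amp;mut this.future) };
        future.poll(cx)
    }
}

// Hypothetical helper: a waker that does nothing, just enough to call `poll`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc&lt;Self&gt;) {}
}

// Hypothetical helper: pin a future on the stack and poll it exactly once.
fn poll_once&lt;F: Future&gt;(future: F) -&gt; Poll&lt;F::Output&gt; {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&amp;waker);
    let future = pin!(future);
    future.poll(&amp;mut cx)
}

fn demo() -&gt; Poll&lt;u32&gt; {
    poll_once(WrapFuture { future: async { 41 + 1 } })
}
</code></pre></div><p>Since the inner <code>async</code> block never awaits anything, a single poll is enough for it to complete. The <code>unsafe</code> block is the interesting part: the argument for why it is sound is exactly what the two proposals encode differently.</p>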
<h3 id="why-would-you-prefer-minpin-over-unpincell-or-vice-versa">Why would you prefer MinPin over UnpinCell or vice versa?</h3>
<p>I&rsquo;m not totally sure which of these is better. I came to the <code>!Unpin</code> impl based on my axiom that <strong>pin is its own world</strong> &ndash; the idea was that it was better to push types to be explicitly unpin all the time than to have &ldquo;dual-mode&rdquo; types that masquerade as sometimes pinned and sometimes not.</p>
<p>In general I feel like it&rsquo;s better to justify language rules by the presence of a declaration than the absence of one. So I don&rsquo;t like the idea of saying &ldquo;the absence of an <code>Unpin</code> impl allows for pin-projection&rdquo; &ndash; after all, adding impls is supposed to be semver-compliant. Of course, that&rsquo;s much less true for auto traits, but it can still be true.</p>
<p>In fact, <code>Pin</code> has <a href="https://github.com/rust-lang/rust/issues/66544">had some unsoundness</a> in the past based on unsafe reasoning that was justified by the <strong>lack</strong> of an impl. We assumed that <code>&amp;T</code> could never implement <code>DerefMut</code>, but it turned out to be possible to add weird impls of <code>DerefMut</code> in very specific cases. We fixed this by <a href="https://github.com/rust-lang/rust/pull/68004">adding an explicit <code>impl&lt;T&gt; !DerefMut for &amp;T</code> impl</a>.</p>
<p>On the other hand, I can imagine that many explicitly implemented futures might benefit from being able to be ambiguous about whether they are <code>Unpin</code>.</p>
<h3 id="what-does-your-design-axiom-pin-is-its-own-world-mean">What does your design axiom &ldquo;<code>Pin</code> is its own world&rdquo; mean?</h3>
<p>The way I see it is that, in Rust today (and in MinPin, pinned places, UnpinCell, etc), if you have a <code>T: !Unpin</code> type (that is, a type that is pinnable), it lives a double life. Initially, it is unpinned, and you can move it, <code>&amp;</code>-ref it, or <code>&amp;mut</code>-ref it, just like any other Rust value. But once a <code>!Unpin</code> value becomes pinned to a place, it enters a different state, in which you can no longer move it or use <code>&amp;mut</code>; you have to use <code>pinned &amp;mut</code> instead:</p>
<pre class="mermaid">flowchart TD
Unpinned[
    Unpinned: can access 'v' with '&amp;' and '&amp;mut'
]

Pinned[
    Pinned: can access 'v' with '&amp;' and 'pinned &amp;mut'
]

Unpinned --
    pin 'v' in place (only if T is '!Unpin')
--> Pinned
  </pre>
<p>One-way transitions like this limit the amount of interop and composability you get in the language. For example, if my type has <code>&amp;mut</code> methods, I can&rsquo;t use them once the type is pinned, and I have to use some workaround, such as duplicating the method with <code>pinned &amp;mut</code>.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> In this specific case, however, I don&rsquo;t think this transition is so painful, and that&rsquo;s because of the specifics of the domain: futures go through a pretty hard state change where they start in &ldquo;preparation mode&rdquo; and then eventually start executing. The sets of methods you need in these two phases are quite distinct. So this is what I meant by &ldquo;pin is its own world&rdquo;: pin is not very interoperable with Rust, but this is not as bad as it sounds, because you don&rsquo;t often need that kind of interoperability.</p>
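<p>To make the &ldquo;double life&rdquo; concrete, here is a minimal sketch in today&rsquo;s stable Rust, where <code>pinned &amp;mut</code> is spelled <code>Pin&lt;&amp;mut Self&gt;</code>. The <code>Counter</code> type and its methods are hypothetical names I made up for illustration; <code>PhantomPinned</code> is used because stable Rust has no <code>impl !Unpin</code>.</p>

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

/// Hypothetical address-sensitive type. `PhantomPinned` is how stable Rust
/// opts out of `Unpin` today.
struct Counter {
    count: u32,
    _pin: PhantomPinned,
}

impl Counter {
    fn new() -> Self {
        Counter { count: 0, _pin: PhantomPinned }
    }

    /// "Preparation mode" method: only callable before the value is pinned.
    fn bump(&mut self) {
        self.count += 1;
    }

    /// The duplicate you end up writing for the pinned state.
    fn bump_pinned(self: Pin<&mut Self>) {
        // SAFETY: we only update an integer field and never move out of `self`.
        let this = unsafe { self.get_unchecked_mut() };
        this.count += 1;
    }
}

fn demo_pinned_counter() -> u32 {
    let mut c = Counter::new();
    c.bump(); // unpinned: plain `&mut` works

    // Pin the value to this stack frame; `c` is moved into the pinned place.
    let mut pinned: Pin<&mut Counter> = pin!(c);

    // pinned.bump(); // does not compile: `Pin<&mut Counter>` is not `DerefMut`
    //                // because `Counter` is not `Unpin`
    pinned.as_mut().bump_pinned();
    pinned.as_mut().bump_pinned();

    pinned.count // shared access through `Deref` is still fine
}
```

<p>Calling <code>demo_pinned_counter()</code> bumps once before pinning and twice after, so the count ends at 3. The point is the commented-out line: the same logic has to exist twice, once per &ldquo;world&rdquo;.</p>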
<h3 id="how-would-overwrite-affect-pin-being-in-its-own-world">How would <code>Overwrite</code> affect pin being in its own world?</h3>
<p>With <code>Overwrite</code>, when you pin a value in place, you just gain the ability to use <code>pinned &amp;mut</code>, you don&rsquo;t give up the ability to use <code>&amp;mut</code>:</p>
<pre class="mermaid">flowchart TD
Unpinned[
    Unpinned: can access 'v' with '&amp;' and '&amp;mut'
]

Pinned[
    Pinned: can additionally access 'v' with 'pinned &amp;mut'
]

Unpinned --
    pin 'v' in place (only if T is '!Unpin')
--> Pinned
  </pre>
<p>Making pinning into a &ldquo;superset&rdquo; of the capabilities of unpinned means that <code>pinned &amp;mut</code> can be coerced into an <code>&amp;mut</code> (it could even be a &ldquo;true subtype&rdquo;, in Rust terms). This in turn means that a <code>pinned &amp;mut self</code> method can invoke <code>&amp;mut self</code> methods, which helps to make pin feel like a smoothly integrated part of the language.<sup id="fnref1:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<h3 id="so-does-the-axiom-mean-you-think-overwrite-is-a-bad-idea">So does the axiom mean you think Overwrite is a bad idea?</h3>
<p>Not exactly, but I do think that if <code>Overwrite</code> is justified, it is not on the basis of <code>Pin</code>, it is on the basis of <a href="https://smallcultfollowing.com/babysteps/blog/2024/09/26/overwrite-trait/#motivating-example-1-immutable-fields">immutable fields</a>. If you just look at <code>Pin</code>, then <code>Overwrite</code> does make <code>Pin</code> work better, but it does that by limiting the capabilities of <code>&amp;mut</code> to those that are compatible with <code>Pin</code>. There is no free lunch! As Eric Holk memorably put it to me in privmsg:</p>
<blockquote>
<p>It seems like there&rsquo;s a fixed amount of inherent complexity to pinning, but it&rsquo;s up to us how we distribute it. Pin keeps it concentrated in a small area which makes it seem absolutely terrible, because you have to face the whole horror at once.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></p>
</blockquote>
<p>I think <code>Pin</code> as designed is a &ldquo;zero-conceptual-cost&rdquo; abstraction, meaning that if you are not trying to use it, you don&rsquo;t really have to care about it. That&rsquo;s worth maintaining, if we can. If we are going to limit what <code>&amp;mut</code> can do, the reason to do it is primarily to get other benefits, not to benefit pin code specifically.</p>
<p>To be clear, this is largely a function of where we are in Rust&rsquo;s evolution. If we were still in the early days of Rust, I would say <code>Overwrite</code> is the correct call. It reminds me very much of the <a href="https://smallcultfollowing.com/babysteps/blog/2012/11/18/imagine-never-hearing-the-phrase-aliasable/">IMHTWAMA</a>, the core &ldquo;mutability xor sharing&rdquo; rule at the heart of Rust&rsquo;s borrow checker. When we decided to adopt the current borrow checker rules, the code was about 85-95% in conformance. That is, although there was plenty of aliased mutation, it was clear that &ldquo;mutability xor sharing&rdquo; was capturing a rule that we already <em>mostly</em> followed, but not completely. Because combining aliased state with memory safety is more complicated, that meant that a small minority of code was pushing complexity onto the entire language. Confining shared mutation to types like <code>Cell</code> and <code>Mutex</code> made <em>most code</em> simpler at the cost of more complexity around shared state in particular.</p>
<p>There&rsquo;s a similar dynamic around replace and swap. Replace and swap are only used in a few isolated places and in a few particular ways, but all code has to be more conservative to account for that possibility. If we could go back, I think limiting <code>Replace</code> to some kind of <code>Replaceable&lt;T&gt;</code> type would be a good move, because it would mean that the more common case can enjoy the benefits: fewer borrow check errors and more precise programs due to <a href="https://smallcultfollowing.com/babysteps/blog/2024/09/26/overwrite-trait/#motivating-example-1-immutable-fields">immutable fields</a> and the ability to pass an <code>&amp;mut SomeType</code> and be sure that your callee is not swapping the value under your feet (useful for the <a href="https://smallcultfollowing.com/babysteps/blog/2024/10/14/overwrite-and-pin/#what-does-it-mean-to-be-the-same-value">&ldquo;scope pattern&rdquo;</a> and also enables <code>Pin&lt;&amp;mut&gt;</code> to be a subtype of <code>&amp;mut</code>).</p>
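<p>As a small illustration of &ldquo;swapping the value under your feet&rdquo;, here is a sketch in today&rsquo;s Rust. The function names are hypothetical; the only real API used is <code>std::mem::replace</code>. Nothing in the callee&rsquo;s signature distinguishes &ldquo;I will mutate your value&rdquo; from &ldquo;I will take your value and leave a different one behind&rdquo;.</p>

```rust
use std::mem;

/// Hypothetical callee: the signature only says "I may mutate `log`",
/// yet today it is free to replace the whole value.
fn sneaky_callee(log: &mut Vec<String>) -> Vec<String> {
    log.push("callee was here".to_string());
    // Move the caller's vector out and leave a fresh, empty one in its place.
    mem::replace(log, Vec::new())
}

fn demo_swap() -> (usize, usize) {
    let mut log = vec!["start".to_string()];
    let stolen = sneaky_callee(&mut log);
    // The caller's `log` is now a different (empty) vector; the original
    // contents live on in `stolen`.
    (log.len(), stolen.len())
}
```

<p>Here <code>demo_swap()</code> returns <code>(0, 2)</code>: the caller is left holding an empty vector. Under a rule that confined this to something like <code>Replaceable&lt;T&gt;</code>, the caller could rule this out just by looking at the types.</p>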
<h3 id="why-did-you-adopt-pinned-mut-and-not-pin-mut-as-the-syntax">Why did you adopt <code>pinned &amp;mut</code> and not <code>&amp;pin mut</code> as the syntax?</h3>
<p>The main reason was that I wanted a syntax that scaled to <code>Pin&lt;Box&lt;T&gt;&gt;</code>. But also the <code>pin!</code> macro exists, making the <code>pin</code> keyword somewhat awkward (though not impossible).</p>
<p>One thing I was wondering about is the phrase &ldquo;pinned reference&rdquo; or &ldquo;pinned pointer&rdquo;. On the one hand, it is really a <em>reference to a pinned value</em> (which suggests <code>&amp;pin mut</code>). On the other hand, I think this kind of ambiguity is pretty common. The main thing I have found is that my brain has trouble with <code>Pin&lt;P&gt;</code> because it wants to think of <code>Pin</code> as a &ldquo;smart pointer&rdquo; versus a modifier on <em>another</em> smart pointer. <code>pinned Box&lt;T&gt;</code> feels much better this way.</p>
<h3 id="can-you-show-me-an-example-what-about-the-maybedone-example">Can you show me an example? What about the <code>MaybeDone</code> example?</h3>
<p>Yeah, totally. So boats&rsquo;s <a href="https://without.boats/blog/pinned-places/">pinned places</a> post introduced two futures, <code>MaybeDone</code> and <code>Join</code>. Here is how <code>MaybeDone</code> would look in MinPin, along with some inline comments:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">MaybeDone</span><span class="o">&lt;</span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Polling</span><span class="p">(</span><span class="n">F</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Done</span><span class="p">(</span><span class="n">Unpinnable</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">F</span>::<span class="n">Output</span><span class="o">&gt;&gt;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//   ---------- see below
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="o">!</span><span class="nb">Unpin</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MaybeDone</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//              -----------------------
</span></span></span><span class="line"><span class="cl"><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="c1">// `MaybeDone` is address-sensitive, so we
</span></span></span><span class="line"><span class="cl"><span class="c1">// opt out from `Unpin` explicitly. I assumed
</span></span></span><span class="line"><span class="cl"><span class="c1">// opting out from `Unpin` was the *default* in
</span></span></span><span class="line"><span class="cl"><span class="c1">// my other posts.
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MaybeDone</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">maybe_poll</span><span class="p">(</span><span class="n">pinned</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">MaybeDone</span>::<span class="n">Polling</span><span class="p">(</span><span class="n">fut</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//                    ---
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// This is in fact pin-projection, although
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// it&#39;s happening implicitly as part of pattern
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// matching. `fut` here has type `pinned &amp;mut F`.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// We are permitted to do this pin-projection
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// to `F` because we know that `Self: !Unpin`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// (because we declared that to be true).
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">Poll</span>::<span class="n">Ready</span><span class="p">(</span><span class="n">res</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">fut</span><span class="p">.</span><span class="n">poll</span><span class="p">(</span><span class="n">cx</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MaybeDone</span>::<span class="n">Done</span><span class="p">(</span><span class="nb">Some</span><span class="p">(</span><span class="n">res</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">is_done</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">matches!</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">MaybeDone</span>::<span class="n">Done</span><span class="p">(</span><span class="n">_</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">take_output</span><span class="p">(</span><span class="n">pinned</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">F</span>::<span class="n">Output</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//         ----------------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//     This method is called after pinning, so it
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//     needs a `pinned &amp;mut` reference...  
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">MaybeDone</span>::<span class="n">Done</span><span class="p">(</span><span class="n">res</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">res</span><span class="p">.</span><span class="n">value</span><span class="p">.</span><span class="n">take</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  ------------
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  ...but take is an `&amp;mut self` method
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  and `F::Output: Unpin` is known to be true.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  Therefore we have made the type in `Done`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  be `Unpinnable`, so that we can do this
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  swap.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="can-you-translate-the-join-example">Can you translate the <code>Join</code> example?</h3>
<p>Yep! Here is <code>Join</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Join</span><span class="o">&lt;</span><span class="n">F1</span>: <span class="nc">Future</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">fut1</span>: <span class="nc">MaybeDone</span><span class="o">&lt;</span><span class="n">F1</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">fut2</span>: <span class="nc">MaybeDone</span><span class="o">&lt;</span><span class="n">F2</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F1</span>: <span class="nc">Future</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="o">!</span><span class="nb">Unpin</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Join</span><span class="o">&lt;</span><span class="n">F1</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                           ------------------
</span></span></span><span class="line"><span class="cl"><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="c1">// Join is a custom future, so implement `!Unpin`
</span></span></span><span class="line"><span class="cl"><span class="c1">// to gain access to pin-projection.
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F1</span>: <span class="nc">Future</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Future</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Join</span><span class="o">&lt;</span><span class="n">F1</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">F1</span>::<span class="n">Output</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span>::<span class="n">Output</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">poll</span><span class="p">(</span><span class="n">pinned</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Poll</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Output</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// The calls to `maybe_poll` and `take_output` below
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// are doing pin-projection from `pinned &amp;mut self`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// to a `pinned &amp;mut MaybeDone&lt;F1&gt;` (or `F2`) type.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// This is allowed because we opted out from `Unpin`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// above.
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">fut1</span><span class="p">.</span><span class="n">maybe_poll</span><span class="p">(</span><span class="n">cx</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">fut2</span><span class="p">.</span><span class="n">maybe_poll</span><span class="p">(</span><span class="n">cx</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">fut1</span><span class="p">.</span><span class="n">is_done</span><span class="p">()</span><span class="w"> </span><span class="o">&amp;&amp;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">fut2</span><span class="p">.</span><span class="n">is_done</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">res1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">fut1</span><span class="p">.</span><span class="n">take_output</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">res2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">fut2</span><span class="p">.</span><span class="n">take_output</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">Poll</span>::<span class="n">Ready</span><span class="p">((</span><span class="n">res1</span><span class="p">,</span><span class="w"> </span><span class="n">res2</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">Poll</span>::<span class="n">Pending</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="whats-the-story-with-drop-and-why-does-it-matter">What&rsquo;s the story with <code>Drop</code> and why does it matter?</h3>
<p>Drop&rsquo;s current signature takes <code>&amp;mut self</code>. But <a href="#what-does-your-design-axiom-pin-is-its-own-world-mean">recall that</a> once a <code>!Unpin</code> type is pinned, it is only safe to use <code>pinned &amp;mut</code>. This is a combustible combination. It means that, for example, I can write a <code>Drop</code> that uses <code>mem::replace</code> or swap to move values out from my fields, even though they have been pinned.</p>
<p>For types that are always <code>Unpin</code>, this is no problem, because <code>&amp;mut self</code> and <code>pinned &amp;mut self</code> are equivalent. For types that are always <code>!Unpin</code>, I&rsquo;m not too worried, because <code>Drop</code> as it stands is a poor fit for them, and <code>pinned &amp;mut self</code> will be better.</p>
<p>The tricky bit is types that are <em>conditionally</em> <code>Unpin</code>. Consider something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">LogWrapper</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="nc">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Drop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">LogWrapper</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>At least today, whether or not <code>LogWrapper</code> is <code>Unpin</code> depends on whether <code>T: Unpin</code>, so we can&rsquo;t know it for sure.</p>
<p>The solution that boats and I both landed on effectively creates three categories of types:<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
<ul>
<li>those that implement <code>Unpin</code>, which are <em>unpinnable</em>;</li>
<li>those that do not implement <code>Unpin</code> but which have <code>fn drop(&amp;mut self)</code>, which are <em>unsafely pinnable</em>;</li>
<li>those that do not implement <code>Unpin</code> and do not have <code>fn drop(&amp;mut self)</code>, which are <em>safely pinnable</em>.</li>
</ul>
<p>The idea is that using <code>fn drop(&amp;mut self)</code> puts you in this purgatory category of being &ldquo;unsafely pinnable&rdquo; (it might be more accurate to say being &ldquo;maybe unsafely pinnable&rdquo;, since often at compilation time with generics we won&rsquo;t know if there is an <code>Unpin</code> impl or not). You don&rsquo;t get access to safe pin projection or other goodies, but you can do projection with unsafe code (e.g., the way the <code>pin-project-lite</code> crate does it today).</p>
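<p>To make the first two categories concrete in today&rsquo;s stable Rust, here is a minimal sketch. The <code>Probe</code> type and the <code>NotUnpin</code> trait are hypothetical helpers, not part of any proposal: they use an inherent-associated-const trick to report whether a particular monomorphized type implements <code>Unpin</code>. The third category cannot be shown, since it needs a <code>pinned &amp;mut self</code> drop that does not exist yet.</p>

```rust
use std::marker::{PhantomData, PhantomPinned};

// Generic wrapper with a `Drop` impl, like `LogWrapper` above.
#[allow(dead_code)]
struct LogWrapper<T> {
    value: T,
}

impl<T> Drop for LogWrapper<T> {
    fn drop(&mut self) {
        // Logging elided.
    }
}

// Hypothetical helper: `Probe<T>` exists only to ask "is `T: Unpin`?".
#[allow(dead_code)]
struct Probe<T: ?Sized>(PhantomData<T>);

// Fallback answer, available for every type through a blanket impl.
trait NotUnpin {
    const IS_UNPIN: bool = false;
}
impl<T: ?Sized> NotUnpin for T {}

// Inherent constants win over trait constants, but this impl only
// applies when `T: Unpin` actually holds.
impl<T: ?Sized + Unpin> Probe<T> {
    const IS_UNPIN: bool = true;
}

// `LogWrapper<u32>` is `Unpin` because `u32` is: the "unpinnable" category.
fn wrapper_of_u32_is_unpin() -> bool {
    <Probe<LogWrapper<u32>>>::IS_UNPIN
}

// `LogWrapper<PhantomPinned>` is not `Unpin` and has `fn drop(&mut self)`:
// the "unsafely pinnable" category.
fn wrapper_of_pinned_is_unpin() -> bool {
    <Probe<LogWrapper<PhantomPinned>>>::IS_UNPIN
}
```

<p>The same generic definition lands in different categories depending on <code>T</code>, which is why the compiler cannot settle the question when it checks the generic <code>Drop</code> impl.</p>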
<h3 id="it-feels-weird-to-have-drop-let-you-use-mut-self-when-other-traits-dont">It feels weird to have <code>Drop</code> let you use <code>&amp;mut self</code> when other traits don&rsquo;t.</h3>
<p>Yes, it does, but in fact any method whose trait uses <code>pinned &amp;mut self</code> can be <em>implemented</em> safely with <code>&amp;mut self</code> so long as <code>Self: Unpin</code>. So we could just allow that in general. This would be cool because many hand-written futures are in fact <code>Unpin</code>, and so they could implement the <code>poll</code> method with <code>&amp;mut self</code>.</p>
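<p>As a rough illustration of why this is sound, here is how a hand-written <code>Unpin</code> future can already route <code>poll</code> through a plain <code>&amp;mut self</code> method in today&rsquo;s Rust. The <code>Countdown</code> future, <code>NoopWake</code>, and <code>polls_until_ready</code> are hypothetical names made up for this sketch; the proposal would simply let you skip the forwarding step.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical hand-written future: ready once `remaining` reaches zero.
// It has no self-references, so it is automatically `Unpin`.
struct Countdown {
    remaining: u32,
}

impl Countdown {
    // The real logic only needs `&mut self`.
    fn poll_mut(&mut self, cx: &mut Context<'_>) -> Poll<&'static str> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `Pin::get_mut` is a safe call because `Countdown: Unpin`.
        self.get_mut().poll_mut(cx)
    }
}

// A waker that does nothing, just enough to build a `Context`.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls the future to completion and reports how many polls it took.
fn polls_until_ready(mut fut: Countdown) -> u32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 1;
    // `Pin::new` is also safe only because `Countdown: Unpin`.
    while Pin::new(&mut fut).poll(&mut cx).is_pending() {
        polls += 1;
    }
    polls
}
```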
<h3 id="wait-so-if-unpin-types-can-use-mut-self-why-do-we-need-special-rules-for-drop">Wait, so if <code>Unpin</code> types can use <code>&amp;mut self</code>, why do we need special rules for <code>Drop</code>?</h3>
<p>Well, it&rsquo;s true that an <code>Unpin</code> type can use <code>&amp;mut self</code> in place of <code>pinned &amp;mut self</code>, but in fact we don&rsquo;t always <em>know</em> when types are <code>Unpin</code>. Moreover, per the zero-conceptual-cost axiom, we don&rsquo;t want people to have to know anything about <code>Pin</code> to use <code>Drop</code>. The obvious approaches I could think of all either violated that axiom or just&hellip; well&hellip; seemed weird:</p>
<ul>
<li>Permit <code>fn drop(&amp;mut self)</code> but only if <code>Self: Unpin</code> seems like it would work, since most types are <code>Unpin</code>. But in fact types, by default, are only <code>Unpin</code> if their fields are <code>Unpin</code>, and so generic types are not <em>known</em> to be <code>Unpin</code>. This means that if you write a <code>Drop</code> impl for a generic type and you use <code>fn drop(&amp;mut self)</code>, you will get an error that can only be fixed by implementing <code>Unpin</code> unconditionally. Because &ldquo;pin is its own world&rdquo;, I believe adding the impl is fine, but it violates &ldquo;zero-conceptual-cost&rdquo; because it means that you are forced to understand what <code>Unpin</code> even means in the first place.</li>
<li>To address that, I considered treating <code>fn drop(&amp;mut self)</code> as implicitly declaring <code>Self: Unpin</code>. This doesn&rsquo;t violate our axioms but just seems <em>weird</em> and kind of surprising. It&rsquo;s also backwards incompatible with pin-project-lite.</li>
</ul>
<p>These considerations led me to conclude that the current design kind of puts us in a place where we want three categories. I think in retrospect it&rsquo;d be better if <code>Unpin</code> were implemented by default but not as an auto trait (i.e., all types were unconditionally <code>Unpin</code> unless they declare otherwise), but oh well.</p>
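<p>For reference, this is roughly what the &ldquo;implement <code>Unpin</code> unconditionally&rdquo; fix from the first bullet looks like in today&rsquo;s Rust. It is a sketch reusing the <code>LogWrapper</code> shape from earlier; the <code>value_mut</code> helper is a hypothetical name added only to show what the impl buys you.</p>

```rust
use std::pin::Pin;

struct LogWrapper<T> {
    value: T,
}

// The unconditional impl: `LogWrapper<T>` is `Unpin` even when `T` is not.
// This is safe to write, but it is a promise that `value` will never be
// treated as pinned through a `Pin<&mut LogWrapper<T>>`.
impl<T> Unpin for LogWrapper<T> {}

impl<T> Drop for LogWrapper<T> {
    fn drop(&mut self) {
        // Logging elided.
    }
}

// Hypothetical helper: compiles for every `T` only because of the impl
// above, since `Pin::get_mut` requires `LogWrapper<T>: Unpin`.
fn value_mut<T>(wrapper: Pin<&mut LogWrapper<T>>) -> &mut T {
    &mut wrapper.get_mut().value
}
```

<p>Nothing here is hard to write, but you only know to write it if you already understand what <code>Unpin</code> means, which is exactly the conceptual cost the axiom is trying to avoid.</p>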
<h3 id="what-is-the-forwards-compatibility-story-for-overwrite">What is the forwards compatibility story for <code>Overwrite</code>?</h3>
<p>I mentioned early on that MinPin could be seen as a first step that can later be extended with <code>Overwrite</code> if we choose. How would that work?</p>
<p>Basically, if we did the <code>s/Unpin/Overwrite/</code> change, then we would</p>
<ul>
<li>rename <code>Unpin</code> to <code>Overwrite</code> (literally rename, they would be the same trait);</li>
<li>prevent overwriting the referent of an <code>&amp;mut T</code> unless <code>T: Overwrite</code> (or replacing, swapping, etc).</li>
</ul>
<p>These changes mean that <code>&amp;mut T</code> is pin-preserving. If <code>T: !Overwrite</code>, then <code>T</code> may be pinned, but then <code>&amp;mut T</code> won&rsquo;t allow it to be overwritten, replaced, or swapped, and so pinning guarantees are preserved (and then some, since technically overwrites are ok, just not replacing or swapping). As a result, we can simplify the MinPin rules for pin-projection to the following:</p>
<blockquote>
<p>Given a reference <code>s: pinned &amp;mut S</code>, the rules for projection of the field <code>f</code> are as follows:</p>
<ul>
<li><code>&amp;mut</code> projection is allowed via <code>&amp;mut s.f</code>.</li>
<li><code>pinned &amp;mut</code> projection is allowed via <code>pinned &amp;mut s.f</code> if <code>S: !Unpin</code></li>
</ul>
</blockquote>
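<p>For comparison, here is a sketch of what those two projections cost in today&rsquo;s Rust, where both need <code>unsafe</code>. The types <code>S</code> and <code>Inner</code> and the methods <code>project_f</code>, <code>project_count</code>, and <code>tick</code> are hypothetical names for illustration; under the rules above, both projections would become safe expressions.</p>

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// A field type that must stay pinned once it has been pinned.
struct Inner {
    ticks: u32,
    _pin: PhantomPinned,
}

// Because `Inner` is `!Unpin`, `S` is `!Unpin` as well.
struct S {
    f: Inner,
    count: u32,
}

impl S {
    fn new() -> Self {
        S { f: Inner { ticks: 0, _pin: PhantomPinned }, count: 0 }
    }

    // Today's equivalent of `pinned &mut s.f`.
    fn project_f(self: Pin<&mut Self>) -> Pin<&mut Inner> {
        // SAFETY: `f` is never moved out of `S` while `S` is pinned.
        unsafe { self.map_unchecked_mut(|s| &mut s.f) }
    }

    // Today's equivalent of a plain `&mut` projection to an ordinary field.
    fn project_count(self: Pin<&mut Self>) -> &mut u32 {
        // SAFETY: `count` is never treated as pinned, and `S` is not moved.
        unsafe { &mut self.get_unchecked_mut().count }
    }
}

impl Inner {
    fn tick(self: Pin<&mut Self>) -> u32 {
        // SAFETY: a counter is updated in place; nothing is moved.
        let this = unsafe { self.get_unchecked_mut() };
        this.ticks += 1;
        this.ticks
    }
}
```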
<h3 id="what-would-it-feel-like-if-we-adopted-overwrite">What would it <em>feel</em> like if we adopted <code>Overwrite</code>?</h3>
<p>We actually got a bit of a preview when we talked about <code>MaybeDone</code>. Remember how we had to introduce <code>Unpinnable</code> around the final value so that we could swap it out? If we adopted <code>Overwrite</code>, I think the TL;DR of how code would be different is that most any code that today uses <code>std::mem::replace</code> or <code>std::mem::swap</code> would probably wind up using an explicit <code>Unpinnable</code>-like wrapper. I&rsquo;ll cover this later.</p>
<p>This goes a bit to show what I meant about there being a certain amount of inherent complexity that we can choose to distribute: in MinPin, this pattern of wrapping &ldquo;swappable&rdquo; data is isolated to <code>pinned &amp;mut self</code> methods in <code>!Unpin</code> types. With <code>Overwrite</code>, it would be more widespread (but you would get more widespread benefits, as well).</p>
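<p>To give a feel for that wrapping pattern, here is a rough model in today&rsquo;s Rust. Everything in it is an assumption made for illustration: the <code>Overwrite</code> trait is an ordinary marker trait standing in for the proposed one, <code>replace_checked</code> models how <code>std::mem::replace</code> might be bounded, and <code>Job</code> is a made-up type.</p>

```rust
// Hypothetical stand-in for the proposed `Overwrite` trait.
trait Overwrite {}

// Hypothetical wrapper in the spirit of `Unpinnable`: its contents may
// always be replaced or swapped.
struct Unpinnable<T>(T);
impl<T> Overwrite for Unpinnable<T> {}

// Models `std::mem::replace` under the proposal: only `T: Overwrite`
// values may be replaced through a `&mut T`.
fn replace_checked<T: Overwrite>(dest: &mut T, value: T) -> T {
    std::mem::replace(dest, value)
}

// `Job` does not implement `Overwrite`, so a `&mut Job` cannot be used to
// replace the whole job, only the field that is explicitly wrapped.
struct Job {
    name: String,
    output: Unpinnable<Option<u32>>,
}

// Takes the output and leaves `None` behind, as `Option::take` would.
fn take_output(job: &mut Job) -> Option<u32> {
    replace_checked(&mut job.output, Unpinnable(None)).0
}
```

<p>In this model, <code>replace_checked(&amp;mut job, other_job)</code> would be rejected at compile time because <code>Job</code> does not implement <code>Overwrite</code>, while the wrapped <code>output</code> field stays freely swappable.</p>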
<h2 id="conclusion">Conclusion</h2>
<p>My conclusion is that this is a fascinating space to think about!<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> So fun.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Hat tip to Tyler Mandry and Eric Holk who discussed these ideas with me in detail.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>MinPin is the &ldquo;minimal&rdquo; proposal that I feel meets my desiderata; I think you could devise a <em>maximally minimal</em> proposal that is even smaller if you truly wanted.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>It&rsquo;s worth noting, though, that coercions and subtyping only go so far. For example, <code>&amp;mut</code> can be coerced to <code>&amp;</code>, but we often need methods that return &ldquo;the same kind of reference they took in&rdquo;, which can&rsquo;t be managed with coercions. That&rsquo;s why you see things like <code>last</code> and <code>last_mut</code>.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I would say that the current complexity of pinning is, in no small part, due to <em>accidental complexity</em>, as demonstrated by the recent round of exploration, but Eric&rsquo;s wider point stands.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Here I am talking about the category of a particular monomorphized type in a particular version of the crate. At that point, every type either implements <code>Unpin</code> or it doesn&rsquo;t. Note that at <em>compilation time</em> there is more grey area, as there can be types that may or may not be pinnable, etc.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Also that I spent way too much time iterating on this post. JUST GONNA POST IT.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/overwrite-trait" term="overwrite-trait" label="Overwrite trait"/></entry><entry><title type="html">The `Overwrite` trait and `Pin`</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/10/14/overwrite-and-pin/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/10/14/overwrite-and-pin/</id><published>2024-10-14T00:00:00+00:00</published><updated>2024-10-14T15:12:38+00:00</updated><content type="html"><![CDATA[<p>In July, boats presented a compelling vision in their post <a href="https://without.boats/blog/pinned-places/">pinned places</a>. With the <code>Overwrite</code> trait that I introduced in my previous post, however, I think we can get somewhere even <em>more</em> compelling, albeit at the cost of a tricky transition. As I will argue in this post, the <code>Overwrite</code> trait effectively becomes a better version of the existing <code>Unpin</code> trait, one that affects not only pinned references but also regular <code>&amp;mut</code> references. Through this it&rsquo;s able to make <code>Pin</code> fit much more seamlessly with the rest of Rust.</p>
<h2 id="just-show-me-the-dang-code">Just show me the dang code</h2>
<p>Before I dive into the details, let&rsquo;s start by reviewing a few examples to show you what we are aiming at (you can also skip to the <a href="#sotheres-a-lot-here-whats-the-key-takeaways">TL;DR</a>, in the FAQ).</p>
<p>I&rsquo;m assuming a few changes here:</p>
<ul>
<li>Adding an <code>Overwrite</code> trait and changing most types to be <code>!Overwrite</code> by default.
<ul>
<li>The <code>Option&lt;T&gt;</code> (and maybe others) would opt-in to <code>Overwrite</code>, permitting <code>x.take()</code>.</li>
</ul>
</li>
<li>Integrating pin into the borrow checker, extending auto-ref to also &ldquo;auto-pin&rdquo; and produce a <code>Pin&lt;&amp;mut T&gt;</code>. The borrow checker only permits you to pin values that you own. Once a place has been pinned, you are not permitted to move out from it anymore (unless the value is overwritten).</li>
</ul>
<p>The first change is &ldquo;mildly&rdquo; backwards incompatible. I&rsquo;m not going to worry about that in this post, but I&rsquo;ll cover the ways I think we can make the transition in a follow up post.</p>
<a name="example-1" />
<h3 id="example-1-converting-a-generator-into-an-iterator">Example 1: Converting a generator into an iterator</h3>
<p>We would really like to add a <em>generator</em> syntax that lets you write an iterator more conveniently.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> For example, we should be able to define a generator that computes some input strings and then yields the hash of each one, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">do_computation</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">hashes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">gen</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">strings</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">compute_input_strings</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="n">strings</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kr">yield</span><span class="w"> </span><span class="n">compute_hash</span><span class="p">(</span><span class="o">&amp;</span><span class="n">string</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But there is a catch here! To permit the borrow of <code>strings</code>, which is owned by the generator, the generator will have to be pinned.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> That means that generators cannot directly implement <code>Iterator</code>, because generators need a <code>Pin&lt;&amp;mut Self&gt;</code> signature for their <code>next</code> methods. It <em>is</em> possible, however, to implement <code>Iterator</code> for <code>Pin&lt;&amp;mut G&gt;</code> where <code>G</code> is a generator.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
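<p>Here is a minimal sketch of that idea in today&rsquo;s stable Rust, under some loud assumptions: <code>PinnedGen</code> is a hypothetical stand-in for whatever trait generators would implement, <code>Countdown</code> is a hand-written type playing the role of a compiler-generated generator, and <code>PinnedIter</code> is a local newtype used because a crate outside the standard library cannot write the generic impl for <code>Pin&lt;&amp;mut G&gt;</code> itself.</p>

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical stand-in for a generator trait whose `next` needs pinning.
trait PinnedGen {
    type Item;
    fn pinned_next(self: Pin<&mut Self>) -> Option<Self::Item>;
}

// Hand-written "generator" that is deliberately `!Unpin`.
struct Countdown {
    remaining: u32,
    _pin: PhantomPinned,
}

impl Countdown {
    fn new(start: u32) -> Self {
        Countdown { remaining: start, _pin: PhantomPinned }
    }
}

impl PinnedGen for Countdown {
    type Item = u32;

    fn pinned_next(self: Pin<&mut Self>) -> Option<u32> {
        // SAFETY: the counter is updated in place; nothing is moved.
        let this = unsafe { self.get_unchecked_mut() };
        if this.remaining == 0 {
            return None;
        }
        let current = this.remaining;
        this.remaining -= 1;
        Some(current)
    }
}

// Local newtype standing in for `impl Iterator for Pin<&mut G>`.
struct PinnedIter<'a, G>(Pin<&'a mut G>);

impl<G: PinnedGen> Iterator for PinnedIter<'_, G> {
    type Item = G::Item;

    fn next(&mut self) -> Option<G::Item> {
        // Reborrow the pinned reference for each call.
        self.0.as_mut().pinned_next()
    }
}
```

<p>The iterator only needs <code>&amp;mut self</code> access to the <em>pinned reference</em>, not to the generator itself, which is why the ordinary <code>Iterator</code> signature is enough once the generator has been pinned.</p>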
<p>In today&rsquo;s Rust, that means that using a generator as an iterator would require explicit pinning:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">do_computation</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">hashes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">gen</span><span class="w"> </span><span class="p">{</span><span class="o">....</span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">hashes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">pin!</span><span class="p">(</span><span class="n">hashes</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;-- explicit pin
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">h</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">hashes</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// process first hash
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>With <a href="https://without.boats/blog/pinned-places/">pinned places</a>, this feels more builtin, but it still requires users to actively think about pinning for even the most basic use case:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">do_computation</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">hashes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">gen</span><span class="w"> </span><span class="p">{</span><span class="o">....</span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">pinned</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">hashes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">hashes</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">h</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">hashes</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// process first hash
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under this proposal, users would simply be able to ignore pinning altogether:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">do_computation</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">hashes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">gen</span><span class="w"> </span><span class="p">{</span><span class="o">....</span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">h</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">hashes</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// process first hash
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Pinning is still happening: once a user has called <code>next</code>, they would not be able to move <code>hashes</code> after that point. If they tried to do so, the borrow checker (which now understands pinning natively) would give an error like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">error</span><span class="p">[</span><span class="n">E0596</span><span class="p">]</span>: <span class="nc">cannot</span><span class="w"> </span><span class="n">borrow</span><span class="w"> </span><span class="err">`</span><span class="n">hashes</span><span class="err">`</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">mutable</span><span class="p">,</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">it</span><span class="w"> </span><span class="n">is</span><span class="w"> </span><span class="n">not</span><span class="w"> </span><span class="n">declared</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">mutable</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w"> </span><span class="o">-</span>-&gt; <span class="nc">src</span><span class="o">/</span><span class="n">lib</span><span class="p">.</span><span class="n">rs</span>:<span class="mi">4</span>:<span class="mi">22</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">|</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="mi">4</span><span class="w"> </span><span class="o">|</span><span class="w">     </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">h</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">hashes</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">|</span><span class="w">                      </span><span class="o">------</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="err">`</span><span class="n">hashes</span><span class="err">`</span><span class="w"> </span><span class="n">was</span><span class="w"> </span><span class="n">pinned</span><span class="w"> </span><span class="n">here</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">|</span><span class="w">     </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="mi">7</span><span class="w"> </span><span class="o">|</span><span class="w">     </span><span class="n">move_somewhere_else</span><span class="p">(</span><span class="n">hashes</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">|</span><span class="w">                         </span><span class="o">^^^^^^</span><span class="w"> </span><span class="n">cannot</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">pinned</span><span class="w"> </span><span class="n">value</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">help</span>: <span class="nc">if</span><span class="w"> </span><span class="n">you</span><span class="w"> </span><span class="n">want</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="err">`</span><span class="n">hashes</span><span class="err">`</span><span class="p">,</span><span class="w"> </span><span class="n">consider</span><span class="w"> </span><span class="n">using</span><span class="w"> </span><span class="err">`</span><span class="nb">Box</span>::<span class="n">pin</span><span class="err">`</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="n">allocate</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="n">pinned</span><span class="w"> </span><span class="k">box</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">|</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="mi">3</span><span class="w"> </span><span class="o">|</span><span class="w">     </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">hashes</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">pin</span><span class="p">(</span><span class="n">gen</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">....</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">|</span><span class="w">                      </span><span class="o">+++++++++</span><span class="w">            </span><span class="o">+</span><span class="w">
</span></span></span></code></pre></div><p>As noted, it is possible to move <code>hashes</code> after pinning, but only if you pin it into a heap-allocated box. So we can advise users how to do that.</p>
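<p>The <code>Box::pin</code> advice already works this way in today&rsquo;s Rust. Stable Rust has no <code>gen</code> blocks, so this sketch uses a hand-written <code>!Unpin</code> future as a stand-in; <code>TwoStep</code>, <code>NoopWake</code>, and <code>poll_then_move</code> are hypothetical names, and <code>move_somewhere_else</code> just mirrors the function named in the error message above.</p>

```rust
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A `!Unpin` future that is pending on the first poll and ready on the second.
struct TwoStep {
    polled_once: bool,
    _pin: PhantomPinned,
}

impl Future for TwoStep {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // SAFETY: a flag is flipped in place; nothing is moved.
        let this = unsafe { self.get_unchecked_mut() };
        if this.polled_once {
            Poll::Ready(42)
        } else {
            this.polled_once = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

// A waker that does nothing, just enough to build a `Context`.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Takes ownership of the pinned box and finishes polling it.
fn move_somewhere_else(mut fut: Pin<Box<TwoStep>>, cx: &mut Context<'_>) -> u32 {
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(cx) {
            return value;
        }
    }
}

fn poll_then_move() -> u32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(TwoStep { polled_once: false, _pin: PhantomPinned });
    // The first poll happens here, so the value is now pinned on the heap.
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    // Moving the `Pin<Box<_>>` moves the pointer, not the pinned value.
    move_somewhere_else(fut, &mut cx)
}
```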
<a name="example-2" />
<h3 id="example-2-implementing-the-maybedone-future">Example 2: Implementing the <code>MaybeDone</code> future</h3>
<p>The <a href="https://without.boats/blog/pinned-places/">pinned places</a> post included an example future called <code>MaybeDone</code>. I&rsquo;m going to implement that same future in the system I describe here. There are some comments in the example comparing it to the <a href="https://without.boats/blog/pinned-places/#bringing-it-together">version from the pinned places post</a>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">MaybeDone</span><span class="o">&lt;</span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         ---------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         I&#39;m assuming we are in Rust.Next, and so the default
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         bounds for `F` do not include `Overwrite`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         In other words, `F: ?Overwrite` is the default
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         (just as it is with every other trait besides `Sized`).
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Polling</span><span class="p">(</span><span class="n">F</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      -
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      We don&#39;t need to declare `pinned F`.
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Done</span><span class="p">(</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">F</span>::<span class="n">Output</span><span class="o">&gt;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MaybeDone</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">maybe_poll</span><span class="p">(</span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//        --------------------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//        I&#39;m not bothering with the `&amp;pinned mut self`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//        sugar here, though certainly we could still
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//        add it.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">MaybeDone</span>::<span class="n">Polling</span><span class="p">(</span><span class="n">fut</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//                    ---
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//       Just as in the original example,
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//       we are able to project from `Pin&lt;&amp;mut Self&gt;`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//       to a `Pin&lt;&amp;mut F&gt;`.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//       The key is that we can safely project
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//       from an owner of type `Pin&lt;&amp;mut Self&gt;`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//       to its field of type `Pin&lt;&amp;mut F&gt;`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//       so long as the owner type `Self: !Overwrite`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//       (which is the default for structs in Rust.Next).
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">Poll</span>::<span class="n">Ready</span><span class="p">(</span><span class="n">res</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">fut</span><span class="p">.</span><span class="n">poll</span><span class="p">(</span><span class="n">cx</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MaybeDone</span>::<span class="n">Done</span><span class="p">(</span><span class="nb">Some</span><span class="p">(</span><span class="n">res</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">is_done</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">matches!</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">MaybeDone</span>::<span class="n">Done</span><span class="p">(</span><span class="n">_</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">take_output</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">F</span>::<span class="n">Output</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//         ---------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//   In pinned places, this method had to be
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//   `&amp;pinned mut self`, but under this design,
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//   it can be a regular `&amp;mut self`.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//   
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//   That&#39;s because `Pin&lt;&amp;mut Self&gt;` becomes
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//   a subtype of `&amp;mut Self`.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">MaybeDone</span>::<span class="n">Done</span><span class="p">(</span><span class="n">res</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">res</span><span class="p">.</span><span class="n">take</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><a name="example-3" />
<h3 id="example-3-implementing-the-join-combinator">Example 3: Implementing the <code>Join</code> combinator</h3>
<p>Let&rsquo;s complete the journey by implementing a <code>Join</code> future:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Join</span><span class="o">&lt;</span><span class="n">F1</span>: <span class="nc">Future</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span>: <span class="nc">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// These fields do not have to be declared `pinned`:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">fut1</span>: <span class="nc">MaybeDone</span><span class="o">&lt;</span><span class="n">F1</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">fut2</span>: <span class="nc">MaybeDone</span><span class="o">&lt;</span><span class="n">F2</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">F1</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Future</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Join</span><span class="o">&lt;</span><span class="n">F1</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F1</span>: <span class="nc">Future</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F2</span>: <span class="nc">Future</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">F1</span>::<span class="n">Output</span><span class="p">,</span><span class="w"> </span><span class="n">F2</span>::<span class="n">Output</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">poll</span><span class="p">(</span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Poll</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Output</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//  --------------------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Again, I&#39;ve dropped the sugar here.
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// This looks just the same as in the
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// &#34;Pinned Places&#34; example. This again
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// leans on the ability to project
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// from a `Pin&lt;&amp;mut Self&gt;` owner so long as
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// `Self: !Overwrite` (the default for structs
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// in Rust.Next).
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">fut1</span><span class="p">.</span><span class="n">maybe_poll</span><span class="p">(</span><span class="n">cx</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">fut2</span><span class="p">.</span><span class="n">maybe_poll</span><span class="p">(</span><span class="n">cx</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">fut1</span><span class="p">.</span><span class="n">is_done</span><span class="p">()</span><span class="w"> </span><span class="o">&amp;&amp;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">fut2</span><span class="p">.</span><span class="n">is_done</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// This code looks the same as it did with pinned places,
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// but there is an important difference. `take_output`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// is now an `&amp;mut self` method, not a `Pin&lt;&amp;mut Self&gt;`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// method. This demonstrates that we can also get
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// a regular `&amp;mut` reference to our fields.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">res1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">fut1</span><span class="p">.</span><span class="n">take_output</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">res2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">fut2</span><span class="p">.</span><span class="n">take_output</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">Poll</span>::<span class="n">Ready</span><span class="p">((</span><span class="n">res1</span><span class="p">,</span><span class="w"> </span><span class="n">res2</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">Poll</span>::<span class="n">Pending</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="how-i-think-about-pin">How I think about pin</h2>
<p>OK, now that I&rsquo;ve lured you in with code examples, let me drive you away by diving into the details of <code>Pin</code>. I&rsquo;m going to cover the way that I think about <code>Pin</code>. It is similar to but different from how <code>Pin</code> is presented in the <a href="https://without.boats/blog/pinned-places/">pinned places</a> post &ndash; in particular, I prefer to think about <em>places that pin their values</em> and not <em>pinned places</em>. In any case, <code>Pin</code> is surprisingly subtle, and I recommend that if you want to go deeper, you read boat&rsquo;s <a href="https://without.boats/blog/pin/">history of <code>Pin</code> post</a> and/or <a href="https://doc.rust-lang.org/std/pin/struct.Pin.html">the stdlib documentation for <code>Pin</code></a>.</p>
<h3 id="the-pinp--type-is-a-modifier-on-the-pointer-p">The <code>Pin&lt;P&gt;</code>  type is a modifier on the pointer <code>P</code></h3>
<p>The <code>Pin&lt;P&gt;</code> type is unusual in Rust. It <strong>looks</strong> similar to a &ldquo;smart pointer&rdquo; type, like <code>Arc&lt;T&gt;</code>, but it functions differently. <code>Pin&lt;P&gt;</code> is not a pointer, it is a <strong>modifier</strong> on another pointer, so</p>
<ul>
<li>a <code>Pin&lt;&amp;T&gt;</code> represents a <strong>pinned reference,</strong></li>
<li>a <code>Pin&lt;&amp;mut T&gt;</code> represents a <strong>pinned mutable reference,</strong></li>
<li>a <code>Pin&lt;Box&lt;T&gt;&gt;</code> represents a <strong>pinned box,</strong></li>
</ul>
<p>and so forth.</p>
<p>You can think of a <code>Pin&lt;P&gt;</code> type as being a pointer of type <code>P</code> that refers to a <strong>place</strong> (Rust jargon for a location in memory that stores a value) <strong>whose value <code>v</code> has been pinned</strong>. A pinned value <code>v</code> can never be moved to another place in memory. Moreover, <code>v</code> must be dropped before its place can be reassigned to another value.</p>
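<p>To make the &ldquo;modifier on a pointer&rdquo; idea concrete, here is a minimal sketch in today&rsquo;s Rust that builds each of the three flavors listed above and reads through them with one generic helper. The names <code>read_through</code> and <code>pinned_pointer_kinds</code> are made up for illustration; <code>Box::pin</code>, <code>pin!</code>, and <code>Pin::as_ref</code> are the existing standard library APIs.</p>

```rust
use std::ops::Deref;
use std::pin::{pin, Pin};

/// Reads through any pinned pointer. `Pin<P>` derefs to the same
/// target as the underlying pointer `P`, whatever kind of pointer it is.
fn read_through<P>(p: &Pin<P>) -> u32
where
    P: Deref<Target = u32>,
{
    **p
}

fn pinned_pointer_kinds() -> (u32, u32, u32) {
    // A pinned box: `Box::pin` allocates and pins in one step.
    let boxed: Pin<Box<u32>> = Box::pin(1);

    // A pinned mutable reference to a local temporary, via `pin!`.
    let local: Pin<&mut u32> = pin!(2);

    // A pinned shared reference, reborrowed from the pinned box.
    let shared: Pin<&u32> = boxed.as_ref();

    (
        read_through(&shared),
        read_through(&local),
        read_through(&boxed),
    )
}
```

<p>Note that <code>u32: Unpin</code>, so nothing is really &ldquo;stuck&rdquo; in place here; the sketch only shows that <code>Pin</code> wraps whatever pointer type you give it.</p>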
<h3 id="pinning-is-part-of-the-lifecycle-of-a-place">Pinning is part of the &ldquo;lifecycle&rdquo; of a place</h3>
<p>The way I think about it, every place in memory has a lifecycle:</p>
<pre class="mermaid">flowchart TD
Uninitialized 
Initialized
Pinned

Uninitialized --
    p = v where v: T
--> Initialized

Initialized -- 
    move out, drop, or forget
--> Uninitialized

Initialized --
    pin value v in p
    (only possible when T is !Unpin)
--> Pinned

Pinned --
    drop value
--> Uninitialized

Pinned --
    move out or forget
--> UB

Uninitialized --
    free the place
--> Freed

UB[💥 Undefined behavior 💥]
  </pre>
<p>When first allocated, a place <code>p</code> is <strong>uninitialized</strong> &ndash; that is, <code>p</code> has no value at all.</p>
<p>An uninitialized place can be <strong>freed</strong>. This corresponds to e.g. popping a stack frame or invoking <code>free</code>.</p>
<p><code>p</code> may at some point become <strong>initialized</strong> by an assignment like <code>p = v</code>. At that point, there are three ways to transition back to uninitialized:</p>
<ul>
<li>The value <code>v</code> could be moved somewhere else, e.g. with <code>let p2 = p</code>. At that point, <code>p</code> goes back to being uninitialized.</li>
<li>The value <code>v</code> can be <em>forgotten</em>, with <code>std::mem::forget(p)</code>. At this point, no destructor runs, but <code>p</code> goes back to being considered uninitialized.</li>
<li>The value <code>v</code> can be <em>dropped</em>, which occurs when the place <code>p</code> goes out of scope. At this point, the destructor runs, and <code>p</code> goes back to being considered uninitialized.</li>
</ul>
<p>Alternatively, the value <code>v</code> can be <strong>pinned in place</strong>:</p>
<ul>
<li>At this point, <code>v</code> cannot be moved again, and the only way for <code>p</code> to be reused is for <code>v</code> to be dropped.</li>
</ul>
<p>Once a value is pinned, moving or forgetting the value is <strong>not</strong> allowed. These actions are &ldquo;undefined behavior&rdquo;, and safe Rust must not permit them to occur.</p>
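<p>The transitions in the diagram can be observed in today&rsquo;s Rust with a type whose destructor bumps a counter. This is only a sketch: <code>Noisy</code> and the two functions are hypothetical names, and <code>PhantomPinned</code> is used just to make the type <code>!Unpin</code> so that pinning it is meaningful.</p>

```rust
use std::cell::Cell;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::rc::Rc;

/// A value whose destructor bumps a shared counter, so drops are observable.
struct Noisy {
    drops: Rc<Cell<u32>>,
    _pin: PhantomPinned, // opts out of `Unpin`
}

impl Noisy {
    fn new(drops: &Rc<Cell<u32>>) -> Self {
        Noisy { drops: Rc::clone(drops), _pin: PhantomPinned }
    }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// The three ways back from "initialized" to "uninitialized".
/// Returns the drop count observed after each one.
fn leave_initialized_state() -> [u32; 3] {
    let drops = Rc::new(Cell::new(0));

    // Move out: `p` is uninitialized again, the value lives on in `p2`.
    let p = Noisy::new(&drops);
    let p2 = p;
    let after_move = drops.get();

    // Forget: the place is uninitialized again, but no destructor runs.
    std::mem::forget(p2);
    let after_forget = drops.get();

    // Drop: the place goes out of scope and the destructor runs.
    {
        let _p3 = Noisy::new(&drops);
    }
    let after_drop = drops.get();

    [after_move, after_forget, after_drop]
}

/// The "pinned" path: once pinned, the only safe way out is a drop.
fn pin_then_drop() -> u32 {
    let drops = Rc::new(Cell::new(0));
    let pinned: Pin<Box<Noisy>> = Box::pin(Noisy::new(&drops));
    drop(pinned); // runs the destructor in place, then frees the box
    drops.get()
}
```

<p>Only the last transition in each function runs the destructor: <code>leave_initialized_state()</code> yields <code>[0, 0, 1]</code> and <code>pin_then_drop()</code> yields <code>1</code>.</p>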
<h4 id="a-digression-on-forgetting-vs-other-ways-to-leak">A digression on forgetting vs other ways to leak</h4>
<p>As most folks know, Rust does not guarantee that destructors run. If you have a value <code>v</code> whose destructor never runs, we say that value is <em>leaked</em>. There are however two ways to leak a value, and they are quite different in their impact:</p>
<ul>
<li>Option A: Forgetting. Using <code>std::mem::forget</code>, you can <em>forget</em> the value <code>v</code>. The place <code>p</code> that was storing that value will go from <em>initialized</em> to <em>uninitialized</em>, at which point the place <code>p</code> can be freed.
<ul>
<li>Forgetting a value is <strong>undefined behavior</strong> if that value has been pinned, however!</li>
</ul>
</li>
<li>Option B: Leak the place. When you leak a place, it just stays in the initialized or pinned state forever, so its value is never dropped. This can happen, for example, with a ref-count cycle.
<ul>
<li>This is safe even if the value is pinned!</li>
</ul>
</li>
</ul>
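<p>Here is a small sketch of the two options side by side. In both cases the destructor never runs, but only Option A returns the place to the uninitialized state. <code>Node</code>, <code>forget_the_value</code>, and <code>leak_the_place</code> are illustrative names, not anything from the standard library.</p>

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// A node that counts its own drops and can point at another node.
struct Node {
    drops: Rc<Cell<u32>>,
    next: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Option A: forget the value. The stack slot that held `node` is
/// considered uninitialized afterwards and is freed on return.
fn forget_the_value() -> u32 {
    let drops = Rc::new(Cell::new(0));
    let node = Node { drops: Rc::clone(&drops), next: RefCell::new(None) };
    std::mem::forget(node);
    drops.get()
}

/// Option B: leak the place. The heap allocation stays initialized
/// forever because the ref-count cycle keeps the count above zero.
fn leak_the_place() -> u32 {
    let drops = Rc::new(Cell::new(0));
    let node = Rc::new(Node { drops: Rc::clone(&drops), next: RefCell::new(None) });
    *node.next.borrow_mut() = Some(Rc::clone(&node)); // node points at itself
    drop(node); // drops our handle; the cycle still holds one
    drops.get()
}
```

<p>Both functions return <code>0</code>, since neither destructor ever runs.</p>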
<p>In retrospect, I wish that Option A did not exist &ndash; I wish that we had not added <code>std::mem::forget</code>. We did so as part of working through the impact of ref-count cycles. It seemed equivalent at the time (&ldquo;the dtor doesn&rsquo;t run anyway, why not make it easy to do&rdquo;) but I think this diagram shows why adding <code>forget</code> made things permanently more complicated for relatively little gain.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> Oh well! Can&rsquo;t win &rsquo;em all.</p>
<h3 id="values-of-types-implementing-unpin-cannot-be-pinned">Values of types implementing <code>Unpin</code> cannot be pinned</h3>
<p>There is one subtle aspect here: not all values can be pinned. If a type <code>T</code> implements <code>Unpin</code>, then values of type <code>T</code> cannot be pinned. When you have a pinned reference to them, they can still squirm out from under you via <code>swap</code> or other techniques. Another way to say the same thing is to say that <em>values can only be pinned if their type is <code>!Unpin</code></em> (&ldquo;does not implement <code>Unpin</code>&rdquo;).</p>
<p>Types that are <code>!Unpin</code> can be called <em>address sensitive</em>, meaning that once they are pinned, there can be pointers to the internals of that value that will be invalidated if the address changes. Types that implement <code>Unpin</code> would therefore be <em>address insensitive</em>. Traditionally, all Rust types have been address insensitive, and therefore <code>Unpin</code> is an auto trait, implemented by most types by default.</p>
<h3 id="pinmut-t-is-really-a-maybe-pinned-reference"><code>Pin&lt;&amp;mut T&gt;</code> is really a &ldquo;maybe pinned&rdquo; reference</h3>
<p>Looking at the state machine as I describe it here, we can see that a <code>Pin&lt;&amp;mut T&gt;</code> isn&rsquo;t really a <em>pinned</em> mutable reference, in the sense that it doesn&rsquo;t always refer to a place that is pinning its value. If <code>T: Unpin</code>, then it&rsquo;s just a regular reference. But if <code>T: !Unpin</code>, then a pinned reference guarantees that the value it refers to is pinned in place.</p>
<p>This fits with the name <code>Unpin</code>, which I believe was meant to convey the idea that, even if you have a pinned reference to a value of type <code>T: Unpin</code>, that value can become unpinned. I&rsquo;ve heard the metaphor of &ldquo;if <code>T: Unpin</code>, you can lift out the pin, swap in a different value, and put the pin back&rdquo;.</p>
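<p>The &ldquo;lift out the pin&rdquo; metaphor is easy to see in today&rsquo;s Rust. The sketch below (with the made-up name <code>swap_through_pin</code>) pins a <code>String</code>, which is <code>Unpin</code>, and then swaps the value out from under the pinned reference using only safe code.</p>

```rust
use std::pin::Pin;

fn swap_through_pin() -> (String, String) {
    let mut a = String::from("a");
    let mut b = String::from("b");

    // `String: Unpin`, so the safe constructor `Pin::new` is available.
    let pinned: Pin<&mut String> = Pin::new(&mut a);

    // `Pin::get_mut` is only available when `T: Unpin`. It hands back a
    // plain `&mut String`, and `swap` then moves the "pinned" value.
    std::mem::swap(pinned.get_mut(), &mut b);

    (a, b)
}
```

<p>After the call, the two strings have traded places: the function returns <code>("b", "a")</code>. With a <code>!Unpin</code> type, neither <code>Pin::new</code> nor <code>get_mut</code> would compile.</p>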
<h2 id="pin-picked-a-peck-of-pickled-pain">Pin picked a peck of pickled pain</h2>
<p>Everyone agrees that <code>Pin</code> is confusing and a pain to use. But what makes it such a pain?</p>
<p>If you are attempting to <strong>author</strong> a Pin-based API, there are two primary problems:</p>
<ol>
<li><code>Pin&lt;&amp;mut Self&gt;</code> methods can&rsquo;t make use of regular <code>&amp;mut self</code> methods.</li>
<li><code>Pin&lt;&amp;mut Self&gt;</code> methods can&rsquo;t access fields by default. Crates like <a href="https://crates.io/crates/pin-project-lite">pin-project-lite</a> make this easier but still require learning obscure concepts like <a href="https://doc.rust-lang.org/std/pin/index.html#projections-and-structural-pinning">structural pinning</a>.</li>
</ol>
<p>If you are attempting to <strong>consume</strong> a Pin-based API, the primary annoyance is that getting a pinned reference is hard. You can&rsquo;t just call <code>Pin&lt;&amp;mut Self&gt;</code> methods normally, you have to remember to use <code>Box::pin</code> or <code>pin!</code> first. (We saw this in <a href="#example-1">Example 1</a> from this post.)</p>
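<p>As a reminder of what that looks like on the consumer side today, here is a sketch that polls two futures by hand, one pinned on the stack with <code>pin!</code> and one pinned on the heap with <code>Box::pin</code>. <code>NoopWake</code> and <code>poll_by_hand</code> are illustrative helpers I made up so the example is self-contained.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to drive a ready future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn poll_by_hand() -> (Poll<u32>, Poll<u32>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // Calling `poll` directly on the bare `async` block would not
    // compile, because `poll` takes `self: Pin<&mut Self>`.
    // We have to pin first, either on the stack...
    let on_stack = pin!(async { 1u32 });
    // ...or on the heap.
    let mut on_heap = Box::pin(async { 2u32 });

    (on_stack.poll(&mut cx), on_heap.as_mut().poll(&mut cx))
}
```

<p>Both futures complete on the first poll, so the result is <code>(Poll::Ready(1), Poll::Ready(2))</code>.</p>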
<h2 id="my-proposal-in-a-nutshell">My proposal in a nutshell</h2>
<p>This post is focused on a proposal with two parts:</p>
<ol>
<li>Making <code>Pin</code>-based APIs easier to <em>author</em> by replacing the <code>Unpin</code> trait with <code>Overwrite</code>.</li>
<li>Making <code>Pin</code>-based APIs easier to <em>call</em> by integrating pinning into the borrow checker.</li>
</ol>
<p>I&rsquo;m going to walk through those in turn.</p>
<h2 id="making-pin-based-apis-easier-to-author">Making <code>Pin</code>-based APIs easier to author</h2>
<h3 id="overwrite-as-the-better-unpin"><code>Overwrite</code> as the better <code>Unpin</code></h3>
<p>The first part of my proposal is a change I call <code>s/Unpin/Overwrite/</code>. The idea is to introduce <code>Overwrite</code> and then change the &ldquo;place lifecycle&rdquo; to reference <code>Overwrite</code> instead of <code>Unpin</code>:</p>
<pre class="mermaid">flowchart TD
Uninitialized 
Initialized
Pinned

Uninitialized --
    p = v where v: T
--> Initialized

Initialized -- 
    move out, drop, or forget
--> Uninitialized

Initialized --
    pin value v in p
    (only possible when<br>T is 👉<b><i>!Overwrite</i></b>👈)
--> Pinned

Pinned --
    drop value
--> Uninitialized

Pinned --
    move out or forget
--> UB

Uninitialized --
    free the place
--> Freed

UB[💥 Undefined behavior 💥]
  </pre>
<p>For <code>s/Unpin/Overwrite/</code> to work well, we have to make all <code>!Unpin</code> types also be <code>!Overwrite</code>. This is not, strictly speaking, backwards compatible, since today <code>!Unpin</code> types (like all types) can be overwritten and swapped. I think eventually we want <em>every</em> type to be <code>!Overwrite</code> by default, but I don&rsquo;t think we can change that default in a general way without an edition. But for <code>!Unpin</code> types <em>in particular</em> I suspect we can get away with it, because <code>!Unpin</code> types are pretty rare, and the simplification we get from doing so is pretty large. (And, as I argued in the previous post, <a href="https://smallcultfollowing.com/babysteps/blog/2024/09/26/overwrite-trait/#subtle-overwrite-is-not-infectious">there is no loss of expressiveness</a>; code today that overwrites or swaps <code>!Unpin</code> values can be locally rewritten.)</p>
<h3 id="why-swaps-are-bad-without-sunpinoverwrite">Why swaps are bad without <code>s/Unpin/Overwrite/</code></h3>
<p>Today, <code>Pin&lt;&amp;mut T&gt;</code> cannot be converted into an <code>&amp;mut T</code> reference unless <code>T: Unpin</code>.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> This is because it would allow safe Rust code to create Undefined Behavior by swapping the referent of the <code>&amp;mut T</code> reference and hence moving the pinned value. By requiring that <code>T: Unpin</code>, the <code>DerefMut</code> impl is effectively limiting itself to references that are not, in fact, in the &ldquo;pinned&rdquo; state, but just in the &ldquo;initialized&rdquo; state.</p>
<h4 id="as-a-result-pinmut-t-and-mut-t-methods-dont-interoperate-today">As a result, <code>Pin&lt;&amp;mut T&gt;</code> and <code>&amp;mut T</code> methods don&rsquo;t interoperate today</h4>
<p>This leads directly to our first two pain points. To start, from a <code>Pin&lt;&amp;mut Self&gt;</code> method, you can only invoke <code>&amp;self</code> methods (via the <code>Deref</code> impl) or other <code>Pin&lt;&amp;mut Self&gt;</code> methods. This schism separates out the &ldquo;regular&rdquo; methods of a type from its pinned methods; it also means that methods doing field assignments don&rsquo;t compile:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">increment_field</span><span class="p">(</span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">field</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">field</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This errors because compiling a field assignment requires a <code>DerefMut</code> impl and <code>Pin&lt;&amp;mut Self&gt;</code> doesn&rsquo;t have one.</p>
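<p>For comparison, here is roughly what you have to write today to get that field assignment through when the type is <code>!Unpin</code>. This is a sketch, not a recommendation: <code>Counter</code> and <code>increment_twice</code> are hypothetical, and the <code>unsafe</code> block is exactly the kind of thing the proposal is trying to make unnecessary.</p>

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Counter {
    field: u32,
    _pin: PhantomPinned, // opts out of `Unpin`, so there is no `DerefMut`
}

impl Counter {
    fn increment_field(self: Pin<&mut Self>) {
        // SAFETY: we only update a `u32` field in place. The `Counter`
        // itself is never moved or swapped through this reference.
        let this = unsafe { self.get_unchecked_mut() };
        this.field += 1;
    }
}

fn increment_twice() -> u32 {
    let mut counter = Box::pin(Counter { field: 0, _pin: PhantomPinned });
    counter.as_mut().increment_field();
    counter.as_mut().increment_field();
    counter.field // reading works via the `Deref` impl
}
```

<p>Calling <code>increment_twice()</code> returns <code>2</code>.</p>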
<h4 id="with-sunpinoverwrite-pinmut-self-is-a-subtype-of-mut-self">With <code>s/Unpin/Overwrite/</code>, <code>Pin&lt;&amp;mut Self&gt;</code> is a subtype of <code>&amp;mut Self</code></h4>
<p><code>s/Unpin/Overwrite/</code> allows us to implement <code>DerefMut</code> for <em>all</em> pinned types. This is because, unlike <code>Unpin</code>, <code>Overwrite</code> affects how <code>&amp;mut</code> works, and hence <code>&amp;mut T</code> would preserve the pinned state for the place it references. Consider the two possibilities for the value of type <code>T</code> referred to by the <code>&amp;mut T</code>:</p>
<ul>
<li>If <code>T: Overwrite</code>, then the value is not pinnable, and so the place cannot be in the pinned state.</li>
<li>If <code>T: !Overwrite</code>, the value could be pinned, but we also cannot overwrite or swap it, and so pinning is preserved.</li>
</ul>
<p>This implies that <code>Pin&lt;&amp;mut T&gt;</code> is in fact a generalized version of <code>&amp;mut T</code>. Every <code>&amp;'a mut T</code> keeps the value pinned for the duration of its lifetime <code>'a</code>, but a <code>Pin&lt;&amp;mut T&gt;</code> ensures the value stays pinned for the lifetime of the underlying storage.</p>
<p>If we have a <code>DerefMut</code> impl, then <code>Pin&lt;&amp;mut Self&gt;</code> methods can freely call <code>&amp;mut self</code> methods. Big win!</p>
<h4 id="today-you-must-categorize-fields-as-structurally-pinned-or-not">Today you must categorize fields as &ldquo;structurally pinned&rdquo; or not</h4>
<p>The other pain point today with <code>Pin</code> is that we have no native support for &ldquo;pin projection&rdquo;<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>. That is, you cannot safely go from a <code>Pin&lt;&amp;mut Self&gt;</code> reference to a <code>Pin&lt;&amp;mut F&gt;</code> reference to some field <code>self.f</code> without relying on unsafe code.</p>
<p>The most common practice today is to use a custom crate like <a href="https://crates.io/crates/pin-project-lite">pin-project-lite</a>. Even then, you also have to choose, for each field, whether you want to be able to get a <code>Pin&lt;&amp;mut F&gt;</code> reference or a normal <code>&amp;mut F</code> reference. Fields for which you can get a pinned reference are called <a href="https://doc.rust-lang.org/std/pin/index.html#projections-and-structural-pinning">structurally pinned</a> and the criteria for choosing between the two are rather subtle. Ultimately this choice is required because <code>Pin&lt;&amp;mut F&gt;</code> and <code>&amp;mut F</code> don&rsquo;t play nicely together.</p>
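<p>To show what that per-field choice looks like without any helper crate, here is a hand-rolled sketch in today&rsquo;s Rust. The wrapper <code>CountPolls</code>, its two projection methods, and the <code>NoopWake</code> helper are all hypothetical names. The <code>inner</code> field is treated as structurally pinned and the <code>polls</code> field is not, and each projection needs its own <code>unsafe</code> justification.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Wraps a future and counts how many times it has been polled.
struct CountPolls<F> {
    inner: F,   // structurally pinned: projected as `Pin<&mut F>`
    polls: u32, // not structurally pinned: projected as `&mut u32`
}

impl<F> CountPolls<F> {
    fn project_inner(self: Pin<&mut Self>) -> Pin<&mut F> {
        // SAFETY: `inner` is never moved out of a pinned `CountPolls`,
        // there is no `Drop` impl that moves it, and the struct is not
        // `repr(packed)`.
        unsafe { self.map_unchecked_mut(|this| &mut this.inner) }
    }

    fn project_polls(self: Pin<&mut Self>) -> &mut u32 {
        // SAFETY: we never hand out a pinned reference to `polls`,
        // so treating it as an ordinary field is fine.
        unsafe { &mut self.get_unchecked_mut().polls }
    }
}

impl<F: Future> Future for CountPolls<F> {
    type Output = (F::Output, u32);

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        *self.as_mut().project_polls() += 1;
        let polls = self.polls;
        self.project_inner().poll(cx).map(|out| (out, polls))
    }
}

/// A waker that does nothing; enough to poll a ready future once.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn poll_once() -> Poll<(u32, u32)> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(CountPolls { inner: std::future::ready(7u32), polls: 0 });
    fut.as_mut().poll(&mut cx)
}
```

<p>Here <code>poll_once()</code> returns <code>Poll::Ready((7, 1))</code>. Crates like pin-project-lite generate the equivalent of these two projection methods for you, but you still have to pick a category for every field.</p>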
<h4 id="pin-projection-is-safe-from-any-overwrite-type">Pin projection is safe from any <code>!Overwrite</code> type</h4>
<p>With <code>s/Unpin/Overwrite/</code>, we can scrap the idea of structural pinning. Instead, if we have a field owner <code>self: Pin&lt;&amp;mut Self&gt;</code>, pinned projection is allowed so long as <code>Self: !Overwrite</code>. That is, if <code>Self: !Overwrite</code>, then I can <em>always</em> get a <code>Pin&lt;&amp;mut F&gt;</code> reference to some field <code>self.f</code> of type <code>F</code>. How is that possible?</p>
<p>Actually, the full explanation relies on borrow checker extensions I haven&rsquo;t introduced yet. But let&rsquo;s see how far we get without them, so that we can see the gap that the borrow checker has to close.</p>
<p>Assume we are creating a <code>Pin&lt;&amp;'a mut F&gt;</code> reference <code>r</code> to some field <code>self.f</code>, where <code>self: Pin&lt;&amp;mut Self&gt;</code>:</p>
<ul>
<li>We are creating a <code>Pin&lt;&amp;'a mut F&gt;</code> reference to the value in <code>self.f</code>:
<ul>
<li>If <code>F: Overwrite</code>, then the value is not pinnable, so this is equivalent to an ordinary <code>&amp;mut F</code> and we have nothing to prove.</li>
<li>Else, if <code>F: !Overwrite</code>, then we have to show that the value in <code>self.f</code> will not move for the remainder of its lifetime.
<ul>
<li>Pin projection from <code>*self</code> is only valid if <code>Self: !Overwrite</code> and <code>self: Pin&lt;&amp;'b mut Self&gt;</code>, so we know that the value in <code>*self</code> is pinned for the remainder of its lifetime by induction.</li>
<li>We have to show then that the value <code>v_f</code> in <code>self.f</code> will never be moved until the end of its lifetime.</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>There are three ways to move a value out of <code>self.f</code>:</p>
<ul>
<li>You can assign a new value to <code>self.f</code>, like <code>self.f = ...</code>.
<ul>
<li>This will run the destructor, ending the lifetime of the value <code>v_f</code>.</li>
</ul>
</li>
<li>You can create a mutable reference <code>r = &amp;mut self.f</code> and then&hellip;
<ul>
<li>assign a new value to <code>*r</code>: but that will be an error because <code>F: !Overwrite</code>.</li>
<li>swap the value in <code>*r</code> with another: but that will be an error because <code>F: !Overwrite</code>.</li>
</ul>
</li>
</ul>
<p>QED. =)</p>
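<p>The difference between the first case and the other two can be observed in today&rsquo;s Rust by counting destructor calls. This is a small sketch, with hypothetical <code>Noisy</code> and <code>Owner</code> types: assigning to the field drops the old value on the spot, whereas swapping relocates it without running any destructor, which is exactly what a pinned value must never experience.</p>

```rust
use std::cell::Cell;

// Hypothetical type that counts how many times its destructor has run.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Hypothetical field owner, playing the role of `Self` in the argument above.
struct Owner<'a> {
    f: Noisy<'a>,
}

// Returns the drop count after an assignment and after a swap.
fn demo() -> (u32, u32) {
    let drops = Cell::new(0);
    let mut owner = Owner { f: Noisy { drops: &drops } };

    // Case 1: assignment runs the destructor of the old value right away.
    owner.f = Noisy { drops: &drops };
    let after_assign = drops.get();

    // Cases 2 and 3 go through `&mut`: a swap moves the old value to a new
    // place and no destructor runs.
    let mut other = Noisy { drops: &drops };
    std::mem::swap(&mut owner.f, &mut other);
    let after_swap = drops.get();

    (after_assign, after_swap)
}
```

<p>Calling <code>demo()</code> returns <code>(1, 1)</code>: one drop from the assignment, none from the swap.</p>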
<h2 id="making-pin-based-apis-easier-to-call">Making <code>Pin</code>-based APIs easier to call</h2>
<p>Today, getting a <code>Pin&lt;&amp;mut&gt;</code> requires using the <code>pin!</code> macro, going through <code>Box::pin</code>, or some similar explicit action. This adds &ldquo;syntactic salt&rdquo; to calling a <code>Pin&lt;&amp;mut Self&gt;</code> method: you have to go through a macro or some other abstraction rooted in unsafe (e.g., <code>Box::pin</code>), because there is no built-in way to safely create a pinned reference. This is fine but introduces ergonomic hurdles.</p>
<p>We want to make calling a <code>Pin&lt;&amp;mut Self&gt;</code> method as easy as calling an <code>&amp;mut self</code> method. To do this, we need to extend the compiler&rsquo;s notion of &ldquo;auto-ref&rdquo; to include the option of &ldquo;auto-pin-ref&rdquo;:</p>
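<p>For comparison, here is what the two explicit routes look like in full today. This is a minimal sketch: <code>NoopWake</code>, <code>poll_once</code>, and <code>demo</code> are hypothetical helpers, needed only so that there is a <code>Context</code> to poll with.</p>

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical waker that does nothing; enough to poll a future once.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls an already-pinned future a single time.
fn poll_once<F: Future>(future: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    future.poll(&mut cx)
}

fn demo() -> (Poll<u32>, Poll<u32>) {
    // Route 1: pin on the stack with the `pin!` macro.
    let on_stack = pin!(async { 1u32 });
    // Route 2: pin on the heap with `Box::pin`.
    let mut on_heap = Box::pin(async { 2u32 });
    (poll_once(on_stack), poll_once(on_heap.as_mut()))
}
```

<p>Calling <code>demo()</code> returns <code>(Poll::Ready(1), Poll::Ready(2))</code>. In both routes the caller has to do something explicit before the pinned method can be called at all.</p>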
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Instead of this:
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">future</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">pin!</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">future</span><span class="p">.</span><span class="n">poll</span><span class="p">(</span><span class="n">cx</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// We would do this:
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">future</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">future</span><span class="p">.</span><span class="n">poll</span><span class="p">(</span><span class="n">cx</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;-- Wowee!
</span></span></span></code></pre></div><p>Just as a typical method call like <code>vec.len()</code> expands to <code>Vec::len(&amp;vec)</code>, the compiler would be expanding <code>future.poll(cx)</code> to something like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">Future</span>::<span class="n">poll</span><span class="p">(</span><span class="o">&amp;</span><span class="n">pinned</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">future</span><span class="p">,</span><span class="w"> </span><span class="n">cx</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//           ^^^^^^^^^^^ but what, what&#39;s this?
</span></span></span></code></pre></div><p>This expansion though includes a new piece of syntax that doesn&rsquo;t exist today, the <code>&amp;pinned mut</code> operation. (I&rsquo;m lifting this syntax from boats&rsquo; <a href="https://without.boats/blog/pinned-places/">pinned places</a> proposal.)</p>
<p>Whereas <code>&amp;mut var</code> results in an <code>&amp;mut T</code> reference (assuming <code>var: T</code>), a <code>&amp;pinned mut var</code> borrow would result in a <code>Pin&lt;&amp;mut T&gt;</code>. It would also make the borrow checker consider the value in <code>var</code> to be <em>pinned</em>. That means that it is illegal to move out from <code>var</code>. The pinned state continues indefinitely until <code>var</code> goes out of scope or is overwritten by an assignment like <code>var = ...</code> (which drops the heretofore pinned value). This is a fairly straightforward extension to the borrow checker&rsquo;s existing logic.</p>
<h3 id="new-syntax-not-strictly-required">New syntax not strictly required</h3>
<p>It&rsquo;s worth noting that we don&rsquo;t actually <strong>need</strong> the <code>&amp;pinned mut</code> syntax (which means we don&rsquo;t need the <code>pinned</code> keyword). We could make it so that the only way to get the compiler to do a pinned borrow is via auto-ref. We could even add a silly trait to make it explicit, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Pinned</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">pinned</span><span class="p">(</span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Pinned</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">pinned</span><span class="p">(</span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now you can write <code>var.pinned()</code>, which the compiler would desugar to <code>Pinned::pinned(&amp;rustc#pinned mut var)</code>. Here I am using <code>rustc#pinned</code> to denote an &ldquo;internal keyword&rdquo; that users can&rsquo;t type.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="sotheres-a-lot-here-whats-the-key-takeaways">So&hellip;there&rsquo;s a lot here. What are the key takeaways?</h3>
<p>The shortest version of this post I can manage is<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></p>
<ul>
<li>Pinning fits smoothly into Rust if we make two changes:
<ul>
<li>Limit the ability to swap types by default, making <code>Pin&lt;&amp;mut T&gt;</code> a subtype of <code>&amp;mut T</code> and enabling uniform pin projection.</li>
<li>Integrate pinning in the auto-ref rules and the borrow checker.</li>
</ul>
</li>
</ul>
<h3 id="why-do-you-only-mention-swaps-doesnt-overwrite-affect-other-things">Why do you only mention swaps? Doesn&rsquo;t <code>Overwrite</code> affect other things?</h3>
<p>Indeed the <code>Overwrite</code> trait as I defined it is overkill for pinning. To be more precise, we might imagine two special traits that affect how and when we can drop or move values:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">DropWhileBorrowed</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Swap</span>: <span class="nc">DropWhileBorrowed</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Given a reference <code>r: &amp;mut T</code>, overwriting its referent <code>*r</code> with a new value would require <code>T: DropWhileBorrowed</code>;</li>
<li>Swapping two values of type <code>T</code> requires that <code>T: Swap</code>.
<ul>
<li>This is true regardless of whether they are borrowed or not.</li>
</ul>
</li>
</ul>
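<p>The two rules above can be approximated in library code today, which may help make them concrete. In this sketch the traits are ordinary marker traits and the checks live in two hypothetical wrapper functions, <code>overwrite_with</code> and <code>swap_values</code>. In the actual proposal the compiler would enforce the bounds on <code>*r = ...</code> and on swapping, and nothing here stops a caller from using <code>std::mem::swap</code> directly.</p>

```rust
// Marker traits as written above; today they carry no special meaning.
trait DropWhileBorrowed: Sized {}
trait Swap: DropWhileBorrowed {}

// Overwriting the referent of `r` requires `T: DropWhileBorrowed`.
fn overwrite_with<T: DropWhileBorrowed>(r: &mut T, value: T) {
    *r = value;
}

// Swapping two values requires `T: Swap`.
fn swap_values<T: Swap>(a: &mut T, b: &mut T) {
    std::mem::swap(a, b);
}

// Hypothetical type that opts in to both traits.
struct Plain(u32);
impl DropWhileBorrowed for Plain {}
impl Swap for Plain {}

// Hypothetical type that may be overwritten but never swapped.
struct Anchored(u32);
impl DropWhileBorrowed for Anchored {}

fn demo() -> (u32, u32, u32) {
    let (mut a, mut b) = (Plain(1), Plain(2));
    swap_values(&mut a, &mut b);

    let mut c = Anchored(3);
    overwrite_with(&mut c, Anchored(4));
    // swap_values(&mut c, &mut Anchored(5)); // rejected: `Anchored: Swap` does not hold

    (a.0, b.0, c.0)
}
```

<p>Calling <code>demo()</code> returns <code>(2, 1, 4)</code>, and uncommenting the last <code>swap_values</code> call produces a trait-bound error.</p>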
<p>Today, every type is <code>Swap</code>. What I argued in the previous post is that we should make the default be that user-defined types implement <strong>neither</strong> of these two traits (over an edition, etc etc). Instead, you could opt-in to both of them at once by implementing <code>Overwrite</code>.</p>
<p>But we could get all the pin benefits by making a weaker change. Instead of having types opt out from both traits by default, they could only opt out of <code>Swap</code>, but continue to implement <code>DropWhileBorrowed</code>. This is enough to make pinning work smoothly. To see why, recall the <a href="#pinning-is-part-of-the-lifecycle-of-a-place">pinning state diagram</a>: dropping the value in <code>*r</code> (permitted by <code>DropWhileBorrowed</code>) will exit the &ldquo;pinned&rdquo; state and return to the &ldquo;uninitialized&rdquo; state. This is valid. Swapping, in contrast, is UB.</p>
<p>Two subtle observations here are worth calling out:</p>
<ol>
<li>Both <code>DropWhileBorrowed</code> and <code>Swap</code> have <code>Sized</code> as a supertrait. Today in Rust you can&rsquo;t drop a <code>&amp;mut dyn SomeTrait</code> value and replace it with another, for example. I think it&rsquo;s a bit unclear whether unsafe code could do this if it knows the dynamic type of the value behind the <code>dyn</code>. But under this model, it would only be valid for unsafe code to do that drop if (a) it knew the dynamic type and (b) the dynamic type implemented <code>DropWhileBorrowed</code>. The same applies to <code>Swap</code>.</li>
<li>The <code>Swap</code> trait applies longer than just the duration of a borrow. This is because, once you pin a value to create a <code>Pin&lt;&amp;mut T&gt;</code> reference, the state of being pinned persists even after that reference has ended. I say a bit more about this in <a href="#theres-a-lot-of-subtle-reasoning-in-this-post-are-you-sure-this-is-correct">another FAQ below</a>.</li>
</ol>
<p>EDIT: An earlier draft of this post named the trait <code>SwapWhileBorrowed</code>. This was wrong, as described in the FAQ on <a href="#theres-a-lot-of-subtle-reasoning-in-this-post-are-you-sure-this-is-correct">subtle reasoning</a>.</p>
<h3 id="why-then-did-you-propose-opting-out-from-both-overwrites-and-swaps">Why then did you propose opting out from both overwrites <em>and</em> swaps?</h3>
<p>Opting out of overwrites (i.e., making the default be <em>neither</em> <code>DropWhileBorrowed</code> <em>nor</em> <code>Swap</code>) gives us the additional benefit of truly immutable fields. This will make cross-function borrows less of an issue, as I described in my previous post, and make some other things (e.g., variance) less relevant. Moreover, I don&rsquo;t think overwriting an entire reference like <code>*r</code> is that common, versus accessing individual fields. And in the cases where people <em>do</em> do it, it is <a href="https://smallcultfollowing.com/babysteps/blog/2024/09/26/overwrite-trait/#subtle-overwrite-is-not-infectious">easy to make a dummy struct with a single field, and then overwrite <code>r.value</code> instead of <code>*r</code></a>. To me, therefore, distinguishing between <code>DropWhileBorrowed</code> and <code>Swap</code> doesn&rsquo;t obviously carry its weight.</p>
<h3 id="can-you-come-up-with-a-more-semantic-name-for-overwrite">Can you come up with a more <em>semantic</em> name for <code>Overwrite</code>?</h3>
<p>All the trait names I&rsquo;ve given so far (<code>Overwrite</code>, <code>DropWhileBorrowed</code>, <code>Swap</code>) answer the question of &ldquo;what operation does this trait allow&rdquo;. That&rsquo;s pretty common for traits (e.g., <code>Clone</code> or, for that matter, <code>Unpin</code>) but it is sometimes useful to think instead about &ldquo;what kinds of types should implement this trait&rdquo; (or not implement it, as the case may be).</p>
<p>My current favorite &ldquo;semantic style name&rdquo; is <code>Mobile</code>, which corresponds to implementing <code>Swap</code>. A <em>mobile</em> type is one that, while borrowed, can move to a new place. This name doesn&rsquo;t convey that it&rsquo;s also ok to <em>drop</em> the value, but that follows, since if you can swap the value to a new place, you can presumably drop that new place.</p>
<p>I don&rsquo;t have a &ldquo;semantic&rdquo; name for <code>DropWhileBorrowed</code>. As I said, I&rsquo;m hard pressed to characterize the type that would want to implement <code>DropWhileBorrowed</code> but not <code>Swap</code>.</p>
<h3 id="what-do-dropwhileborrowed-and-swap-have-in-common">What do <code>DropWhileBorrowed</code> and <code>Swap</code> have in common?</h3>
<p>These traits pertain to whether an owner who lends out a local variable (i.e., executes <code>r = &amp;mut lv</code>) can rely on that local variable <code>lv</code> to store the same value after the borrow completes. Under this model, the answer depends on the type <code>T</code> of the local variable:</p>
<ul>
<li>If <code>T: DropWhileBorrowed</code> (or <code>T: Swap</code>, which implies <code>DropWhileBorrowed</code>), the answer is &ldquo;no&rdquo;, the local variable may point at some other value, because it is possible to do <code>*r = /* new value */</code>.</li>
<li>But if <code>T: !DropWhileBorrowed</code>, then the owner can be sure that <code>lv</code> still stores the same value (though <code>lv</code>&rsquo;s fields may have changed).</li>
</ul>
<p>Let&rsquo;s use an analogy. Suppose I own a house and I lease it out to someone else to use. I expect that they will make changes on the inside, such as hanging up a new picture. But I don&rsquo;t expect them to tear down the house and build a new one on the same lot. I also don&rsquo;t expect them to drive up a flatbed truck, load my house onto it, and move it somewhere else (while providing me with a new one in return). In Rust today, a reference <code>r: &amp;mut T</code> allows all of these things:</p>
<ul>
<li>Mutating a field like <code>r.count += 1</code> corresponds to <em>hanging up a picture</em>. The values inside <code>r</code> change, but <code>r</code> still refers to the same conceptual value.</li>
<li>Overwriting <code>*r = t</code> with a new value <code>t</code> is like tearing down the house and building a new one. The original value that was in <code>r</code> no longer exists.</li>
<li>Swapping <code>*r</code> with some other reference <code>*r2</code> is like moving my house somewhere else and putting a new house in its place.</li>
</ul>
<p>EDIT: Wording refined based on feedback.</p>
<h3 id="what-does-it-mean-to-be-the-same-value">What does it mean to be the &ldquo;same value&rdquo;?</h3>
<p>One question I received was what it meant for two structs to have the &ldquo;same value&rdquo;? Imagine a struct with all public fields &ndash; can we make any sense of it having an <em>identity</em>? The way I think of it, every struct has a &ldquo;ghost&rdquo; private field <code>$identity</code> (one that doesn&rsquo;t exist at runtime) that contains its identity. Every <code>StructName { }</code> expression has an implicit <code>$identity: new_value()</code> that assigns the identity a distinct value from every other struct that has been created thus far. If two struct values have the same <code>$identity</code>, then they are the same value.</p>
<p>Admittedly, if a struct has all public fields, then it doesn&rsquo;t really matter whether its identity is the same, except <a href="https://en.wikipedia.org/wiki/Ship_of_Theseus">perhaps to philosophers</a>. But most structs don&rsquo;t.</p>
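<p>One way to get a feel for the ghost field is to make it real for a moment. The sketch below assumes a hypothetical <code>House</code> type that stores an explicit <code>identity</code> drawn from a global counter, and a helper <code>identity_survives</code> that lends the house out and then checks whether the owner got the &ldquo;same value&rdquo; back. The three closures correspond to hanging a picture, rebuilding, and relocating.</p>

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Stand-in for the implicit `new_value()`: every construction gets a fresh id.
static NEXT_IDENTITY: AtomicU64 = AtomicU64::new(0);

// Hypothetical type with the ghost `$identity` field made explicit.
struct House {
    identity: u64,
    pictures: u32,
}

impl House {
    fn new() -> Self {
        House {
            identity: NEXT_IDENTITY.fetch_add(1, Ordering::Relaxed),
            pictures: 0,
        }
    }
}

// Lends out a fresh house via `&mut` and reports whether the owner still
// holds the same value once the borrow is over.
fn identity_survives(tenant: impl FnOnce(&mut House)) -> bool {
    let mut lv = House::new();
    let before = lv.identity;
    tenant(&mut lv);
    lv.identity == before
}

fn demo() -> (bool, bool, bool) {
    // Mutating a field: same house, new picture.
    let hang_picture = identity_survives(|r| r.pictures += 1);
    // Overwriting `*r`: the old house is torn down and replaced.
    let rebuild = identity_survives(|r| *r = House::new());
    // Swapping: the old house is moved away and another put in its place.
    let relocate = identity_survives(|r| {
        let mut other = House::new();
        std::mem::swap(r, &mut other);
    });
    (hang_picture, rebuild, relocate)
}
```

<p>Calling <code>demo()</code> returns <code>(true, false, false)</code>: only the field mutation leaves the owner holding the value it started with.</p>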
<p>An example that can help clarify this is what I call the &ldquo;scope pattern&rdquo;. Imagine I have a <code>Scope</code> type that has some private fields and which can be &ldquo;installed&rdquo; in some way and later &ldquo;deinstalled&rdquo; (perhaps it modifies thread-local values):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Scope</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Scope</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">new</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* install scope */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">Drop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Scope</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="cm">/* deinstall scope */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And the only way for users to get their hands on a &ldquo;scope&rdquo; is to use <code>with_scope</code>, which ensures it is installed and deinstalled properly:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">with_scope</span><span class="p">(</span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">Scope</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">scope</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Scope</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">scope</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>It may appear that this code enforces a &ldquo;stack discipline&rdquo;, where nested scopes will be installed and deinstalled in a stack-like fashion. But in fact, thanks to <code>std::mem::swap</code>, this is not guaranteed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">with_scope</span><span class="p">(</span><span class="o">|</span><span class="n">s1</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">with_scope</span><span class="p">(</span><span class="o">|</span><span class="n">s2</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">swap</span><span class="p">(</span><span class="n">s1</span><span class="p">,</span><span class="w"> </span><span class="n">s2</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">})</span><span class="w">
</span></span></span></code></pre></div><p>This could easily cause logic bugs or, if unsafe is involved, something worse. This is why lending out scopes requires some extra step to be safe, such as using a <code>&amp;</code>-reference or adding a &ldquo;fresh&rdquo; lifetime parameter of some kind to ensure that each scope has a unique type. In principle you could also use a type like <code>&amp;mut dyn ScopeTrait</code>, because the compiler disallows overwriting or swapping <code>dyn Trait</code> values: but I think it&rsquo;s ambiguous today whether unsafe code could validly do such a swap.</p>
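<p>To watch the stack discipline break, here is a runnable variant of the scope pattern. It is a sketch under a few assumptions that are not in the code above: each <code>Scope</code> carries an <code>id</code>, <code>with_scope</code> takes that id as an extra argument, and installing or deinstalling simply appends to a thread-local log.</p>

```rust
use std::cell::RefCell;

thread_local! {
    // Records install/deinstall events in the order they happen.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(event: String) {
    LOG.with(|l| l.borrow_mut().push(event));
}

pub struct Scope {
    id: u32, // assumption: an id so the log can tell scopes apart
}

impl Scope {
    fn new(id: u32) -> Self {
        log(format!("install {id}"));
        Scope { id }
    }
}

impl Drop for Scope {
    fn drop(&mut self) {
        log(format!("deinstall {}", self.id));
    }
}

pub fn with_scope(id: u32, op: impl FnOnce(&mut Scope)) {
    let mut scope = Scope::new(id);
    op(&mut scope);
}

// Runs two nested scopes, optionally swapping them, and returns the log.
fn nested(do_swap: bool) -> Vec<String> {
    LOG.with(|l| l.borrow_mut().clear());
    with_scope(1, |s1| {
        with_scope(2, |s2| {
            if do_swap {
                std::mem::swap(s1, s2);
            }
        })
    });
    LOG.with(|l| l.borrow().clone())
}
```

<p>Without the swap, <code>nested(false)</code> logs <code>install 1, install 2, deinstall 2, deinstall 1</code>. With it, <code>nested(true)</code> logs <code>install 1, install 2, deinstall 1, deinstall 2</code>: the outer scope is torn down while the inner one is still installed.</p>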
<p>EDIT: Question added based on feedback.</p>
<h3 id="theres-a-lot-of-subtle-reasoning-in-this-post-are-you-sure-this-is-correct">There&rsquo;s a lot of subtle reasoning in this post. Are you sure this is correct?</h3>
<p>I am pretty sure! But not 100%. I&rsquo;m definitely scared that people will point out some obvious flaw in my reasoning. But of course, if there&rsquo;s a flaw I want to know. To help people analyze, let me restate the two subtle arguments that I made in this post, along with the reasoning behind them.</p>
<p><strong>Lemma.</strong> Given some local variable <code>lv: T</code> where <code>T: !Overwrite</code> mutably borrowed by a reference <code>r: &amp;'a mut T</code>, the value in <code>lv</code> cannot be dropped, moved, or forgotten for the lifetime <code>'a</code>.</p>
<p>During <code>'a</code>, the variable <code>lv</code> cannot be accessed directly (per the borrow checker&rsquo;s usual rules). Therefore, any drops/moves/forgets must take place to <code>*r</code>:</p>
<ul>
<li>Because <code>T: !Overwrite</code>, it is not possible to overwrite or swap <code>*r</code> with a new value; it is only legal to mutate individual fields. Therefore the value cannot be dropped or moved.</li>
<li>Forgetting a value (via <code>std::mem::forget</code>) requires ownership and is not accessible while <code>lv</code> is borrowed.</li>
</ul>
<p><strong>Theorem A.</strong> If we replace <code>T: Unpin</code> with <code>T: Overwrite</code>, then <code>Pin&lt;&amp;mut T&gt;</code> is a safe subtype of <code>&amp;mut T</code>.</p>
<p>The argument proceeds by cases:</p>
<ul>
<li>If <code>T: Overwrite</code>, then <code>Pin&lt;&amp;mut T&gt;</code> does not refer to a pinned value, and hence it is semantically equivalent to <code>&amp;mut T</code>.</li>
<li>If <code>T: !Overwrite</code>, then <code>Pin&lt;&amp;mut T&gt;</code> does refer to a pinned value, so we must show that the pinning guarantee cannot be disturbed by the <code>&amp;mut T</code>. By our lemma, the <code>&amp;mut T</code> cannot move or forget the pinned value, which is the only way to disturb the pinning guarantee.</li>
</ul>
<p><strong>Theorem B.</strong> Given some field owner <code>o: O</code> where <code>O: !Overwrite</code> with a field <code>f: F</code>, it is safe to pin-project from <code>Pin&lt;&amp;mut O&gt;</code> to a <code>Pin&lt;&amp;mut F&gt;</code> reference referring to <code>o.f</code>.</p>
<p>The argument proceeds by cases:</p>
<ul>
<li>If <code>F: Overwrite</code>, then <code>Pin&lt;&amp;mut F&gt;</code> is equivalent to <code>&amp;mut F</code>. We showed in Theorem A that <code>Pin&lt;&amp;mut O&gt;</code> could be upcast to <code>&amp;mut O</code> and it is possible to create an <code>&amp;mut F</code> from <code>&amp;mut O</code>, so this must be safe.</li>
<li>If <code>F: !Overwrite</code>, then <code>Pin&lt;&amp;mut F&gt;</code> refers to a pinned value found in <code>o.f</code>. The lemma tells us that the value in <code>o.f</code> will not be disturbed for the duration of the borrow.</li>
</ul>
<p>EDIT: It was pointed out to me that this last theorem isn&rsquo;t quite proving what it needs to prove. It shows that <code>o.f</code> will not be disturbed for the duration of the borrow, but to meet the pin rules, we need to ensure that the value is not swapped even after the borrow ends. We can do this by committing to never permit swaps of values unless <code>T: Overwrite</code>, regardless of whether they are borrowed. I meant to clarify this in the post but forgot about it, and then I made a mistake and talked about <code>SwapWhileBorrowed</code> &ndash; but <code>Swap</code> is the right name.</p>
<h3 id="what-part-of-this-post-are-you-most-proud-of">What part of this post are you most proud of?</h3>
<p>Geez, I&rsquo;m <em>so</em> glad you asked! Such a thoughtful question. To be honest, the part of this post that I am happiest with is the state diagram for places, which I&rsquo;ve found very useful in helping me to understand <code>Pin</code>:</p>
<pre class="mermaid">flowchart TD
Uninitialized 
Initialized
Pinned

Uninitialized --
    `p = v` where `v: T`
--> Initialized

Initialized -- 
    move out, drop, or forget
--> Uninitialized

Initialized --
    pin value `v` in `p`
    (only possible when `T` is `!Unpin`)
--> Pinned

Pinned --
    drop value
--> Uninitialized

Pinned --
    move out or forget
--> UB

Uninitialized --
    free the place
--> Freed

UB[💥 Undefined behavior 💥]
  </pre>
<p>Obviously this question was just an excuse to reproduce it again. Some of the key insights that it helped me to crystallize:</p>
<ul>
<li>A value that is <code>Unpin</code> cannot be pinned:
<ul>
<li>And hence <code>Pin&lt;&amp;mut Self&gt;</code> really means &ldquo;reference to a maybe-pinned value&rdquo; (a value that is <em>pinned if it can be</em>).</li>
</ul>
</li>
<li>Forgetting a value is very different from leaking the place that value is stored:
<ul>
<li>In both cases, the value&rsquo;s <code>Drop</code> never runs, but only one of them can lead to a &ldquo;freed place&rdquo;.</li>
</ul>
</li>
</ul>
<p>In thinking through the stuff I wrote in this post, I&rsquo;ve found it very useful to go back to this diagram and trace through it with my finger.</p>
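<p>If tracing with a finger gets tedious, the diagram also transcribes fairly directly into code. This is only a sketch: the <code>Place</code>, <code>Op</code>, <code>step</code>, and <code>run</code> names are made up, transitions that the diagram does not draw return <code>None</code>, and I am assuming that &ldquo;pinning&rdquo; an <code>Unpin</code> value simply leaves the place in the initialized state.</p>

```rust
// States of a place, as in the diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Place {
    Uninitialized,
    Initialized,
    Pinned,
    Freed,
    UndefinedBehavior,
}

// Operations labelling the edges of the diagram.
#[derive(Debug, Clone, Copy)]
enum Op {
    Assign,
    MoveOut,
    DropValue,
    Forget,
    PinValue,
    Free,
}

// One edge of the diagram. `None` means the diagram draws no such edge.
fn step(state: Place, op: Op, is_unpin: bool) -> Option<Place> {
    match (state, op) {
        (Place::Uninitialized, Op::Assign) => Some(Place::Initialized),
        (Place::Initialized, Op::MoveOut | Op::DropValue | Op::Forget) => {
            Some(Place::Uninitialized)
        }
        // Assumption: an `Unpin` value cannot enter the pinned state.
        (Place::Initialized, Op::PinValue) if is_unpin => Some(Place::Initialized),
        (Place::Initialized, Op::PinValue) => Some(Place::Pinned),
        (Place::Pinned, Op::DropValue) => Some(Place::Uninitialized),
        (Place::Pinned, Op::MoveOut | Op::Forget) => Some(Place::UndefinedBehavior),
        (Place::Uninitialized, Op::Free) => Some(Place::Freed),
        _ => None,
    }
}

// Applies a sequence of operations, starting from an uninitialized place.
fn run(ops: &[Op], is_unpin: bool) -> Option<Place> {
    ops.iter()
        .try_fold(Place::Uninitialized, |state, &op| step(state, op, is_unpin))
}
```

<p>For example, <code>run(&amp;[Op::Assign, Op::PinValue, Op::DropValue, Op::Free], false)</code> ends in <code>Some(Place::Freed)</code>, while replacing the drop with <code>Op::Forget</code> ends in <code>Some(Place::UndefinedBehavior)</code>. With <code>is_unpin</code> set to <code>true</code>, moving out after the &ldquo;pin&rdquo; step is fine and ends in <code>Some(Place::Uninitialized)</code>.</p>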
<h3 id="is-this-backwards-compatible">Is this backwards compatible?</h3>
<p>Maybe? The question does not have a simple answer. I will address it in a future blog post in this series. Let me make a few points here though:</p>
<p>First, the <code>s/Unpin/Overwrite/</code> proposal is not backwards compatible as I described. It would mean for example that all futures returned by <code>async fn</code> are no longer <code>Overwrite</code>. It is quite possible we simply can&rsquo;t get away with it.</p>
<p>That&rsquo;s not fatal, but it makes things more annoying. It would mean there exist types that are <code>!Unpin</code> but which can be overwritten. This in turn means that <code>Pin&lt;&amp;mut Self&gt;</code> is not a subtype of <code>&amp;mut Self</code> for <em>all</em> types. Pinned mutable references would be a subtype for <em>almost</em> all types, but not those that are <code>!Unpin &amp;&amp; Overwrite</code>.</p>
<p>Second, a naive, conservative transition would definitely be rough. My current thinking is that, in older editions, we add <code>T: Overwrite</code> bounds by default on type parameters <code>T</code> and, when you have a <code>T: SomeTrait</code> bound, we would expand that to include an <code>Overwrite</code> bound on associated types in <code>SomeTrait</code>, like <code>T: SomeTrait&lt;AssocType: Overwrite&gt;</code>. When you move to a newer edition I think we would just <strong>not</strong> add those bounds. This is kind of a mess, though, because if you call code from an older edition, you are still going to need those bounds to be present.</p>
<p>That all sounds painful enough that I think we might have to do something smarter, where we don&rsquo;t <em>always</em> add <code>Overwrite</code> bounds, but instead use some kind of inference in older editions to avoid it most of the time.</p>
<h1 id="conclusion">Conclusion</h1>
<p>My takeaway from authoring this post is that something like <code>Overwrite</code> has the potential to turn <code>Pin</code> from wizard level Rust into mere &ldquo;advanced Rust&rdquo;, somewhat akin to knowing the borrow checker really well. If we had no backwards compatibility constraints to work with, it seems clear that this would be a better design than <code>Unpin</code> as it is today.</p>
<p>Of course, we <em>do</em> have backwards compatibility constraints, so the real question is how we can make the transition. I don&rsquo;t know the answer yet! I&rsquo;m planning on thinking more deeply about it (and talking to folks) once this post is out. My hope was first to make the case for the value of <code>Overwrite</code> (and to be sure my reasoning is sound) before I invest too much into thinking how we can make the transition.</p>
<p>Assuming we can make the transition, I&rsquo;m wondering two things. First, is <code>Overwrite</code> the right name? Second, should we take the time to re-evaluate the default bounds on generic types in a more complete way? For example, to truly have a nice async story, and for myriad other reasons, I think we need <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/16/must-move-types/">must move types</a>. How does that fit in?</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>The precise design of generators is of course an ongoing topic of some controversy. I am not trying to flesh out a true design here or take a position. Mostly I want to show that we can create ergonomic bridges between &ldquo;must pin&rdquo; types like generators and &ldquo;non pin&rdquo; interfaces like <code>Iterator</code> without explicitly mentioning pinning.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Boats has argued that, since no existing iterator can support borrows over a yield point, generators might not need to do so either. I don&rsquo;t agree. I think supporting borrows over yield points is necessary for ergonomics <a href="https://aturon.github.io/tech/2018/04/24/async-borrowing/">just as it was in futures</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Actually for <code>Pin&lt;impl DerefMut&lt;Target: Generator&gt;&gt;</code>.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I will say, I use <code>std::mem::forget</code> quite regularly, but mostly to make up for a shortcoming in <code>Drop</code>. I would like it if <code>Drop</code> had a separate method, <code>fn drop_on_unwind(&amp;mut self)</code>, and we invoked that method when unwinding. Most of the time, it would be the same as regular drop, but in some cases it&rsquo;s useful to have cleanup logic that only runs in the case of unwinding.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>In contrast, a <code>Pin&lt;&amp;mut T&gt;</code> reference can be safely converted into an <code>&amp;T</code> reference, as evidenced by <a href="https://doc.rust-lang.org/std/pin/struct.Pin.html#impl-Deref-for-Pin%3CPtr%3E">Pin&rsquo;s <code>Deref</code> impl</a>. This is because, even if <code>T: !Unpin</code>, a <code>&amp;T</code> reference cannot do anything that is invalid for a pinned value. You can&rsquo;t swap the underlying value or move out of it.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Projection is the wonky PL term for &ldquo;accessing a field&rdquo;. It&rsquo;s never made much sense to me, but I don&rsquo;t have a better term to use, so I&rsquo;m sticking with it.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>We have a syntax <code>k#foo</code> for explicitly referring to a keyword <code>foo</code>. It is meant to be used only for keywords that will be added in future Rust editions. However, I sometimes think it&rsquo;d be neat to have internal-ish keywords (like <code>k#pinned</code>) that are used in desugaring but rarely need to be typed explicitly; you would still be <em>able</em> to write <code>k#pinned</code> if for whatever reason you <em>wanted</em> to. And of course we could later opt to stabilize it as <code>pinned</code> (no prefix required) in a future edition.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>I tried asking ChatGPT to summarize the post but, when I pasted in my post, it replied, &ldquo;The message you submitted was too long, please reload the conversation and submit something shorter.&rdquo; Dang ChatGPT, that&rsquo;s rude! Gemini at least <a href="https://g.co/gemini/share/bdc1e35d4805">gave it the old college try</a>. Score one for Google. Plus, it called my post &ldquo;thought-provoking!&rdquo; Aww, I&rsquo;m blushing!&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/overwrite-trait" term="overwrite-trait" label="Overwrite trait"/></entry><entry><title type="html">Making overwrite opt-in #crazyideas</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/09/26/overwrite-trait/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/09/26/overwrite-trait/</id><published>2024-09-26T00:00:00+00:00</published><updated>2024-09-26T21:51:55+00:00</updated><content type="html"><![CDATA[<p>What would you say if I told you that it was possible to (a) eliminate a lot of “inter-method borrow conflicts” <em>without</em> introducing something like <a href="https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/">view types</a> and (b) make pinning easier even than boats’s <a href="https://without.boats/blog/pinned-places/">pinned places</a> proposal, all without needing pinned fields or even a pinned keyword? You’d probably say “Sounds great… what’s the catch?” The catch is that it requires us to change Rust’s fundamental assumption that, given <code>x: &amp;mut T</code>, you can always overwrite <code>*x</code> by doing <code>*x = /* new value */</code>, for any type <code>T: Sized</code>. This kind of change is tricky, but not impossible, to do over an edition.</p>
<h2 id="tldr">TL;DR</h2>
<p>We can reduce inter-procedural borrow check errors, increase clarity, and make pin vastly simpler to work with if we limit when it is possible to overwrite an <code>&amp;mut</code> reference. The idea is that if you have a mutable reference <code>x: &amp;mut T</code>, it should only be possible to overwrite <code>x</code> via <code>*x = /* new value */</code> or to swap its value via <code>std::mem::swap</code> if <code>T: Overwrite</code>. To start with, most structs and enums would implement <code>Overwrite</code>, and it would be a default bound, like <code>Sized</code>; but we would transition in a future edition to have structs/enums be <code>!Overwrite</code> by default and to have <code>T: Overwrite</code> bounds written explicitly.</p>
<h2 id="structure-of-this-series">Structure of this series</h2>
<p>This blog post is part of a series:</p>
<ol>
<li>This first post will introduce the idea of immutable fields and show why they could make Rust more ergonomic and more consistent. It will then show how overwrites and swaps are the key blocker and introduce the idea of the <code>Overwrite</code> trait, which could overcome that.</li>
<li>In the next post, I&rsquo;ll dive deeper into <code>Pin</code> and how the <code>Overwrite</code> trait can help there.</li>
<li>After that, who knows? Depends on what people say in response.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></li>
</ol>
<h2 id="if-you-could-change-one-thing-about-rust-what-would-it-be">If you could change one thing about Rust, what would it be?</h2>
<p>People often ask me to name something I would change about Rust if I could. One of the items on my list is the fact that, given a mutable reference <code>x: &amp;mut SomeStruct</code> to some struct, I can overwrite the entire value of <code>x</code> by doing <code>*x = /* new value */</code>, versus only modifying individual fields like <code>x.field = /* new value */</code>.</p>
<p>Having the ability to overwrite <code>*x</code> always seemed very natural to me, having come from C, and it’s definitely useful sometimes (particularly with <code>Copy</code> types like integers or newtyped integers). But it turns out to make borrowing and pinning much more painful than they would otherwise have to be, as I’ll explain shortly.</p>
<p>In the past, when I&rsquo;ve thought about how to fix this, I always assumed we would need a new form of reference type, like <code>&amp;move T</code> or something. That seemed like a non-starter to me. But at RustConf last week, while talking about the ergonomics of <code>Pin</code>, a few of us stumbled on the idea of using a <em>trait</em> instead. Under this design, you can always make an <code>x: &amp;mut T</code>, but you can’t always assign to <code>*x</code> as a result. This turns out to be a much smoother integration. And, as I’ll show, it doesn’t really give up any expressiveness.</p>
<h2 id="motivating-example-1-immutable-fields">Motivating example #1: Immutable fields</h2>
<p>In this post, I’m going to motivate the changes by talking about <strong>immutable fields</strong>. Today in Rust, when you declare a local variable <code>let x = …</code>, that variable is immutable by default<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. Fields, in contrast, inherit their mutability from the outside: when a struct appears in a <code>mut</code> location, all of its fields are mutable.</p>
<h3 id="not-all-fields-are-mutable-but-i-cant-declare-that-in-my-rust-code">Not all fields are mutable, but I can’t declare that in my Rust code</h3>
<p>It turns out that declaring local variables as mut is <a href="https://smallcultfollowing.com/babysteps/blog/2014/05/13/focusing-on-ownership/">not needed for the borrow checker</a> — and yet we do it nonetheless, in part because it helps readability. It&rsquo;s useful to see when a variable might change. But if that argument holds for local variables, it holds double for fields! For local variables, we can find all potential mutation just by searching one function. To know if a <em>field</em> may be mutated, we have to search across many functions. And for fields, precisely because they can be mutated across functions, declaring them as immutable can actually help the borrow checker to see that your code is safe.</p>
<h3 id="idea-declare-fields-as-mutable">Idea: Declare fields as mutable</h3>
<p>So what if we extended the mutable declaration to fields? The idea would be that, in your struct, if you want to mutate fields, you have to declare them as <code>mut</code>. This would allow them to be mutated, but only if the struct itself appears in a mutable location.</p>
<p>For example, maybe I have an <code>Analyzer</code> struct that is created with some vector of datums and which has to compute the number of “important” ones:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[derive(Default)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Analyzer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="sd">/// Data being analyzed: will never be modified.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Datum</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="sd">/// Number of important datums uncovered so far.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">mut</span><span class="w"> </span><span class="n">important</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As you can see from the struct declaration, the field <code>data</code> is declared as immutable. This is because we are only going to be reading the <code>Datum</code> values. The <code>important</code>
field is declared as <code>mut</code>, indicating that it will be updated.</p>
<h3 id="when-can-you-mutate-fields">When can you mutate fields?</h3>
<p>In this world, mutating a field is only possible when (1) the struct appears in a mutable location and (2) the field you are referencing is declared as <code>mut</code>. So this code compiles fine, because the field <code>important</code> is <code>mut</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">analyzer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Analyzer</span>::<span class="n">default</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">analyzer</span><span class="p">.</span><span class="n">important</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// OK: mut field in a mut location
</span></span></span></code></pre></div><p>But this code does not compile, because the local variable <code>x</code> is not:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Analyzer</span>::<span class="n">default</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="p">.</span><span class="n">important</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// ERROR: `x` not declared as mutable
</span></span></span></code></pre></div><p>And this code does not compile, because the field <code>data</code> is not declared as <code>mut</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Analyzer</span>::<span class="n">default</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">clear</span><span class="p">();</span><span class="w"> </span><span class="c1">// ERROR: field `data` is not declared as mutable
</span></span></span></code></pre></div><h3 id="leveraging-immutable-fields-in-the-borrow-checker">Leveraging immutable fields in the borrow checker</h3>
<p>So why is it useful to declare fields as <code>mut</code>? Well, imagine you have a method like <code>increment_if_important</code>, which checks if <code>datum.is_important()</code> is true and increments the <code>important</code> counter if so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Analyzer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">increment_if_important</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">datum</span>: <span class="kp">&amp;</span><span class="nc">Datum</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">datum</span><span class="p">.</span><span class="n">is_important</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">important</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now imagine you have a function that loops over <code>self.data</code> and calls <code>increment_if_important</code> on each item:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Analyzer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">count_important</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="n">datum</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">increment_if_important</span><span class="p">(</span><span class="n">datum</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I can hear the experienced Rustaceans crying out in pain now. This function, natural as it appears, will not compile in Rust today. Why is that? Well, we have a shared borrow on <code>self.data</code> but we are trying to call an <code>&amp;mut self</code> function, so we have no way to be sure that <code>self.data</code> will not be modified.</p>
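<p>For comparison, here is one way to get this loop past today’s borrow checker. It is a common workaround, not part of the proposal. The sketch assumes a minimal <code>Datum</code> with a made-up <code>weight</code> field and an arbitrary threshold, and it turns <code>increment_if_important</code> into an associated function that borrows only the counter, so the shared borrow of <code>self.data</code> and the mutable borrow of <code>self.important</code> are visibly disjoint:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical stand-in for the post's `Datum` type.
struct Datum {
    weight: u32,
}

impl Datum {
    // Assumed rule for illustration: heavy datums are important.
    fn is_important(&amp;self) -&gt; bool {
        self.weight &gt;= 10
    }
}

struct Analyzer {
    data: Vec&lt;Datum&gt;,
    important: usize,
}

impl Analyzer {
    // Takes only the counter, not `&amp;mut self`, so it cannot touch `data`.
    fn increment_if_important(important: &amp;mut usize, datum: &amp;Datum) {
        if datum.is_important() {
            *important += 1;
        }
    }

    fn count_important(&amp;mut self) {
        // Borrows of two different fields do not conflict.
        for datum in &amp;self.data {
            Self::increment_if_important(&amp;mut self.important, datum);
        }
    }
}

fn main() {
    let mut analyzer = Analyzer {
        data: vec![Datum { weight: 3 }, Datum { weight: 12 }, Datum { weight: 40 }],
        important: 0,
    };
    analyzer.count_important();
    assert_eq!(analyzer.important, 2);
}
</code></pre></div>
<p>The cost is that the helper’s signature now has to spell out which fields it touches, which is the kind of contortion that immutable fields are meant to make unnecessary.</p>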
<h3 id="but-what-about-immutable-fields-doesnt-that-solve-this">But what about immutable fields? Doesn’t that solve this?</h3>
<p>Annoyingly, immutable fields on their own don’t change anything! Why? Well, just because you can’t write to a field directly doesn’t mean you can’t mutate the memory it’s stored in. For example, maybe I write a malicious version of <code>increment_if_important</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Analyzer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">malicious_increment_if_important</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">datum</span>: <span class="kp">&amp;</span><span class="nc">Datum</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Analyzer</span>::<span class="n">default</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This version never directly accesses the field <code>data</code>, but it just writes to <code>*self</code>, and hence it has the same impact. Annoying!</p>
<h3 id="generics-why-we-cant-trivially-disallow-overwrites">Generics: why we can’t trivially disallow overwrites</h3>
<p>Maybe you’re thinking “well, can’t we just disallow overwriting <code>*self</code> if there are fields declared <code>mut</code>?” The answer is yes, we can, and that’s what this blog post is about. But it’s not so simple as it sounds, because we are changing the “basic contract” that all Rust types currently satisfy. In particular, Rust today assumes that if you have a reference <code>x: &amp;mut T</code> and a value <code>v: T</code>, you can always do <code>*x = v</code> and overwrite the referent of <code>x</code>. That means I can write a generic function like <code>set_to_default</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">set_to_default</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="n">r</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">r</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">T</span>::<span class="n">default</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, since <code>Analyzer</code> implements <code>Default</code>, I can make <code>increment_if_important</code> call <code>set_to_default</code>. This will still free <code>self.data</code>, but it does it in a sneaky way, where we can’t obviously tell that the value being overwritten is an instance of a struct with mut fields:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Analyzer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">malicious_increment_if_important</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">datum</span>: <span class="kp">&amp;</span><span class="nc">Datum</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Overwrites `self.data`, but not in an obvious way
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">set_to_default</span><span class="p">(</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="recap">Recap</h2>
<p>So let’s step back and recap what we’ve seen so far:</p>
<ul>
<li>If we could distinguish which fields were mutable and which were definitely not, we could eliminate many inter-function borrow check errors<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>.</li>
<li>However, just adding <code>mut</code> declarations is not enough, because fields can also be mutated indirectly. Specifically, when you have a <code>&amp;mut SomeStruct</code>, you can overwrite with a fresh instance of <code>SomeStruct</code> or swap with another <code>&amp;mut SomeStruct</code>, thus changing all fields at once.</li>
<li>Whatever fix we use has to consider generic code like <code>std::mem::swap</code>, which mutates an <code>&amp;mut T</code> without knowing precisely what <code>T</code> is. Therefore we can’t do something simple like looking to see if <code>T</code> is a struct with <code>mut</code> fields<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>.</li>
</ul>
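<p>The second and third bullets can be seen in today’s Rust. In this sketch, <code>Analyzer</code> holds a <code>Vec&lt;u32&gt;</code> in place of the post’s <code>Vec&lt;Datum&gt;</code> (a simplification made for the example). Neither <code>std::mem::swap</code> nor <code>set_to_default</code> names the <code>data</code> field, yet both replace it wholesale:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">#[derive(Default)]
struct Analyzer {
    // Stand-in for `Vec&lt;Datum&gt;`; nothing below assigns to this field by name.
    data: Vec&lt;u32&gt;,
    important: usize,
}

// Generic overwrite: knows nothing about `Analyzer` or its fields.
fn set_to_default&lt;T: Default&gt;(r: &amp;mut T) {
    *r = T::default();
}

fn main() {
    let mut a = Analyzer { data: vec![1, 2, 3], important: 1 };
    let mut b = Analyzer { data: vec![9], important: 0 };

    // Swapping exchanges every field at once, including `data`.
    std::mem::swap(&amp;mut a, &amp;mut b);
    assert_eq!(a.data, vec![9]);
    assert_eq!(b.data, vec![1, 2, 3]);

    // Overwriting drops the old `data` vector and installs an empty one.
    set_to_default(&amp;mut a);
    assert!(a.data.is_empty());
    assert_eq!(a.important, 0);
}
</code></pre></div>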
<h2 id="the-trait-system-to-the-rescue">The trait system to the rescue</h2>
<p>My proposal is to introduce a new, built-in marker trait called <code>Overwrite</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="sd">/// Marker trait that permits overwriting
</span></span></span><span class="line"><span class="cl"><span class="sd">/// the referent of an `&amp;mut Self` reference.
</span></span></span><span class="line"><span class="cl"><span class="cp">#[marker]</span><span class="w"> </span><span class="c1">// &lt;-- means the trait cannot have methods
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Overwrite</span>: <span class="nb">Sized</span> <span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><h3 id="the-effect-of-overwrite">The effect of <code>Overwrite</code></h3>
<p>As a marker trait, <code>Overwrite</code> does not have methods, but rather indicates a property of the type. Specifically, assigning to a borrowed place of type <code>T</code> requires that <code>T: Overwrite</code> is implemented. For example, the following code writes to <code>*x</code>, which has type <code>T</code>; this is only legal if <code>T: Overwrite</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">overwrite</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;— requires `T: Overwrite`
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Given that this code compiles today, a generic type parameter declaration like <code>&lt;T&gt;</code> would need a default <code>Overwrite</code> bound in the current edition. We would want to phase these defaults out in some future edition, as I&rsquo;ll describe in detail later on.</p>
<p>Similarly, the standard library’s swap function would require a <code>T: Overwrite</code> bound, since it (via unsafe code) assigns to <code>*x</code> and <code>*y</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">swap</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">y</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">tmp</span>: <span class="nc">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span>::<span class="n">ptr</span>::<span class="n">read</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">std</span>::<span class="n">ptr</span>::<span class="n">write</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">std</span>::<span class="n">ptr</span>::<span class="n">read</span><span class="p">(</span><span class="n">y</span><span class="p">));</span><span class="w"> </span><span class="c1">// overwrites `*x`, `T: Overwrite` required
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">std</span>::<span class="n">ptr</span>::<span class="n">write</span><span class="p">(</span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">tmp</span><span class="p">);</span><span class="w"> </span><span class="c1">// overwrites `*y`, `T: Overwrite` required
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="overwrite-requires-sized"><code>Overwrite</code> requires <code>Sized</code></h3>
<p>The <code>Overwrite</code> trait requires <code>Sized</code> because, for <code>*x = /* new value */</code> to be safe, the compiler needs to ensure that the place <code>*x</code> has enough space to store “new value”, and that is only possible when the size of the new value is known at compilation time (i.e., the type implements <code>Sized</code>).</p>
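<p>To make the bounds above concrete, here is a minimal sketch that runs on today&rsquo;s Rust. It assumes a user-defined marker trait named <code>Overwrite</code> standing in for the proposed one; the helper <code>overwrite</code> is hypothetical and simply models <code>*place = value</code>, since today&rsquo;s compiler does not check the bound on plain assignments.</p>

```rust
/// Hypothetical stand-in for the proposed `Overwrite` marker trait.
/// `Sized` is a supertrait, mirroring the `Overwrite: Sized` requirement.
trait Overwrite: Sized {}

impl Overwrite for u32 {}
impl Overwrite for String {}

/// Models `*place = value` on a borrowed place, the operation that would
/// need `T: Overwrite` under the proposal.
fn overwrite<T: Overwrite>(place: &mut T, value: T) {
    *place = value;
}

/// A `swap` whose signature spells out the bound the proposal would add.
fn swap<T: Overwrite>(x: &mut T, y: &mut T) {
    std::mem::swap(x, y);
}

fn main() {
    let mut a = String::from("a");
    let mut b = String::from("b");
    swap(&mut a, &mut b);
    overwrite(&mut a, String::from("c"));
    assert_eq!((a.as_str(), b.as_str()), ("c", "a"));
    // `overwrite::<str>` or `overwrite::<[u32]>` cannot be instantiated:
    // unsized types cannot satisfy the `Sized` supertrait.
}
```

<p>Because <code>Sized</code> is a supertrait here, an unsized type can never satisfy the bound, which is the same conclusion the paragraph above reaches.</p>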
<h3 id="overwrite-only-applies-to-borrowed-values"><code>Overwrite</code> only applies to borrowed values</h3>
<p>The <code>Overwrite</code> trait is only needed when assigning to a borrowed place of type <code>T</code>. If that place is owned, the owner is allowed to reassign it, just as they are allowed to drop it. So, for example, the following code compiles whether or not <code>SomeType: Overwrite</code> holds:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="nc">SomeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="cm">/* something */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="cm">/* something else */</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;— does not require that `SomeType: Overwrite` holds
</span></span></span></code></pre></div><h3 id="subtle-overwrite-is-not-infectious">Subtle: <code>Overwrite</code> is not infectious</h3>
<p>Somewhat surprisingly, it is ok to have a struct that implements <code>Overwrite</code> which has fields that do not. Consider the types <code>Foo</code> and <code>Bar</code>, where <code>Foo: Overwrite</code> holds but <code>Bar: Overwrite</code> does not:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Foo</span><span class="p">(</span><span class="n">Bar</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Bar</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Overwrite</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="o">!</span><span class="n">Overwrite</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Bar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The following code would type check:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">foo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">Foo</span><span class="p">(</span><span class="n">Bar</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// OK: Overwriting a borrowed place of type `Foo`
</span></span></span><span class="line"><span class="cl"><span class="c1">// and `Foo: Overwrite` holds.
</span></span></span><span class="line"><span class="cl"><span class="o">*</span><span class="n">foo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Foo</span><span class="p">(</span><span class="n">Bar</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>However, the following code would not:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">foo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">Foo</span><span class="p">(</span><span class="n">Bar</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// ERROR: Overwriting a borrowed place of type `Bar`
</span></span></span><span class="line"><span class="cl"><span class="c1">// but `Bar: Overwrite` does not hold.
</span></span></span><span class="line"><span class="cl"><span class="n">foo</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Bar</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>Types that do not implement <code>Overwrite</code> can therefore still be overwritten in memory, but only as part of overwriting the value in which they are embedded. In the FAQ I show how this non-infectious property preserves expressiveness.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
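<p>The same two cases can be approximated in stable Rust. This is only a sketch: <code>Overwrite</code> is a user-defined marker trait, the missing impl for <code>Bar</code> stands in for <code>impl !Overwrite for Bar</code>, the hypothetical <code>overwrite</code> helper models assignment through a borrow, and <code>Bar</code> is given a <code>u32</code> field (not in the original) so the effect is observable.</p>

```rust
/// Hypothetical stand-in for the proposed `Overwrite` marker trait.
trait Overwrite: Sized {}

#[derive(Debug, PartialEq)]
struct Bar(u32); // no `Overwrite` impl, modeling `impl !Overwrite for Bar`

#[derive(Debug, PartialEq)]
struct Foo(Bar);

// Allowed even though the field type `Bar` is not `Overwrite`.
impl Overwrite for Foo {}

/// Models assignment through a borrowed place.
fn overwrite<T: Overwrite>(place: &mut T, value: T) {
    *place = value;
}

fn main() {
    let mut foo = Foo(Bar(1));

    // OK: the borrowed place has type `Foo`, and `Foo: Overwrite` holds.
    overwrite(&mut foo, Foo(Bar(2)));
    assert_eq!(foo, Foo(Bar(2)));

    // Rejected if uncommented: the borrowed place has type `Bar`,
    // and `Bar: Overwrite` is not satisfied.
    // overwrite(&mut foo.0, Bar(3));
}
```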
<h3 id="who-implements-overwrite">Who implements <code>Overwrite</code>?</h3>
<p>This section walks through which types should implement <code>Overwrite</code>.</p>
<h4 id="copy-implies-overwrite"><code>Copy</code> implies <code>Overwrite</code></h4>
<p>Any type that implements <code>Copy</code> would automatically implement <code>Overwrite</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Overwrite</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>(If you, like me, get nervous when you see blanket impls due to coherence concerns, it’s worth noting that <a href="https://rust-lang.github.io/rfcs/1268-allow-overlapping-impls-on-marker-traits.html">RFC #1268</a> allows for overlapping impls of marker traits, though that RFC is not yet fully implemented nor stable. It’s not terribly relevant at the moment anyway.)</p>
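<p>As a quick check of what the blanket impl buys you, here is a small sketch using a user-defined <code>Overwrite</code> marker trait (an assumption, since the real trait does not exist yet). The <code>is_overwrite</code> helper and the <code>Point</code> type are hypothetical names used only for illustration.</p>

```rust
/// Hypothetical stand-in for the proposed `Overwrite` marker trait.
trait Overwrite: Sized {}

/// The blanket impl from the text: every `Copy` type is `Overwrite`.
impl<T: Copy> Overwrite for T {}

/// Compiles only for types that satisfy the bound.
fn is_overwrite<T: Overwrite>(_: &T) -> bool {
    true
}

#[allow(dead_code)]
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    assert!(is_overwrite(&42u32));
    assert!(is_overwrite(&Point { x: 1, y: 2 }));
    // `&'static str` is `Copy`, so shared references qualify too.
    assert!(is_overwrite(&"shared references are Copy"));
    // Rejected if uncommented: `String` is not `Copy`.
    // is_overwrite(&String::new());
}
```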
<h4 id="pointer-types-are-overwrite">“Pointer” types are <code>Overwrite</code></h4>
<p>Types that represent pointers all implement <code>Overwrite</code> for all <code>T</code>:</p>
<ul>
<li><code>&amp;T</code></li>
<li><code>&amp;mut T</code></li>
<li><code>Box&lt;T&gt;</code></li>
<li><code>Rc&lt;T&gt;</code></li>
<li><code>Arc&lt;T&gt;</code></li>
<li><code>*const T</code></li>
<li><code>*mut T</code></li>
</ul>
<h4 id="dyn-and-other-unsized-types-do-not-implement-overwrite"><code>dyn</code>, <code>[]</code>, and other “unsized” types do not implement <code>Overwrite</code></h4>
<p>Types that do not have a static size, like <code>dyn</code> and <code>[]</code>, do not implement <code>Overwrite</code>. Safe Rust already disallows writing code like <code>*x = …</code> in such cases.</p>
<p>There are ways to do overwrites with unsized types in unsafe code, but they’d have to prove various bounds. For example, overwriting a <code>[u32]</code> value could be ok, but you have to know the length of the data. Similarly, swapping two <code>dyn Value</code> referents can be safe, but you have to know that (a) both dyn values have the same underlying type and (b) that type implements <code>Overwrite</code>.</p>
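<p>The length condition for slices is easy to see even in safe code. The sketch below uses a hypothetical helper, <code>overwrite_slice</code>, that only overwrites when the lengths agree; the standard library&rsquo;s <code>copy_from_slice</code> performs the same check itself and panics on a mismatch.</p>

```rust
/// Overwrites `dst` with the contents of `src`, but only when the two
/// slices have the same length. Returns whether the overwrite happened.
fn overwrite_slice(dst: &mut [u32], src: &[u32]) -> bool {
    if dst.len() != src.len() {
        return false;
    }
    dst.copy_from_slice(src);
    true
}

fn main() {
    let mut buf = [1, 2, 3];
    assert!(overwrite_slice(&mut buf, &[7, 8, 9]));
    assert_eq!(buf, [7, 8, 9]);

    // Length mismatch: nothing is written.
    assert!(!overwrite_slice(&mut buf, &[0]));
    assert_eq!(buf, [7, 8, 9]);
}
```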
<h4 id="structs-and-enums">Structs and enums</h4>
<p>The question of whether structs and enums should implement <code>Overwrite</code> is complicated because of backwards compatibility. I’m going to distinguish two cases: Rust 2021, and Rust Next, which is Rust in some hypothetical future edition (surely not 2024, but maybe the one after that).</p>
<p><strong>Rust 2021.</strong> Struct and enum types in Rust 2021 implement <code>Overwrite</code> by default. Structs could opt-out from <code>Overwrite</code> with an explicit negative impl (<code>impl !Overwrite for S</code>).</p>
<p><strong>Integrating <code>mut</code> fields.</strong> Structs that have opted out from <code>Overwrite</code> require mutable fields to be declared as <code>mut</code>. Fields not declared as <code>mut</code> are immutable. This gives them the nicer borrow check behavior.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></p>
<p><strong>Rust Next.</strong> In some future edition, we can swap the default, with types being <code>!Overwrite</code> by default and having to opt in to enable overwrites. This would make the nice borrow check behavior the default.</p>
<h4 id="futures-and-closures">Futures and closures</h4>
<p>Futures and closures can implement <code>Overwrite</code> iff their captured values implement <code>Overwrite</code>, though in future editions it would be best if they simply do not implement <code>Overwrite</code>.</p>
<h3 id="default-bounds-and-backwards-compatibility">Default bounds and backwards compatibility</h3>
<p>The other big backwards compatibility issue has to do with default bounds. In Rust 2021, every type parameter declared as <code>T</code> implicitly gets a <code>T: Sized</code> bound. We would have to extend that default to be <code>T: Sized + Overwrite</code>. This also applies to associated types in trait definitions and <code>impl Trait</code> types.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<p>Interestingly, type parameters declared as <code>T: ?Sized</code> <em>also</em> opt-out from <code>Overwrite</code>. Why is that? Well, remember that <code>Overwrite: Sized</code>, so if <code>T</code> is not known to be <code>Sized</code>, it cannot be known to be <code>Overwrite</code> either. This is actually a big win. It means that types like <code>&amp;T</code> and <code>Box&lt;T&gt;</code> can work with “non-overwrite” types out of the box.</p>
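<p>Here is a rough illustration of the split, again with a user-defined <code>Overwrite</code> marker trait as an assumption. The hypothetical <code>byte_len</code> function only reads through its reference, so <code>?Sized</code> is enough and it accepts <code>str</code> and slices; the hypothetical <code>reset</code> function assigns through the reference, so it is the kind of code that would keep the bound.</p>

```rust
/// Hypothetical stand-in for the proposed `Overwrite` marker trait.
trait Overwrite: Sized {}
impl Overwrite for u64 {}

/// Only reads through the reference, so `T: ?Sized` suffices. Under the
/// proposal this would also mean `T` is not assumed to be `Overwrite`.
fn byte_len<T: ?Sized>(value: &T) -> usize {
    std::mem::size_of_val(value)
}

/// Assigns through the reference, so it needs the `Overwrite` bound.
fn reset<T: Overwrite + Default>(value: &mut T) {
    *value = T::default();
}

fn main() {
    assert_eq!(byte_len("hello"), 5); // T = str
    assert_eq!(byte_len(&[0u16; 4][..]), 8); // T = [u16]
    assert_eq!(byte_len(&0u64), 8); // T = u64

    let mut n = 9u64;
    reset(&mut n);
    assert_eq!(n, 0);
}
```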
<h4 id="associated-type-bounds-are-annoying-but-perhaps-not-fatal">Associated type bounds are annoying, but perhaps not fatal</h4>
<p>Still, the fact that default bounds apply to associated types and <code>impl Trait</code> is a pain in the neck. For example, it implies that <code>Iterator::Item</code> would require its items to be <code>Overwrite</code>, which would prevent you from authoring iterators that iterate over structs with immutable fields. This can to some extent be overcome by associated type aliases<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup> (we could declare <code>Item</code> to be a “virtual associated type”, mapping to <code>Item2021</code> in older editions, which require <code>Overwrite</code>, and <code>ItemNext</code> in newer ones, which do not).</p>
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="omg-endless-words-what-did-i-just-read">OMG endless words. What did I just read?</h3>
<p>Let me recap!</p>
<ul>
<li>It would be more declarative and create fewer borrow check conflicts if we had users declare their fields as <code>mut</code> when they may be mutated and we were able to assume that non-<code>mut</code> fields will never be mutated.
<ul>
<li>If we were to add this, in the current Rust edition it would obviously be opt-in.</li>
<li>But in a future Rust edition it would become mandatory to declare fields as <code>mut</code> if you want to mutate them.</li>
</ul>
</li>
<li>But to do that, we need to prevent overwrites and swaps. We can do that by introducing a trait, <code>Overwrite</code>, that is required to overwrite a given location.
<ul>
<li>In the current Rust edition, this trait would be added by default to all type parameters, associated types, and <code>impl Trait</code> bounds; it would be implemented by all structs, enums, and unions.</li>
<li>In a future Rust edition, the trait would no longer be the default, and structs, enums, and unions would have to explicitly implement it if they want to be overwritable.</li>
</ul>
</li>
</ul>
<h3 id="this-change-doesnt-seem-worth-it-just-to-get-immutable-fields-is-there-more">This change doesn&rsquo;t seem worth it just to get immutable fields. Is there more?</h3>
<p>But wait, there’s more! Oh, you just said that. Yes, there’s more. I’m going to write a follow-up post showing how opting out from <code>Overwrite</code> eliminates most of the ergonomic pain of using <code>Pin</code>.</p>
<h3 id="in-rust-next-who-would-ever-implement-overwrite-manually">In “Rust Next”, who would ever implement <code>Overwrite</code> manually?</h3>
<p>I said that, in Rust Next, types should be <code>!Overwrite</code> by default and require people to implement <code>Overwrite</code> manually if they want to. But who would ever do that? It’s a good question, because I don’t think there’s very much reason to.</p>
<p>Because <code>Overwrite</code> is not infectious, you can actually make a wrapper type&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[repr(transparent)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ForceOverwrite</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">t</span>: <span class="nc">T</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Overwrite</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ForceOverwrite</span><span class="w"> </span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;and now you can put values of any type <code>X</code> into a <code>ForceOverwrite&lt;X&gt;</code> which can be reassigned.</p>
<p>This pattern allows you to make “local” use of overwrite, for example to implement a sorting algorithm (which has to do a lot of swapping). You could have a <code>sort</code> function that takes an <code>&amp;mut [T]</code> for any <code>T: Ord</code> (<code>Overwrite</code> not required):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">sort</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Ord</span><span class="o">&gt;</span><span class="p">(</span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w">
</span></span></span></code></pre></div><p>Internally, it can safely transmute the <code>&amp;mut [T]</code> to a <code>&amp;mut [ForceOverwrite&lt;T&gt;]</code> and sort <em>that</em>. Note that at no point during that sorting are we moving or overwriting an element while it is borrowed (the slice that owns it is borrowed, but not the elements themselves).</p>
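<p>A minimal sketch of that idea follows. It assumes a user-defined <code>Overwrite</code> marker trait, uses a pointer cast rather than <code>transmute</code> to reinterpret the slice, and picks insertion sort purely for brevity; the <code>swap_at</code> helper is a hypothetical name that spells out where the bound would be needed.</p>

```rust
/// Hypothetical stand-in for the proposed `Overwrite` marker trait.
trait Overwrite: Sized {}

#[repr(transparent)]
struct ForceOverwrite<T> {
    t: T,
}
impl<T> Overwrite for ForceOverwrite<T> {}

/// Swapping overwrites both elements, so this helper asks for `U: Overwrite`.
fn swap_at<U: Overwrite>(items: &mut [U], i: usize, j: usize) {
    items.swap(i, j);
}

/// Insertion sort over any `T: Ord`; `T: Overwrite` is not required.
fn sort<T: Ord>(data: &mut [T]) {
    // SAFETY: `ForceOverwrite<T>` is `repr(transparent)` over `T`, so both
    // slice types have the same layout, and `data` is exclusively borrowed
    // for the whole call.
    let items: &mut [ForceOverwrite<T>] =
        unsafe { &mut *(data as *mut [T] as *mut [ForceOverwrite<T>]) };

    for i in 1..items.len() {
        let mut j = i;
        while j > 0 && items[j - 1].t > items[j].t {
            swap_at(items, j - 1, j);
            j -= 1;
        }
    }
}

fn main() {
    let mut words = vec![
        String::from("pear"),
        String::from("apple"),
        String::from("fig"),
    ];
    sort(&mut words);
    assert_eq!(words, ["apple", "fig", "pear"]);
}
```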
<h3 id="what-is-the-relationship-of-overwrite-and-unpin">What is the relationship of <code>Overwrite</code> and <code>Unpin</code>?</h3>
<p>I’m still puzzling that over myself. I think that <code>Overwrite</code> is “morally the same” as <code>Unpin</code>, but it is much more powerful (and ergonomic) because it is integrated into the behavior of <code>&amp;mut</code> (of course, this comes at the cost of a complex backwards compatibility story).</p>
<p>Let me describe it this way. Types that do not implement <code>Overwrite</code> cannot be overwritten while borrowed, and hence are “pinned for the duration of the borrow”. This has always been true for <code>&amp;T</code>, but it has traditionally not been true for <code>&amp;mut T</code>. We&rsquo;ll see in the next post that <code>Pin&lt;&amp;mut T&gt;</code> basically just extends that guarantee to apply indefinitely.</p>
<p>Compare that to types that do not implement <code>Unpin</code> and hence are “address sensitive”. Such types are pinned for the duration of a <code>Pin&lt;&amp;mut T&gt;</code>. Unlike <code>T: !Overwrite</code> types, they are <em>not</em> pinned by <code>&amp;mut T</code> references, but that’s a bug, not a feature: this is why <code>Pin</code> has to bend over backwards to prevent you from getting your hands on an <code>&amp;mut T</code>.</p>
<p>I’ll explain this more in my next post, of course.</p>
<h4 id="should-overwrite-be-an-auto-trait">Should <code>Overwrite</code> be an auto trait?</h4>
<p>I think not. If we did so, it would lock people into semver hazards in the “Rust Next” edition where <code>mut</code> is mandatory for mutation. Consider a <code>struct Foo { value: u32 }</code> type. This type has not opted into becoming <code>Copy</code>, but it only contains types that are <code>Copy</code> and therefore <code>Overwrite</code>. By <em>auto trait</em> rules it would by default be <code>Overwrite</code>. But that would prevent you from adding a <code>mut</code> field in the future or benefiting from immutable fields. This is why I said the default would just be <code>!Overwrite</code>, no matter the field types.</p>
<h2 id="conclusion">Conclusion</h2>
<p><img src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExd3cxYWNibXp5NnpyaW0xcTMyY3Rhdms3em00cWJjc3Y2NnYzdDJ2cSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9dg/MknHSvehUtqfYMClV6/giphy.gif" alt="Obama Mic Drop"></p>
<p>=)</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>After this grandiose intro, hopefully I won&rsquo;t be printing a retraction of the idea due to some glaring flaw&hellip; eep!&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Whenever I say immutable here, I mean immutable-modulo-<a href="https://doc.rust-lang.org/std/cell/struct.Cell.html"><code>Cell</code></a>, of course. We should probably find another word for that; this is the kind of terminology debt that Rust has bought its way into and I’m not sure of the best way for us to get out!&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Immutable fields don&rsquo;t resolve <em>all</em> inter-function borrow conflicts. To do that, you need something like <a href="https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/">view types</a>. But in my experience they would eliminate many.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>The simple solution — if a struct has <code>mut</code> fields, disallow overwriting it — is basically what C++ does with their <code>const</code> fields. Classes or structs with <code>const</code> fields are more limited in how you can use them. This works in C++ because they don’t wait until post-substitution to check templates for validity.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>I love the <a href="https://jgbm.github.io/eecs762f19/papers/felleisen.pdf">Felleisen definition of “expressiveness”</a>: two language features are equally expressive if one can be converted into the other with only <em>local</em> rewrites, which I generally interpret as “rewrites that don’t affect the function signature (or other abstraction boundary)”.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>We can also make the <code>!Overwrite</code> impl implied by declaring fields <code>mut</code>, of course. This is fine for backwards compatibility, but isn’t the design I would want long-term, since it introduces an odd “step change” where declaring one field as <code>mut</code> implicitly declares all <em>other</em> fields as immutable (and, conversely, deleting the <code>mut</code> keyword from that field has the effect of declaring all fields, including that one, as mutable).&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>The <code>Self</code> type in traits is exempt from the <code>Sized</code> default, and it could be exempt from the <code>Overwrite</code> default as well, unless the trait is declared as <code>Sized</code>.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>Hat tip to TC, who pointed this out to me.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/overwrite-trait" term="overwrite-trait" label="Overwrite trait"/></entry><entry><title type="html">More thoughts on claiming</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/06/26/claim-followup-1/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/06/26/claim-followup-1/</id><published>2024-06-26T00:00:00+00:00</published><updated>2024-06-26T08:20:43-04:00</updated><content type="html"><![CDATA[<p>This is the first of what I think will be several follow-up posts to <a href="https://smallcultfollowing.com/babysteps/blog/2024/06/21/claim-auto-and-otherwise/">&ldquo;Claiming, auto and otherwise&rdquo;</a>. This post is focused on clarifying and tweaking the design I laid out previously in response to some of the feedback I&rsquo;ve gotten. In future posts I want to lay out some of the alternative designs I&rsquo;ve heard.</p>
<h2 id="tldr-people-like-it">TL;DR: People like it</h2>
<p>If there&rsquo;s any one thing I can take away from what I&rsquo;ve heard, it is that people really like the idea of making working with reference counted or cheaply cloneable data more ergonomic than it is today. A lot of people have expressed a lot of excitement.</p>
<p>If you read only one additional thing from the post—well, don&rsquo;t do that, but if you <em>must</em>—read the <a href="#conclusion">Conclusion</a>. It attempts to restate what I was proposing to help make it clear.</p>
<h2 id="clarifying-the-relationship-of-the-traits">Clarifying the relationship of the traits</h2>
<p>I got a few questions about the relationship of the Copy/Clone/Claim traits to one another. I think the best way to show it is with a Venn diagram:</p>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="300" height="300" viewBox="0 0 1080 1080" xml:space="preserve">

  <style>
    .heavy {
      font: 70px sans-serif;
    }
  </style>

<g transform="matrix(1 0 0 1 540 540)" id="ba299d2e-6da0-4fbf-8b73-fe1551fc4ac6"  >
</g>
<g transform="matrix(1 0 0 1 540 540)" id="1a096556-e035-4948-bda7-cdfa086a6429"  >
<rect style="stroke: none; stroke-width: 1; stroke-dasharray: none; stroke-linecap: butt; stroke-dashoffset: 0; stroke-linejoin: miter; stroke-miterlimit: 4; fill: rgb(255,255,255); fill-rule: nonzero; opacity: 1; visibility: hidden;" vector-effect="non-scaling-stroke"  x="-540" y="-540" rx="0" ry="0" width="1080" height="1080" />
</g>
<g transform="matrix(13.37 0 0 13.37 541.67 541.67)" id="88b2d636-c1ba-4f61-ac16-7892c3ee8051"  >
<circle style="stroke: rgb(0,0,0); stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-dashoffset: 0; stroke-linejoin: miter; stroke-miterlimit: 4; fill: rgb(180,180,180); fill-rule: nonzero; opacity: 1;" vector-effect="non-scaling-stroke"  cx="0" cy="0" r="35" />
</g>
<g transform="matrix(6.74 0 0 6.74 404.3 565.37)" id="290c2dfd-0ee6-418a-8711-86389a0e2ecc"  >
<circle style="stroke: rgb(0,0,0); stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-dashoffset: 0; stroke-linejoin: miter; stroke-miterlimit: 4; fill: rgb(0,45,255); fill-rule: nonzero; opacity: 0.7;" vector-effect="non-scaling-stroke"  cx="0" cy="0" r="35" />
</g>
<g transform="matrix(6.49 0 0 6.49 699.6 569.67)" id="9006d9ac-4678-4099-af62-5692884165aa"  >
<circle style="stroke: rgb(0,0,0); stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-dashoffset: 0; stroke-linejoin: miter; stroke-miterlimit: 4; fill: rgb(255,8,177); fill-rule: nonzero; opacity: 0.7;" vector-effect="non-scaling-stroke"  cx="0" cy="0" r="35" />
</g>
<g transform="matrix(1 0 0 1 540 216.07)" style="" id="d417f05c-8db2-4516-a4f3-7dee5f72096f"  >
		<text xml:space="preserve" class="heavy"><tspan x="-80.5" y="27.96" >Clone</tspan></text>
</g>
<g transform="matrix(1 0 0 1 339.71 557.45)" style="" id="e06f084b-f2d6-4909-a40d-da3467e12ca6"  >
		<text xml:space="preserve" class="heavy"><tspan x="-80.48" y="21.99" >Copy</tspan></text>
</g>
<g transform="matrix(1 0 0 1 762.98 569.29)" style="" id="ade70649-ef99-4687-841d-48e71eefc645"  >
		<text xml:space="preserve" class="heavy"><tspan x="-86.65" y="21.99" >Claim</tspan></text>
</g>
</svg>

<ul>
<li>The <code>Clone</code> trait is the most general, representing any way of duplicating the value. There are two important subtraits:
<ul>
<li><code>Copy</code> represents values that can be cloned via memcpy and which lack destructors (&ldquo;plain old data&rdquo;).</li>
<li><code>Claim</code> represents values whose clones are cheap, infallible, and transparent; on the basis of these properties, claims are inserted automatically by the compiler.</li>
</ul>
</li>
</ul>
<p><code>Copy</code> and <code>Claim</code> overlap but do not have a strict hierarchical relationship. Some <code>Claim</code> types (like <code>Rc</code> and <code>Arc</code>) are not &ldquo;plain old data&rdquo;. And while all <code>Copy</code> operations are infallible, some of them fail to meet <code>Claim</code>&rsquo;s other conditions:</p>
<ul>
<li>Copying a large type like <code>[u8; 1024]</code> is not cheap.</li>
<li>Copying a type with interior mutability like <code>Cell&lt;u8&gt;</code> is not transparent.</li>
</ul>
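<p>To make the relationship concrete, here is a small sketch of what a <code>Claim</code> trait could look like as a subtrait of <code>Clone</code>. The trait definition and the impls below are assumptions written for illustration, not a settled API, and the automatic insertion of claims by the compiler is not modeled: the calls are explicit.</p>

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical sketch of the proposed trait: a `Clone` that is cheap,
/// infallible, and transparent.
trait Claim: Clone {
    fn claim(&self) -> Self {
        self.clone()
    }
}

// `Claim` but not `Copy`: reference-counted pointers.
impl<T> Claim for Rc<T> {}
impl<T> Claim for Arc<T> {}

// Both `Copy` and `Claim`: the overlap in the diagram.
impl Claim for u32 {}

fn main() {
    let a = Rc::new(String::from("shared"));
    let b = a.claim();

    // Claiming an `Rc` bumps the count and shares the same allocation.
    assert_eq!(Rc::strong_count(&a), 2);
    assert!(Rc::ptr_eq(&a, &b));

    assert_eq!(7u32.claim(), 7);
}
```

<p>Types like <code>[u8; 1024]</code> or <code>Cell&lt;u8&gt;</code> would simply get no <code>Claim</code> impl in this sketch, even though they are <code>Copy</code>.</p>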
<h2 id="on-heuristics">On heuristics</h2>
<p>One challenge with the <code>Claim</code> trait is that the choice to implement it involves some heuristics:</p>
<ul>
<li>What exactly is <em>cheap?</em> I tried to be specific by saying &ldquo;O(1) and doesn&rsquo;t copy more than a few cache lines&rdquo;, but clearly it will be hard to draw a strict line.</li>
<li>What exactly is <em>infallible?</em> It was pointed out to me that <code>Arc</code> will abort if the ref count overflows (which is one reason why the Rust-for-Linux project <a href="https://rust-for-linux.com/arc-in-the-linux-kernel">rolled their own alternative</a>). And besides, any Rust code can abort on stack overflow. So clearly we need to have some reasonable compromise.</li>
<li>What exactly is <em>transparent?</em> Again, I tried to specify it, but iterator types are an example of types that are <em>technically</em> transparent to copy but where it is nonetheless very confusing to claim them.</li>
</ul>
<p>An aversion to heuristics is the reason we have the current copy/clone split. We couldn&rsquo;t figure out where to draw the line (&ldquo;how much data is too much?&rdquo;) so we decided to simply make it &ldquo;memcpy or custom code&rdquo;. This was a reasonable starting point, but we&rsquo;ve seen that it is imperfect, leading to uncomfortable compromises.</p>
<p>The thing about &ldquo;cheap, infallible, and transparent&rdquo; is that I think it represents <strong>exactly</strong> the criteria that we really want to represent when something can be automatically claimed. And it seems inherent that those criteria are a bit squishy.</p>
<p>One implication of this is that <code>Claim</code> should rarely if ever appear as a bound on a function. Writing <code>fn foo&lt;T: Claim&gt;(t: T)</code> doesn&rsquo;t really feel like it adds a lot of value to me, since, given the heuristic nature of claim, it&rsquo;s going to rule out some uses that may make sense. <a href="https://github.com/eternaleye">eternaleye</a> proposed an <a href="https://github.com/nikomatsakis/babysteps/issues/43">interesting twist</a> on the original proposal, suggesting we introduce stricter versions of <code>Claim</code> for, say, O(1) <code>Clone</code>, although I don&rsquo;t yet see what code would want to use that as a bound either.</p>
<h2 id="infallible-ought-to-be-does-not-unwind-and-we-ought-to-abort-if-it-does">&ldquo;Infallible&rdquo; ought to be &ldquo;does not unwind&rdquo; (and we ought to abort if it does)</h2>
<p>I originally laid out the conditions for claim as &ldquo;cheap, infallible, and transparent&rdquo;, where &ldquo;infallible&rdquo; means &ldquo;cannot panic or abort&rdquo;. But it was pointed out to me that <code>Arc</code> and <code>Rc</code> in the standard library will indeed abort if the ref-count exceeds <code>std::usize::MAX</code>! This obviously can&rsquo;t work, since reference counted values are the prime candidate to implement <code>Claim</code>.</p>
<p>Therefore, I think infallible ought to say that &ldquo;Claim operations should never panic&rdquo;. This almost doesn&rsquo;t need to be said, since panics are <strong>already</strong> meant to represent impossible or extraordinarily unlikely conditions, but it seems worth reiterating since it is particularly important in this case.</p>
<p>In fact, I think we should go further and have the compiler insert an abort if an automatic <code>claim</code> operation does unwind.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> My reasoning here is the same as I gave in my <a href="https://smallcultfollowing.com/babysteps/blog/2024/05/02/unwind-considered-harmful/">post on unwinding</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>:</p>
<ul>
<li>Reasoning about unwinding is already very hard, it becomes nigh impossible if the sources of unwinding are hidden.</li>
<li>It would make for more efficient codegen if the compiler doesn&rsquo;t have to account for unwinding, which would make code using <code>claim()</code> (automatically or explicitly) mildly more efficient than code using <code>clone()</code>.</li>
</ul>
<p>I was originally thinking of the Rust For Linux project when I wrote the wording on infallible, but their requirements around aborting are really orthogonal and much broader than <code>Claim</code> itself. They already don&rsquo;t use the Rust standard library, or most dependencies, because they want to limit themselves to code that treats abort as an absolute last resort. Rather than abort on overflow, their version of reference counting opts simply to leak, for example, and their memory allocators return a <code>Result</code> to account for OOM conditions. I think the <code>Claim</code> trait will work just fine for them whatever we say on this point, as they&rsquo;ll already have to screen for code that meets their more stringent criteria.</p>
<h2 id="clarifying-claim-codegen">Clarifying <code>claim</code> codegen</h2>
<p>In my post, I noted almost in passing that I would expect the compiler to still use memcpy at monomorphization time when it knew that the type being claimed implements <code>Copy</code>. One interesting bit of feedback I got was anecdotal evidence that this will indeed be critical for performance.</p>
<p>To model the semantics I want for <code>claim</code> we would need specialization<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. I&rsquo;m going to use a variant of specialization that <a href="https://github.com/lcnr">lcnr</a> first proposed to me; the idea is to have an <code>if impl</code> expression that, at monomorphization time, either takes the <code>true</code> path (if the type implements <code>Foo</code> via <a href="https://smallcultfollowing.com/babysteps/blog/2018/02/09/maximally-minimal-specialization-always-applicable-impls/">always applicable</a> impls) or the <code>false</code> path (otherwise). This is a cleaner formulation for specialization when the main thing you want to do is provide more optimized or alternative implementations.</p>
<p>Using that, we could write a function <code>use_claim_value</code> that defines the code the compiler should insert:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">use_claim_value</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Claim</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">panic</span>::<span class="n">catch_unwind</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">T</span>: <span class="nb">Copy</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// Copy T if we can
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="o">*</span><span class="n">t</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// Otherwise clone
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">t</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}).</span><span class="n">unwrap_or_else</span><span class="p">(</span><span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Do not allow unwinding
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">abort</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This has three important properties:</p>
<ul>
<li>No unwinding, for easier reasoning and better codegen.</li>
<li>Copies if it can.</li>
<li>Always calls <code>clone</code> otherwise.</li>
</ul>
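<p>Since <code>if impl</code> does not exist today, here is a rough sketch of how the &ldquo;no unwinding&rdquo; part could be approximated on stable Rust. The <code>Claim</code> trait is a stand-in for the proposed trait (it is not in the standard library), the <code>AbortOnUnwind</code> guard is a hypothetical helper name, and the <code>Copy</code> fast path is left out because stable Rust cannot express that branch:</p>

```rust
use std::rc::Rc;

// Stand-in for the proposed trait; this is NOT part of the standard library.
trait Claim: Clone {
    fn claim(&self) -> Self {
        self.clone()
    }
}

// Assumption for this sketch: ref-counted handles are claimable.
impl<T> Claim for Rc<T> {}

// Hypothetical helper: aborts the process whenever it is dropped.
// It is only ever dropped here if a panic unwinds past it.
struct AbortOnUnwind;

impl Drop for AbortOnUnwind {
    fn drop(&mut self) {
        std::process::abort();
    }
}

// Stable-Rust approximation of `use_claim_value`. There is no
// `if impl T: Copy` branch, so this always goes through `clone`.
fn use_claim_value<T: Claim>(t: &T) -> T {
    let guard = AbortOnUnwind;
    let value = t.clone();
    // Reached only if `clone` returned normally: disarm the guard.
    std::mem::forget(guard);
    value
}
```

<p>Calling <code>use_claim_value(&amp;rc)</code> on an <code>Rc</code> simply yields a second handle; the guard only matters if <code>clone</code> panics, in which case the process aborts instead of unwinding.</p>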
<h2 id="conclusion">Conclusion</h2>
<h3 id="what-i-really-proposed">What I really proposed</h3>
<p>Effectively I proposed to change what it means to &ldquo;use something by value&rdquo; in Rust. This has always been a kind of awkward concept in Rust without a proper name, but I&rsquo;m talking about what happens to the value <code>x</code> in any of these scenarios:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span>: <span class="nc">SomeType</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Scenario A: passing as an argument
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume</span><span class="p">(</span><span class="n">x</span>: <span class="nc">SomeType</span><span class="p">)</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">consume</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Scenario B: assigning to a new place
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Scenario C: captured by a &#34;move&#34; closure
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">operation</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Scenario D: used in a non-move closure
</span></span></span><span class="line"><span class="cl"><span class="c1">// in a way that requires ownership
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="n">consume</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>No matter which way you do it, the rules today are the same:</p>
<ul>
<li>If <code>SomeType: Copy</code>, then <code>x</code> is <em>copied</em>, and you can go on using it later.</li>
<li>Else, <code>x</code> is <em>moved</em>, and you cannot.</li>
</ul>
<p>I am proposing that, modulo the staging required for backwards compatibility, we change those rules to the following:</p>
<ul>
<li>If <code>SomeType: Claim</code>, then <code>x</code> is <em>claimed</em>, and you can go on using it later.</li>
<li>Else, <code>x</code> is <em>moved</em>, and you cannot.</li>
</ul>
<p>To a first approximation, &ldquo;claiming&rdquo; something means calling <code>x.claim()</code> (which is the same as <code>x.clone()</code>). But in reality we can be more efficient, and the definition I would use is as follows:</p>
<ul>
<li>If the compiler sees <code>x</code> is &ldquo;live&rdquo; (may be used again later), it transforms the use of <code>x</code> to <code>use_claim_value(&amp;x)</code> (<a href="#clarifying-claim-codegen">as defined earlier</a>).</li>
<li>If <code>x</code> is dead, then it is just moved.</li>
</ul>
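<p>To make the live/dead distinction concrete, here is a small hand-desugared sketch. The <code>Claim</code> trait is again a stand-in for the proposed trait, <code>consume</code> and <code>live_then_dead</code> are illustrative names, and the explicit <code>x.claim()</code> call marks where the compiler would insert the claim under this proposal:</p>

```rust
use std::rc::Rc;

// Stand-in for the proposed trait; this is NOT part of the standard library.
trait Claim: Clone {
    fn claim(&self) -> Self {
        self.clone()
    }
}

impl<T> Claim for Rc<T> {}

fn consume(v: Rc<Vec<u32>>) -> usize {
    v.len()
}

// Returns (number of handles after the claim, sum of the two lengths).
fn live_then_dead() -> (usize, usize) {
    let x = Rc::new(vec![1, 2, 3]);

    // `x` is used again below, so it is live here. Under the proposal the
    // user would write `let y = x;` and the compiler would insert the claim.
    let y = x.claim();
    let handles = Rc::strong_count(&x);

    // This is the last use of `x`: it is dead afterwards, so it is just moved.
    let total = consume(y) + consume(x);
    (handles, total)
}
```

<p>Only the first use touches the reference count; the final use is an ordinary move, exactly as it is today.</p>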
<h3 id="why-i-proposed-it">Why I proposed it</h3>
<p>There&rsquo;s a reason I proposed this change in the way that I did. I really value the way Rust handles &ldquo;by value consumption&rdquo; in a consistent way across all those contexts. It fits with Rust&rsquo;s ethos of orthogonal, consistent rules that fit together to make a harmonious, usable whole.</p>
<p>My goal is to retain Rust&rsquo;s consistency while also closing the gaps in the current rule, which neither highlights the things I want to pay attention to (large copies), hides the things I (almost always) don&rsquo;t (reference count increments), nor covers all the patterns I sometimes want (e.g., being able to <code>get</code> and <code>set</code> a <code>Cell&lt;Range&lt;u32&gt;&gt;</code>, which doesn&rsquo;t work today because making <code>Range&lt;u32&gt;: Copy</code> would introduce footguns). My <em>hope</em> is that we can do this in a way that it benefits most every Rust program, whether it be low-level or high-level in nature.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In fact, I wonder if we could extend <a href="https://github.com/rust-lang/rfcs/pull/3288">RFC #3288</a> to apply this retroactively to all operations invoked automatically by the compiler, like <code>Deref</code>, <code>DerefMut</code>, and <code>Drop</code>. Obviously this is technically backwards incompatible, but the benefits here could well be worth it in my view, and the code impacted seems very small (who intentionally panics in <code>Deref</code>?).&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Another blog post for which I ought to post a follow-up!&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Specialization has definitely acquired that &ldquo;vaporware&rdquo; reputation and for good reason—but I still think we can add it! That said, my thinking on the topic has evolved quite a bit. It&rsquo;d be worth another post sometime. /me adds it to the queue.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/claim" term="claim" label="Claim"/><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">Claiming, auto and otherwise</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/06/21/claim-auto-and-otherwise/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/06/21/claim-auto-and-otherwise/</id><published>2024-06-21T00:00:00+00:00</published><updated>2024-06-21T07:21:21-04:00</updated><content type="html"><![CDATA[<p>This blog post proposes adding a third trait, <code>Claim</code>, that would live alongside <code>Copy</code> and <code>Clone</code>. The goal of this trait is to improve Rust&rsquo;s existing split, where types are categorized as either <code>Copy</code> (for <a href="https://en.wikipedia.org/wiki/Passive_data_structure">&ldquo;plain old data&rdquo;</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> that is safe to <code>memcpy</code>) and <code>Clone</code> (for types that require executing custom code or which have destructors). This split has served Rust fairly well but also has some shortcomings that we&rsquo;ve seen over time, including maintenance hazards, performance footguns, and (at times quite significant) ergonomic pain and user confusion.</p>
<h2 id="tldr">TL;DR</h2>
<p>The proposal in this blog post has three phases:</p>
<ol>
<li><strong>Adding a new <code>Claim</code> trait</strong> that refines <code>Clone</code> to identify &ldquo;cheap, infallible, and transparent&rdquo; clones (see below for the definition, but it explicitly excludes allocation). Explicit calls to <code>x.claim()</code> are therefore known to be cheap and easily distinguished from calls to <code>x.clone()</code>, which may not be. This makes code easier to understand and addresses existing maintenance hazards (<a href="#How-did-you-come-up-with-the-name-Claim">obviously we can bikeshed the name</a>).</li>
<li><strong>Modifying the borrow checker to insert calls to <code>claim()</code> when using a value from a place that will be used later.</strong> So given e.g. a variable <code>y: Rc&lt;Vec&lt;u32&gt;&gt;</code>, an assignment like <code>x = y</code> would be transformed to <code>x = y.claim()</code> if <code>y</code> is used again later. This addresses the ergonomic pain and user confusion of reference-counted values in Rust today, especially in connection with closures and async blocks.</li>
<li><strong>Finally, disconnect <code>Copy</code> from &ldquo;moves&rdquo; altogether, first with warnings (in the current edition) and then errors (in Rust 2027).</strong> In short, <code>x = y</code> would move <code>y</code> unless <code>y: Claim</code>. Most <code>Copy</code> types would also be <code>Claim</code>, so this is largely backwards compatible, but it would let us rule out cases like <code>y: [u8; 1024]</code> and also extend <code>Copy</code> to types like <code>Cell&lt;u32&gt;</code> or iterators without the risk of <a href="#Some-things-that-should-implement-Copy-do-not">introducing subtle bugs</a>.</li>
</ol>
<p>For some code, automatically calling <code>claim</code> may be undesirable. For example, some data structure definitions track reference count increments closely. <strong>I propose to address this case by creating an &ldquo;allow-by-default&rdquo; <code>automatic-claim</code> lint that crates or modules can opt into so that all &ldquo;claims&rdquo; can be made explicit</strong>. This is more-or-less the <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/30/profiles/">profile pattern</a>, although I think it&rsquo;s notable here that the set of crates which would want &ldquo;auto-claim&rdquo; do not necessarily fall into neat categories, as I will discuss.</p>
<h2 id="step-1-introducing-an-explicit-claim-trait">Step 1: Introducing an explicit <code>Claim</code> trait</h2>
<p>Quick, reading this code, can you tell me anything about its performance characteristics?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">spawn</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Clone `map` and store it into another variable
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// named `map`. This new variable shadows the original.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// We can now write code that uses `map` and then go on
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// using the original afterwards.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* code using map */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* more code using map */</span><span class="w">
</span></span></span></code></pre></div><p>Short answer: no, you can&rsquo;t, not without knowing the type of <code>map</code>. The call to <code>map.clone()</code> may be cloning a large map or just incrementing a reference count; you can&rsquo;t tell.</p>
<h3 id="one-clone-fits-all-creates-a-maintenance-hazard">One-clone-fits-all creates a maintenance hazard</h3>
<p>When you&rsquo;re in the midst of writing code, you tend to have a good idea whether a given value is &ldquo;cheap to clone&rdquo; or &ldquo;expensive&rdquo;. But this property can change over the lifetime of the code. Maybe <code>map</code> starts out as an <code>Rc&lt;HashMap&lt;K, V&gt;&gt;</code> but is later refactored to <code>HashMap&lt;K, V&gt;</code>. A call to <code>map.clone()</code> will still compile but with very different performance characteristics.</p>
<p>In fact, <code>clone</code> can have an effect on the program&rsquo;s <em>semantics</em> as well. Imagine you have a variable <code>c: Rc&lt;Cell&lt;u32&gt;&gt;</code> and a call <code>c.clone()</code>. Currently this creates another handle to the same underlying cell. But if you refactor <code>c</code> to <code>Cell&lt;u32&gt;</code>, that call to <code>c.clone()</code> is now creating an independent cell. Argh. (We&rsquo;ll see this theme, of the importance of distinguishing interior mutability, come up again later.)</p>
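<p>The semantic shift is easy to reproduce in today&rsquo;s Rust. In this small sketch (the function names are illustrative), the same <code>c.clone()</code> call either shares the cell or duplicates it, depending only on the declared type of <code>c</code>:</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

// `c.clone()` on an `Rc<Cell<u32>>` yields a second handle to the same cell.
fn shared_handle() -> u32 {
    let c: Rc<Cell<u32>> = Rc::new(Cell::new(0));
    let d = c.clone();
    d.set(22);
    // The write through `d` is visible through `c`.
    c.get()
}

// After refactoring to a plain `Cell<u32>`, the same call creates an
// independent cell holding a copy of the contents.
fn independent_copy() -> u32 {
    let c: Cell<u32> = Cell::new(0);
    let d = c.clone();
    d.set(22);
    // The write to `d` does not affect `c`.
    c.get()
}
```

<p>Both functions compile without warnings, which is exactly the hazard: nothing at the call site signals that the meaning changed.</p>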
<h3 id="proposal-an-explicit-claim-trait-distinguishing-cheap-infallible-transparent-clones">Proposal: an explicit <code>Claim</code> trait distinguishing &ldquo;cheap, infallible, transparent&rdquo; clones</h3>
<p>Now imagine we introduced a new trait <code>Claim</code>. This would be a subtrait of <code>Clone</code> that indicates that cloning is:</p>
<ul>
<li><strong>Cheap:</strong> Claiming should complete in O(1) time and avoid copying more than a few cache lines (64-256 bytes on current architectures).</li>
<li><strong>Infallible:</strong> Claim should not encounter failures, even panics or aborts, under any circumstances. <strong>Memory allocation is not allowed</strong>, as it can abort if memory is exhausted.</li>
<li><strong>Transparent:</strong> The old and new value should behave the same with respect to their public API.</li>
</ul>
<p>The trait itself could be defined like so:<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Claim</span>: <span class="nb">Clone</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">claim</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now when I see code calling <code>map.claim()</code>, even without knowing what the type of <code>map</code> is, I can be reasonably confident that this is a &ldquo;cheap clone&rdquo;. Moreover, if my code is refactored so that <code>map</code> is no longer ref-counted, I will start to get compilation errors, letting me decide whether I want to <code>clone</code> here (potentially expensive) or find some other solution.</p>
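<p>As a rough illustration of that refactoring story, the sketch below defines the trait locally and implements it only for <code>Rc</code>. Which types would actually implement <code>Claim</code> is an assumption here, and <code>snapshot_len</code> is a made-up function name:</p>

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Stand-in for the proposed trait; this is NOT part of the standard library.
trait Claim: Clone {
    fn claim(&self) -> Self {
        self.clone()
    }
}

// Assumption for this sketch: `Rc` is claimable, `HashMap` deliberately is not.
impl<T> Claim for Rc<T> {}

fn snapshot_len(map: &Rc<HashMap<String, u32>>) -> usize {
    // Known to be cheap: this only increments a reference count.
    let handle = map.claim();
    handle.len()
}

// If `map` were refactored to a plain `HashMap<String, u32>`, the call
// `map.claim()` would stop compiling (there is no `Claim` impl for
// `HashMap`), whereas `map.clone()` would silently become a deep copy.
```

<p>The compile error at the refactoring point is the whole benefit: the author has to choose between an explicit <code>clone</code> and some other design.</p>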
<h2 id="step-2-claiming-values-in-assignments">Step 2: Claiming values in assignments</h2>
<p>In Rust today, values are moved when accessed unless their type implements the <code>Copy</code> trait. This means (among other things) that given a ref-counted <code>map: Rc&lt;HashMap&lt;K, V&gt;&gt;</code>, using the value <code>map</code> will mean that I can&rsquo;t use <code>map</code> anymore. So e.g. if I do <code>some_operation(map)</code>, that gives my handle to <code>some_operation</code>, preventing me from using it again.</p>
<h3 id="not-all-memcopies-should-be-quiet">Not all memcopies should be &lsquo;quiet&rsquo;</h3>
<p>The intention of this rule is that something as simple as <code>x = y</code> should correspond to a simple operation at runtime (a memcpy, specifically) rather than something extensible. That, I think, is laudable. And yet the current rule in practice has some issues:</p>
<ul>
<li>First, <code>x = y</code> can still result in surprising things happening at runtime. If <code>y: [u8; 1024]</code>, for example, then a few simple calls like <code>process1(y); process2(y);</code> can easily copy large amounts of data (you probably meant to pass that by reference).</li>
<li>Second, seeing <code>x = y.clone()</code> (or even <code>x = y.claim()</code>) is visual clutter, distracting the reader from what&rsquo;s really going on. In most applications, incrementing ref counts is simply not so interesting that it needs to be called out so explicitly.</li>
</ul>
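<p>The first point can be seen in ordinary Rust today. In this sketch, the bodies of <code>process1</code> and <code>process2</code> are invented for illustration (the text only names the calls), and the by-reference variants show what was probably intended:</p>

```rust
// Hypothetical bodies for the `process1`/`process2` calls named in the text.
fn process1(buf: [u8; 1024]) -> usize {
    buf.iter().filter(|&&b| b != 0).count()
}

fn process2(buf: [u8; 1024]) -> u8 {
    buf[0]
}

// By value: `y` is `Copy`, so each call semantically receives its own
// 1 KiB copy, with nothing at the call site to flag it.
fn by_value() -> (usize, u8) {
    let y = [7u8; 1024];
    (process1(y), process2(y))
}

// By reference: the same work, passing only a pointer.
fn count_nonzero(buf: &[u8; 1024]) -> usize {
    buf.iter().filter(|&&b| b != 0).count()
}

fn first(buf: &[u8; 1024]) -> u8 {
    buf[0]
}

fn by_reference() -> (usize, u8) {
    let y = [7u8; 1024];
    (count_nonzero(&y), first(&y))
}
```

<p>An optimizer may well elide some of these copies, but the language gives no guarantee, and the source gives no hint that they exist.</p>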
<h3 id="some-things-that-should-implement-copy-do-not">Some things that should implement <code>Copy</code> do not</h3>
<p>There&rsquo;s a more subtle problem: the current rule means adding <code>Copy</code> impls can create correctness hazards. For example, many iterator types like <code>std::ops::Range&lt;u32&gt;</code> and <code>std::vec::Iter&lt;u32&gt;</code> could well be <code>Copy</code>, in the sense that they are safe to memcpy. And that would be cool, because you could put them in a <code>Cell</code> and then use <code>get</code>/<code>set</code> to manipulate them. But we don&rsquo;t implement <code>Copy</code> for those types because <a href="https://github.com/rust-lang/rust/issues/18045">it would introduce a subtle footgun</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">iter0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">vec</span><span class="p">.</span><span class="n">iter</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">iter1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iter0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">iter1</span><span class="p">.</span><span class="n">next</span><span class="p">();</span><span class="w"> </span><span class="c1">// does not affect `iter0`
</span></span></span></code></pre></div><p>Whether this is surprising or not depends on how well you know Rust &ndash; but definitely it would be clearer if you had to call <code>clone</code> explicitly:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">iter0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">vec</span><span class="p">.</span><span class="n">iter</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">iter1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iter0</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">iter1</span><span class="p">.</span><span class="n">next</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>Similar considerations are the <a href="https://github.com/rust-lang/rust/issues/20813">reason we have not made <code>Cell&lt;u32&gt;</code> implement <code>Copy</code></a>.</p>
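<p>For reference, this is roughly what manipulating an iterator inside a <code>Cell</code> looks like today, given that <code>Range&lt;u32&gt;</code> is not <code>Copy</code>. The helper name <code>next_from</code> is made up for this sketch; it swaps the range out, advances it, and stores it back:</p>

```rust
use std::cell::Cell;
use std::ops::Range;

// Advance a range stored in a `Cell` without `Cell::get`.
fn next_from(cell: &Cell<Range<u32>>) -> Option<u32> {
    // `cell.get()` would not compile here: `Cell::get` requires `T: Copy`,
    // and `Range<u32>` is not `Copy`. Temporarily swap in an empty range.
    let mut range = cell.replace(0..0);
    let item = range.next();
    // Store the advanced range back so the next call continues from here.
    cell.set(range);
    item
}
```

<p>If <code>Range&lt;u32&gt;</code> could be <code>Copy</code> without also being implicitly copied on assignment, the <code>replace</code> dance could become a plain <code>get</code> followed by <code>set</code>.</p>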
<h3 id="the-clonecopy-rules-interact-very-poorly-with-closures">The clone/copy rules interact very poorly with closures</h3>
<p>The biggest source of confusion when it comes to clone/copy, however, is not about assignments like <code>x = y</code> but rather closures and async blocks. Combining ref-counted values with closures is a big stumbling block for new users. This has been true as long as I can remember. Here for example is a <a href="https://youtu.be/U3upi-y2pCk?si=kFEhRB_O_wdMKysC&amp;t=807">2014 talk at Strangeloop</a> in which the speaker devotes considerable time to the &ldquo;accidental complexity&rdquo; (their words, but I agree) they encountered navigating cloning and closures (and, I will note, how the term clone is misleading because it doesn&rsquo;t mean a deep clone). I&rsquo;m sorry to say that the situation they describe hasn&rsquo;t really improved much since then. And, bear in mind, this speaker is a skilled programmer. Now imagine a novice trying to navigate this. Oh boy.</p>
<p>But it&rsquo;s not just beginners who struggle! In fact, there isn&rsquo;t really a <em>convenient</em> way to manage the problem of having to clone a copy of a ref-counted item for a closure&rsquo;s use. At the RustNL unconf, <a href="https://github.com/jkelleyrtp/">Jonathan Kelley</a>, who heads up <a href="https://dioxuslabs.com/">Dioxus Labs</a>, described how, in the CloudFlare codebase, they spent significant time trying to find the most ergonomic way to thread context (and these are not Rust novices).</p>
<p>In that setting, they had a master context object <code>cx</code> that had a number of subsystems, each of which was ref-counted. Before launching a new task, they would hand out handles to the subsystems that task required (they didn&rsquo;t want every task to hold on to the entire context). They ultimately landed on a setup like this, which is still pretty painful:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_io</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">io</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_disk</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">disk</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_health_check</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">health_check</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something</span><span class="p">(</span><span class="n">_io</span><span class="p">,</span><span class="w"> </span><span class="n">_disk</span><span class="p">,</span><span class="w"> </span><span class="n">_health_check</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">})</span><span class="w">
</span></span></span></code></pre></div><p>You can make this (in my opinion) mildly better by leveraging variable shadowing, but even then, it&rsquo;s pretty verbose:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">spawn</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">io</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">io</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">disk</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">disk</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">health_check</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">health_check</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">do_something</span><span class="p">(</span><span class="n">io</span><span class="p">,</span><span class="w"> </span><span class="n">disk</span><span class="p">,</span><span class="w"> </span><span class="n">health_check</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">})</span><span class="w">
</span></span></span></code></pre></div><p>What you <em>really</em> want is to just write something like this, like you would in Swift or Go or most any other modern language:<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">tokio</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_something</span><span class="p">(</span><span class="n">cx</span><span class="p">.</span><span class="n">io</span><span class="p">,</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">disk</span><span class="p">,</span><span class="w"> </span><span class="n">cx</span><span class="p">.</span><span class="n">health_check</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">})</span><span class="w">
</span></span></span></code></pre></div><h3 id="autoclaim-to-the-rescue">&ldquo;Autoclaim&rdquo; to the rescue</h3>
<p>What I propose is to modify the borrow checker to automatically invoke <code>claim</code> as needed. So e.g. an expression like <code>x = y</code> would be automatically converted to <code>x = y.claim()</code> if <code>y</code> will be used again later. And closures that capture variables in their environment would respect auto-claim as well, so <code>move || process(y)</code> would become <code>{ let y = y.claim(); move || process(y) }</code> if <code>y</code> were used again later.</p>
<p>Autoclaim would not apply to the last use of a variable. So <code>x = y</code> only introduces a call to <code>claim</code> if it is needed to prevent an error. This avoids unnecessary reference counting.</p>
<p>Naturally, if the type of <code>y</code> doesn&rsquo;t implement <code>Claim</code>, we would give a suitable error explaining that this is a move and the user should insert a call to <code>clone</code> if they want to make a cloned value.</p>
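<p>To make the rewrite concrete, here is a minimal sketch that spells out by hand what autoclaim would insert. The <code>Claim</code> trait below is a local stand-in written for this example (nothing like it exists in the standard library today), and <code>total</code> and <code>autoclaim_by_hand</code> are hypothetical names.</p>

```rust
use std::rc::Rc;

// Hypothetical stand-in for the proposed `Claim` trait; it is not part of std.
trait Claim: Clone {
    fn claim(&self) -> Self {
        self.clone()
    }
}

// Reference-counted handles are the motivating case: cloning only bumps a count.
impl<T> Claim for Rc<T> {}

fn total(values: Rc<Vec<u32>>) -> u32 {
    values.iter().sum()
}

// Under the proposal the user would write `let x = y;` and `move || total(y)`.
// The body below writes out the claims the compiler would insert.
fn autoclaim_by_hand() -> (u32, usize) {
    let y = Rc::new(vec![1, 2, 3]);

    // `y` is used again below, so neither of these is its last use: claim it.
    let x = y.claim();
    let closure = {
        let y = y.claim();
        move || total(y)
    };

    // Last use of `x`: a plain move, with no reference count traffic.
    let moved = x;
    let sum = closure();
    drop(moved);

    // Every claimed handle has been dropped, so only `y` itself remains.
    (sum, Rc::strong_count(&y))
}
```

<p>Calling <code>autoclaim_by_hand()</code> sums the vector through the claimed handle and ends with a single remaining strong reference, since the claims only ever existed where a later use of <code>y</code> required them.</p>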
<h3 id="support-opt-out-with-an-allow-by-default-lint">Support opt-out with an allow-by-default lint</h3>
<p>There is definitely some code that benefits from having the distinction between <em>moving</em> an existing handle and <em>claiming</em> a new one made explicit. For these cases, what I think we should do is add an &ldquo;allow-by-default&rdquo; <code>automatic_claims</code> lint that triggers whenever the compiler inserts a call to <code>claim</code> on a type that is not <code>Copy</code>. This is a signal that user-supplied code is running.</p>
<p>To aid in discovery, I would consider an <code>automatic-operations</code> lint group for these kinds of &ldquo;almost always useful, but sometimes not&rdquo; conveniences; effectively adopting the <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/30/profiles/">profile pattern</a> I floated at one point, but just by making it a lint group. Crates could then add <code>automatic-operations = &quot;deny&quot;</code> (bikeshed needed) in the <code>[lints]</code> section of their <code>Cargo.toml</code>.</p>
<h2 id="step-3-stop-using-copy-to-control-moves">Step 3. Stop using <code>Copy</code> to control moves</h2>
<p>Adding &ldquo;autoclaim&rdquo; addresses the ergonomic issues around having to call <code>clone</code>, but it still means that anything which is <code>Copy</code> can be, well, copied. As noted before that implies performance footguns (<code>[u8;1024]</code> is probably not something to be copied lightly) and correctness hazards (neither is an iterator).</p>
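<p>As a small illustration of the iterator hazard, here is a sketch using a hypothetical <code>Countdown</code> type that derives <code>Copy</code> only to show the problem. It compiles on stable Rust today, and the silent copy is exactly the kind of thing that is easy to miss in review.</p>

```rust
// Hypothetical iterator type, made `Copy` purely to demonstrate the hazard.
#[derive(Clone, Copy)]
struct Countdown(u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0)
        }
    }
}

fn copy_footgun() -> (usize, Option<u32>) {
    let mut it = Countdown(3);

    // `count` takes `self` by value. Because `Countdown` is `Copy`, this
    // consumes a silent copy and leaves `it` untouched.
    let items_seen = it.count();

    // The original iterator was never advanced, so it starts from the top.
    let next_from_original = it.next();

    (items_seen, next_from_original)
}
```

<p>Here <code>copy_footgun()</code> counts three items and then still gets the first item back from the original, which is rarely what the author of such code intended.</p>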
<p>The real goal should be to disconnect &ldquo;can be memcopied&rdquo; and &ldquo;can be automatically copied&rdquo;<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. Once we have &ldquo;autoclaim&rdquo;, we can do that, thanks to the magic of lints and editions:</p>
<ul>
<li>In Rust 2024 and before, we warn when <code>x = y</code> copies a value that is <code>Copy</code> but not <code>Claim</code>.</li>
<li>In the next Rust edition (Rust 2027, presumably), we make it a hard error so that the rule is just tied to the <code>Claim</code> trait.</li>
</ul>
<p>At codegen time, I would still expect us to guarantee that <code>x = y</code> will memcpy and will not invoke <code>y.claim()</code>, since technically the <code>Clone</code> impl may not have the same behavior; it&rsquo;d be nice if we could extend this guarantee to any call to <code>clone</code>, but I don&rsquo;t know how to do that, and it&rsquo;s a separate problem. Furthermore, the <code>automatic_claims</code> lint would only apply to types that don&rsquo;t implement <code>Copy</code>.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<p>All right, I&rsquo;ve laid out the proposal, let me dive into some of the questions that usually come up.</p>
<h3 id="are-you--nuts">Are you ??!@$!$! nuts???</h3>
<p>I mean, maybe? The Copy/Clone split has been a part of Rust for a long time<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>. But from what I can see in real codebases and daily life, the impact of this change would be a net-positive all around:</p>
<ul>
<li>Most code gets less clutter and less confusing error messages but the same great Rust taste (i.e., no impact on reliability or performance).</li>
<li>Where desired, projects can enable the lint (declaring that they care about performance as a side benefit). Furthermore, they can distinguish calls to <code>claim</code> (cheap, infallible, transparent) from calls to <code>clone</code> (anything goes).</li>
</ul>
<p>What&rsquo;s not to like?</p>
<h3 id="what-kind-of-code-would-denyautomatic_claims">What kind of code would <code>#[deny(automatic_claims)]</code>?</h3>
<p>That&rsquo;s actually an interesting question! At first I thought this would correspond to the &ldquo;high-level, business-logic-oriented code&rdquo; vs &ldquo;low-level systems software&rdquo; distinction, but I am no longer convinced.</p>
<p>For example, I spoke with someone from Rust For Linux who felt that autoclaim would be useful, and it doesn&rsquo;t get more low-level than that! Their basic constraint is that they want to track carefully where memory allocation and other fallible operations occur, and incrementing a reference count is fine.</p>
<p>I think the real answer is &ldquo;I&rsquo;m not entirely sure&rdquo;; we have to wait and see! I suspect it will be a fairly small, specialized set of projects. This is part of why I think this is a good idea.</p>
<h3 id="well-my-code-definitely-wants-to-track-when-ref-counts-are-incremented">Well my code <em>definitely</em> wants to track when ref-counts are incremented!</h3>
<p>I totally get that! And in fact I think this proposal actually <strong>helps</strong> your code:</p>
<ul>
<li>By setting <code>#![deny(automatic_claims)]</code>, you declare up front the fact that reference counts are something you track carefully. OK, I admit not everyone will consider this a pro. Regardless, it&rsquo;s a one-time setup cost.</li>
<li>By distinguishing <code>claim</code> from <code>clone</code>, your project avoids surprising performance footguns (this seems inarguably good).</li>
<li>In the next edition, when we no longer make <code>Copy</code> implicitly copy, you further avoid the footguns associated with that (also inarguably good).</li>
</ul>
<h3 id="is-this-revisiting-rfc-936">Is this revisiting <a href="https://github.com/rust-lang/rfcs/pull/936">RFC 936</a>?</h3>
<p>Ooh, deep cut! <a href="https://github.com/rust-lang/rfcs/pull/936">RFC 936</a> was a proposal to split <code>Pod</code> (memcopyable values) from <code>Copy</code> (implicitly memcopyable values). At the time, <a href="https://github.com/rust-lang/rfcs/pull/936#issuecomment-84036944">we decided not to do this</a>.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup> I am even the one who <a href="https://github.com/rust-lang/rfcs/pull/936#issuecomment-78647601">summarized the reasons</a>. The short version is that we felt it better to have a single trait and lints.</p>
<p>I am definitely offering another alternative aiming at the same problem identified by the RFC. I don&rsquo;t think this means we made the wrong decision at the time. The problem was real, but the proposed solutions were not worth it. This proposal solves the same problems and more, and it has the benefit of ~10 years of experience.<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup> (Also, it&rsquo;s worth pointing out that this RFC came two months before 1.0, and I <em>definitely</em> feel it was right to avoid derailing 1.0 with last-minute changes &ndash; stability without stagnation!)</p>
<h3 id="doesnt-having-these-profile-lints-split-rust">Doesn&rsquo;t having these &ldquo;profile lints&rdquo; split Rust?</h3>
<p>A good question. Certainly on a technical level, there is nothing new here. We&rsquo;ve had lints since forever, and we&rsquo;ve seen that many projects use them in different ways (e.g., customized clippy levels or even &ndash; like the Linux kernel &ndash; a <a href="https://github.com/Rust-for-Linux/klint">dedicated custom linter</a>). An important invariant is that lints define &ldquo;subsets&rdquo; of Rust; they don&rsquo;t change it. <strong>Any given piece of code that compiles always means the same thing.</strong></p>
<p>That said, the <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/30/profiles/">profile pattern</a> <em>does</em> lower the cost of adding syntactic sugar, and I see a &ldquo;slippery slope&rdquo; here. I don&rsquo;t want Rust to fundamentally change its character. We should still be aiming at our core constituency of programs that prioritize performance, reliability, and long-term maintenance.</p>
<h3 id="how-will-we-judge-when-an-ergonomic-change-is-worth-it">How will we judge when an ergonomic change is &ldquo;worth it&rdquo;?</h3>
<p>I think we should write up some design axioms. But it turns out we already have a first draft! Some years back Aaron Turon wrote an astute analysis in the <a href="https://blog.rust-lang.org/2017/03/02/lang-ergonomics.html#how-to-analyze-and-manage-the-reasoning-footprint">&ldquo;ergonomics initiative&rdquo; blog post</a>. He identified three axes to consider:</p>
<blockquote>
<ul>
<li><strong>Applicability</strong>. Where are you allowed to elide implied information? Is there any heads-up that this might be happening?</li>
<li><strong>Power</strong>. What influence does the elided information have? Can it radically change program behavior or its types?</li>
<li><strong>Context-dependence</strong>. How much of do you have to know about the rest of the code to know what is being implied, i.e. how elided details will be filled in? Is there always a clear place to look?</li>
</ul>
</blockquote>
<p>Aaron concluded that <em>&quot;<strong>implicit features should balance these three dimensions</strong>. If a feature is large in one of the dimensions, it&rsquo;s best to strongly limit it in the other two.&quot;</em> In the case of autoclaim, the applicability is high (could happen a lot with no heads up) and the context dependence is medium-to-large (you have to know the types of things and traits they implement). We should therefore limit power, and this is why we put clear guidelines on who should implement <code>Claim</code>. And of course for the cases where that doesn&rsquo;t suffice, the lint can limit the applicability to zero.</p>
<p>I like this analysis. I also want us to consider &ldquo;who will want to opt-out and why&rdquo; and see if there are simple steps (e.g., ruling out allocation) we can take which will minimize that while retaining the feature&rsquo;s overall usefulness.</p>
<h3 id="what-about-explicit-closure-autoclaim-syntax">What about explicit closure autoclaim syntax?</h3>
<p>In a recent lang team meeting Josh raised the idea of annotating closures (and presumably async blocks) with some form of syntax that means &ldquo;they will auto-claim things they capture&rdquo;. I find the concept appealing because I like having an explicit version of automatic syntax; also, projects that deny <code>automatic_claims</code> should have a lightweight alternative for cases where they want to be more explicit. However, I&rsquo;ve not seen any actual specific <em>proposal</em> and I can&rsquo;t think of one myself that seems to carry its weight. So I guess I&rsquo;d say &ldquo;sure, I like it, but I would want it in addition to what is in this blog post, not instead of&rdquo;.</p>
<h3 id="what-about-explicit-closure-capture-clauses">What about explicit closure <em>capture clauses</em>?</h3>
<p>Ah, good question! It&rsquo;s almost like you read my mind! I was going to add to the previous question that I <em>do</em> like the idea of having some syntax for &ldquo;explicit capture clauses&rdquo; on closures.</p>
<p>Today, we just have <code>|| $body</code> (which implicitly captures paths in <code>$body</code> in some mode) and <code>move || $body</code> (which implicitly captures paths in <code>$body</code> by value).</p>
<p>Some years ago I wrote a <a href="https://hackmd.io/@nikomatsakis/SyI0eMFXO?type=view">draft RFC in a hackmd</a> that I still mostly like (I&rsquo;d want to revisit the details). The idea was to expand <code>move</code> to let it be more explicit about what is captured. So <code>move(a, b) || $body</code> would capture <em>only</em> <code>a</code> and <code>b</code> by value (and error if <code>$body</code> references other variables). But <code>move(&amp;a, b) || $body</code> would capture <code>a = &amp;a</code>. And <code>move(a.claim(), b) || $body</code> would capture <code>a = a.claim()</code>.</p>
<p>This is really attacking a different problem, the fact that closure captures have no explicit form, but it also gives a canonical, lighter-weight pattern for &ldquo;claiming&rdquo; values from the surrounding context.</p>
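<p>For comparison, here is roughly how the effect of a capture clause like <code>move(a.claim(), b)</code> has to be written in today&rsquo;s Rust, using a block with shadowing. This is a sketch rather than anything from the draft RFC: it uses <code>Arc::clone</code> because <code>Claim</code> does not exist yet, and the function name is hypothetical.</p>

```rust
use std::sync::Arc;
use std::thread;

fn spawn_with_explicit_captures() -> (usize, usize) {
    let a = Arc::new(String::from("shared"));
    let b = vec![1, 2, 3];

    // The block plays the role of the capture clause: `a` is captured as a
    // fresh handle, while `b` is captured by value and moved into the closure.
    let handle = thread::spawn({
        let a = Arc::clone(&a);
        move || a.len() + b.len()
    });

    let from_thread = handle.join().expect("worker thread panicked");

    // The outer `a` is still usable because only the new handle was moved.
    (from_thread, a.len())
}
```

<p>The shadowing block works, but nothing in the closure&rsquo;s own syntax says which variables it takes or how, which is the gap an explicit capture clause would close.</p>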
<h3 id="how-did-you-come-up-with-the-name-claim">How did you come up with the name <code>Claim</code>?</h3>
<p>I <em>thought</em> <a href="https://github.com/jkelleyrtp/">Jonathan Kelley</a> suggested it to me, but reviewing my notes I see he suggested <code>Capture</code>. Well, that&rsquo;s a good name too. Maybe even a better one! I&rsquo;ve already written this whole damn blog post using the name <code>Claim</code>, so I&rsquo;m not going to go change it now. But I&rsquo;d expect a proper bikeshed before taking any real action.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I love Wikipedia (of course), but using the name <a href="https://en.wikipedia.org/wiki/Passive_data_structure"><em>passive data structure</em></a> (which I have never heard before) instead of <em>plain old data</em> feels very&hellip; well, very <em>Wikipedia</em>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>In point of fact, I would prefer if we could define the <code>claim</code> method as &ldquo;final&rdquo;, meaning that it cannot be overridden by implementations, so that we would have a guarantee that <code>x.claim()</code> and <code>x.clone()</code> are identical. You can do this somewhat awkwardly by defining <code>claim</code> in an extension trait, <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=0eef90f677dc2013e73e6af80a2f7b35">like so</a>, but it&rsquo;d be a bit embarrassing to have that in the standard library.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Interestingly, when I read that snippet, I had a moment where I thought &ldquo;maybe it should be <code>async move { do_something(cx.io.claim(), ...) }</code>?&rdquo;. But of course that won&rsquo;t work, that would be doing the claim <em>in</em> the future, whereas we want to do it <em>before</em>. But it really looks like it should work, and it&rsquo;s good evidence for how non-obvious this can be.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>In effect I am proposing to revisit the decision we made in <a href="https://github.com/rust-lang/rfcs/pull/936#issuecomment-78647601">RFC 936</a>, way back when. Actually, I have more thoughts on this, I&rsquo;ll leave them to a FAQ!&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Oooh, that gives me an idea. It would be nice if in addition to writing <code>x.claim()</code> one could write <code>x.copy()</code> (similar to <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.copied"><code>iter.copied()</code></a>) to explicitly indicate that you are doing a memcpy. Then the compiler rule is basically that it will insert either <code>x.claim()</code> or <code>x.copy()</code> as appropriate for types that implement <code>Claim</code>.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>I&rsquo;ve noticed I&rsquo;m often more willing to revisit long-standing design decisions than others I talk to. I think it comes from having been present when the decisions were made. I know most of them were close calls and often began with &ldquo;let&rsquo;s try this for a while and see how it feels&hellip;&rdquo;. Well, I think it comes from that <em>and</em> a certain predilection for recklessness. 🤘&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>This RFC is so old it predates <a href="https://github.com/rust-lang/rfcbot-rs">rfcbot</a>! Look how informal that comment was. Astounding.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>This seems to reflect the best and worst of Rust decision making. The best because autoclaim represents (to my mind) a nice &ldquo;third way&rdquo; in between two extreme alternatives. The worst because the rough design for autoclaim has been clear for years but it sometimes takes a long time for us to actually act on things. Perhaps that&rsquo;s just the nature of the beast, though.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/claim" term="claim" label="Claim"/><category scheme="https://smallcultfollowing.com/babysteps/series/ergonomic-rc" term="ergonomic-rc" label="Ergonomic RC"/></entry><entry><title type="html">The borrow checker within</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/06/02/the-borrow-checker-within/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/06/02/the-borrow-checker-within/</id><published>2024-06-02T00:00:00+00:00</published><updated>2024-06-02T08:33:48-04:00</updated><content type="html"><![CDATA[<p>This post lays out a 4-part roadmap for the borrow checker that I call &ldquo;the borrow checker within&rdquo;. These changes are meant to help Rust become a better version of itself, enabling patterns of code which feel like they fit within Rust&rsquo;s <em>spirit</em>, but run afoul of the letter of its <em>law</em>. I feel fairly comfortable with the design for each of these items, though work remains to scope out the details. My belief is that a-mir-formality will make a perfect place to do that work.</p>
<h2 id="rusts-spirit-is-mutation-xor-sharing">Rust&rsquo;s <em>spirit</em> is <em>mutation xor sharing</em></h2>
<p>When I refer to the <em>spirit</em> of the borrow checker, I mean the rules of <em>mutation xor sharing</em> that I see as Rust&rsquo;s core design ethos. This basic rule—that when you are mutating a value using the variable <code>x</code>, you should not also be reading that data through a variable <code>y</code>—is what enables Rust&rsquo;s memory safety guarantees and also, I think, contributes to its overall sense of &ldquo;if it compiles, it works&rdquo;.</p>
<p><em>Mutation xor sharing</em> is, in some sense, neither necessary nor sufficient. It&rsquo;s not <em>necessary</em> because there are many programs (like every program written in Java) that share data like crazy and yet still work fine<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. It&rsquo;s also not <em>sufficient</em> in that there are many problems that demand some amount of sharing &ndash; which is why Rust has &ldquo;backdoors&rdquo; like <code>Arc&lt;Mutex&lt;T&gt;&gt;</code>, <code>AtomicU32</code>, and—the ultimate backdoor of them all—<code>unsafe</code>.</p>
<p>But to me the biggest surprise from working on Rust is how often this <em>mutation xor sharing</em> pattern is &ldquo;just right&rdquo;, once you learn how to work with it<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. The other surprise has been seeing the benefits over time: programs written in this style are fundamentally &ldquo;less surprising&rdquo; which, in turn, means they are more maintainable over time.</p>
<p>In Rust today though there are a number of patterns that are rejected by the borrow checker despite fitting the <em>mutation xor sharing</em> pattern. Chipping away at this gap, helping to make the borrow checker&rsquo;s rules a more perfect reflection of <em>mutation xor sharing</em>, is what I mean by <em>the borrow checker within</em>.</p>
<blockquote>
<p>I saw the angel in the marble and carved until I set him free. — Michelangelo</p>
</blockquote>
<h2 id="ok-enough-inspirational-rhetoric-lets-get-to-the-code">OK, enough inspirational rhetoric, let&rsquo;s get to the code.</h2>
<p>Ahem, right. Let&rsquo;s do that.</p>
<h2 id="step-1-conditionally-return-references-easily-with-polonius">Step 1: Conditionally return references easily with “Polonius”</h2>
<p>Rust 2018 introduced <a href="https://rust-lang.github.io/rfcs/2094-nll.html">“non-lexical lifetimes”</a> — this rather cryptic name refers to an extension of the borrow checker so that it understood the control flow within functions much more deeply. This change made using Rust a much more “fluid” experience, since the borrow checker was able to accept a lot more code.</p>
<p>But NLL does not handle one important case<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>: conditionally returning references. Here is the canonical example, taken from Remy&rsquo;s <a href="https://blog.rust-lang.org/inside-rust/2023/10/06/polonius-update.html">Polonius update blog post</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default</span><span class="o">&lt;</span><span class="na">&#39;r</span><span class="p">,</span><span class="w"> </span><span class="n">K</span>: <span class="nc">Hash</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Eq</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Copy</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>: <span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span>: <span class="kp">&amp;</span><span class="na">&#39;r</span> <span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">key</span>: <span class="nc">K</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;r</span> <span class="nc">mut</span><span class="w"> </span><span class="n">V</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  ------ 💥 Gets an error today,
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//            but not with polonius
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">  
</span></span></span></code></pre></div><p><a href="https://blog.rust-lang.org/inside-rust/2023/10/06/polonius-update.html">Remy’s post</a> gives more details about why this occurs and how we plan to fix it. It&rsquo;s mostly accurate except that the timeline has  stretched on more than I’d like (of course). But we are making steady progress these days.</p>
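<p>Until Polonius lands, the usual way around this error is to restructure the function so that no borrow is held across the insertion. The sketch below shows two variants that compile on stable Rust today; <code>get_default_entry</code> is a name made up for this example.</p>

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Check first, then insert, then borrow: no reference from the lookup is
// still live when `insert` needs mutable access to the map.
fn get_default<K: Hash + Eq + Copy, V: Default>(map: &mut HashMap<K, V>, key: K) -> &mut V {
    if !map.contains_key(&key) {
        map.insert(key, V::default());
    }
    map.get_mut(&key).unwrap()
}

// The entry API expresses the same thing with a single lookup.
fn get_default_entry<K: Hash + Eq, V: Default>(map: &mut HashMap<K, V>, key: K) -> &mut V {
    map.entry(key).or_default()
}
```

<p>The first variant pays for an extra hash lookup, and the second only helps when the map type offers an entry-style API, so neither removes the need for the borrow checker to accept the original code.</p>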
<h2 id="step-2-a-syntax-for-lifetimes-based-on-places">Step 2: A syntax for lifetimes based on places</h2>
<p>The next step is to add an explicit syntax for lifetimes based on “place expressions” (e.g., <code>x</code> or <code>x.y</code>). I wrote about this in my post <a href="https://smallcultfollowing.com/babysteps/blog/2024/03/04/borrow-checking-without-lifetimes/">Borrow checking without lifetimes</a>. This is basically taking the formulation that underlies Polonius and adding a syntax.</p>
<p>The idea would be that, in addition to the abstract lifetime parameters we have today, you could reference program variables and even fields as the “lifetime” of a reference. So you could write <code>'x</code> to indicate a value that is “borrowed from the variable <code>x</code>”. You could also write <code>'x.y</code> to indicate that it was borrowed from the field <code>y</code> of <code>x</code>, and even <code>'(x.y, z)</code> to mean borrowed from <em>either</em> <code>x.y</code> or <code>z</code>. For example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">WidgetFactory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">manufacturer</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">model</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WidgetFactory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">new_widget</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">name</span>: <span class="nb">String</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Widget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">name_suffix</span>: <span class="kp">&amp;</span><span class="na">&#39;name</span><span class="w"> </span><span class="kt">str</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">name</span><span class="p">[</span><span class="mi">3</span><span class="o">..</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                       </span><span class="c1">// ----- borrowed from "name"
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">model_prefix</span>: <span class="kp">&amp;</span><span class="na">&#39;self</span><span class="p">.</span><span class="n">model</span><span class="w"> </span><span class="kt">str</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">model</span><span class="p">[</span><span class="o">..</span><span class="mi">2</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                         </span><span class="c1">// ----------- borrowed from "self.model"
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This would make many of the lifetime parameters we write today unnecessary. For example, the classic Polonius example where the function takes a parameter <code>map: &amp;mut HashMap&lt;K, V&gt;</code> and returns a reference into the map can be written as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default</span><span class="o">&lt;</span><span class="n">K</span>: <span class="nc">Hash</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Eq</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Copy</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>: <span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">key</span>: <span class="nc">K</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;map</span> <span class="nc">mut</span><span class="w"> </span><span class="n">V</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//---- &#34;borrowed from the parameter map&#34;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This syntax is more convenient — but I think its bigger impact will be to make Rust more teachable and learnable. Right now, lifetimes are in a tricky place, because</p>
<ul>
<li>they represent a concept (spans of code) that isn’t normal for users to think explicitly about and</li>
<li>they don’t have any kind of syntax.</li>
</ul>
<p>Syntax is useful when learning because it allows you to make everything explicit, which is a critical intermediate step to really internalizing a concept — what boats memorably called the <a href="https://github.com/rust-lang/rfcs/pull/2071#issuecomment-329026602">dialectical ratchet</a>. Anecdotally I’ve been using a “place-based” syntax when teaching people Rust and I’ve found it is much quicker for them to grasp it.</p>
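<p>For comparison, here is a minimal sketch of how the <code>get_default</code> signature has to be written in today&rsquo;s Rust, with a named lifetime parameter standing in for &ldquo;borrowed from <code>map</code>&rdquo;. The lifetime name <code>'m</code> and the entry-API body are my own choices for illustration; the place-based version above elides the body.</p>

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Today's spelling: the lifetime `'m` is declared up front and then used to
// tie the returned reference back to the `map` parameter.
fn get_default<'m, K: Hash + Eq + Copy, V: Default>(
    map: &'m mut HashMap<K, V>,
    key: K,
) -> &'m mut V {
    // The entry API is one way to write the body that the current borrow
    // checker accepts; it looks the key up once and inserts a default
    // value if the key is missing.
    map.entry(key).or_default()
}

fn main() {
    let mut map: HashMap<u32, String> = HashMap::new();
    get_default(&mut map, 7).push_str("seven");
    println!("{:?}", map.get(&7));
}
```

<p>The only job <code>'m</code> does here is to say &ldquo;the result borrows from <code>map</code>&rdquo;, which is exactly what <code>&amp;'map mut V</code> states directly.</p>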
<h2 id="step-3-view-types-and-interprocedural-borrows">Step 3: View types and interprocedural borrows</h2>
<p>The next piece of the plan is <a href="https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/">view types</a>, which are a way to have functions declare which fields they access. Consider a struct like <code>WidgetFactory</code>&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">WidgetFactory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">counter</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">widgets</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Widget</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;which has a helper function <code>increment_counter</code>&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WidgetFactory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">increment_counter</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Today, if we want to iterate over the widgets and occasionally increment the counter with <code>increment_counter</code>, <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=afeb1a8021ab1abf73639ffea0bbcae3">we will encounter an error</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WidgetFactory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">increment_counter</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">count_widgets</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="n">widget</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">widgets</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="n">widget</span><span class="p">.</span><span class="n">should_be_counted</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="bp">self</span><span class="p">.</span><span class="n">increment_counter</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// ^ 💥 Can&#39;t borrow self as mutable
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">//      while iterating over `self.widgets`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem is that the borrow checker operates one function at a time. It doesn&rsquo;t know precisely which fields <code>increment_counter</code> is going to mutate. So it conservatively assumes that <code>self.widgets</code> may be changed, and that&rsquo;s not allowed. There are a number of workarounds today, such as writing a &ldquo;free function&rdquo; that doesn&rsquo;t take <code>&amp;mut self</code> but rather takes references to the individual fields (e.g., <code>counter: &amp;mut usize</code>) or even collecting those references into a &ldquo;view struct&rdquo; (e.g., <code>struct WidgetFactoryView&lt;'a&gt; { widgets: &amp;'a [Widget], counter: &amp;'a mut usize }</code>), but these are non-obvious, annoying, and non-local (they require changing significant parts of your code).</p>
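<p>To make those two workarounds concrete, here is a rough sketch of both in today&rsquo;s Rust. The <code>active</code> field on <code>Widget</code> and the method name <code>count_widgets_via_view</code> are made up for the example; everything else follows the names used above.</p>

```rust
struct Widget {
    active: bool, // hypothetical field, only here to drive the example
}

impl Widget {
    fn should_be_counted(&self) -> bool {
        self.active
    }
}

struct WidgetFactory {
    counter: usize,
    widgets: Vec<Widget>,
}

// Workaround 1: a free function that borrows only the field it needs.
fn increment_counter(counter: &mut usize) {
    *counter += 1;
}

// Workaround 2: a "view struct" bundling references to individual fields.
struct WidgetFactoryView<'a> {
    widgets: &'a [Widget],
    counter: &'a mut usize,
}

impl WidgetFactoryView<'_> {
    fn count_widgets(&mut self) {
        // `self.widgets` is a shared slice reference (and so `Copy`); the
        // loop iterates over a copy and does not keep `self` borrowed.
        for widget in self.widgets {
            if widget.should_be_counted() {
                increment_counter(&mut *self.counter);
            }
        }
    }
}

impl WidgetFactory {
    pub fn count_widgets(&mut self) {
        for widget in &self.widgets {
            if widget.should_be_counted() {
                // Accepted: within one function body the borrow checker can
                // see that `self.widgets` and `self.counter` are disjoint.
                increment_counter(&mut self.counter);
            }
        }
    }

    pub fn count_widgets_via_view(&mut self) {
        let mut view = WidgetFactoryView {
            widgets: &self.widgets,
            counter: &mut self.counter,
        };
        view.count_widgets();
    }
}

fn main() {
    let mut factory = WidgetFactory {
        counter: 0,
        widgets: vec![Widget { active: true }, Widget { active: false }],
    };
    factory.count_widgets();
    factory.count_widgets_via_view();
    println!("counted {} widgets", factory.counter);
}
```

<p>Both versions compile, but notice how much had to move: the helper is no longer a method, and the view struct duplicates the field list of <code>WidgetFactory</code>.</p>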
<p><a href="https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/">View types</a> extend struct types so that instead of just having a type like <code>WidgetFactory</code>, you can have a &ldquo;view&rdquo; on that type that includes only a subset of the fields, like <code>{counter} WidgetFactory</code>. We can use this to modify <code>increment_counter</code> so that it declares that it will only access the field <code>counter</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WidgetFactory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">increment_counter</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">{</span><span class="n">counter</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//               -------------------
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Equivalent to `self: &amp;mut {counter} WidgetFactory`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This allows the compiler to compile <code>count_widgets</code> just fine, since it can see that iterating over <code>self.widgets</code> while modifying <code>self.counter</code> is not a problem.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></p>
<h3 id="view-types-also-address-phased-initialization">View types also address phased initialization</h3>
<p>There is another place where the borrow checker&rsquo;s rules fall short: <em>phased initialization</em>. Rust today follows the functional programming language style of requiring values for all the fields of a struct when it is created. Mostly this is fine, but sometimes you have structs where you want to initialize some of the fields and then invoke helper functions, much like <code>increment_counter</code>, to create the remainder. In this scenario you are stuck, because those helper functions cannot take a reference to the struct since you haven&rsquo;t created the struct yet. The workarounds (free functions, intermediate struct types) are very similar.</p>
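<p>As a small illustration of that workaround in today&rsquo;s Rust, the sketch below computes one field from another before the struct exists. The <code>weight</code> field, the <code>initial_counter</code> helper, and the <code>new</code> constructor are all invented for the example.</p>

```rust
struct Widget {
    weight: usize, // hypothetical field for the example
}

struct WidgetFactory {
    counter: usize,
    widgets: Vec<Widget>,
}

// This helper runs *during* construction, so it cannot take
// `&WidgetFactory`: no such value exists yet. It has to be a free
// function over the pieces that are already available.
fn initial_counter(widgets: &[Widget]) -> usize {
    widgets.iter().filter(|widget| widget.weight > 0).count()
}

impl WidgetFactory {
    fn new(widgets: Vec<Widget>) -> Self {
        // Phase 1: `widgets` is ready; use it to derive `counter`.
        let counter = initial_counter(&widgets);
        // Phase 2: only now can the struct itself be created.
        WidgetFactory { counter, widgets }
    }
}

fn main() {
    let factory = WidgetFactory::new(vec![Widget { weight: 0 }, Widget { weight: 3 }]);
    println!("{} of {} widgets counted", factory.counter, factory.widgets.len());
}
```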
<h3 id="start-with-private-functions-consider-scaling-to-public-functions">Start with private functions, consider scaling to public functions</h3>
<p>View types as described here have limitations. Because the types involve the names of fields, they are not really suitable for public interfaces. They could also be annoying to use in practice because one will have sets of fields that go together that have to be manually copied and pasted. All of this is true, but I think it is something that can be addressed later (e.g., with named groups of fields).</p>
<p>What I&rsquo;ve found is that the majority of times that I want to use view types, it is in <em>private</em> functions. Private methods often do little bits of logic and make use of the struct&rsquo;s internal structure. Public methods in contrast tend to do larger operations and to hide that internal structure from users. This isn&rsquo;t a universal law &ndash; sometimes I have public functions that should be callable concurrently &ndash; but it happens less.</p>
<p>There is also an advantage to the current behavior for public functions in particular: it preserves forward compatibility. Taking <code>&amp;mut self</code> (versus some subset of fields) means that the function can change the set of fields that it uses without affecting its clients. This is not a concern for private functions.</p>
<h2 id="step-4-internal-references">Step 4: Internal references</h2>
<p>Rust today cannot support structs whose fields refer to data owned by another field. This gap is partially closed through crates like <a href="https://crates.io/crates/rental">rental</a> (no longer maintained), though more often by <a href="https://smallcultfollowing.com/babysteps/blog/2015/04/06/modeling-graphs-in-rust-using-vector-indices/">modeling internal references with indices</a>. We also have <code>Pin</code>, which covers the related (but even harder) problem of immobile data.</p>
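<p>To show what the index-based approach looks like in practice, here is a sketch of a struct that owns a string and records byte ranges into it instead of references. The type, its accessor methods, and the toy <code>name: value</code> format with a blank line before the body are all assumptions made up for this example.</p>

```rust
use std::ops::Range;

struct Message {
    text: String,
    // Byte ranges into `text`, standing in for `&str` references.
    headers: Vec<(Range<usize>, Range<usize>)>,
    body: Range<usize>,
}

impl Message {
    // Toy format: `name: value` lines, a blank line, then the body.
    fn parse(text: String) -> Message {
        let mut headers = Vec::new();
        let mut pos = 0;
        while pos < text.len() {
            let line_end = text[pos..].find('\n').map_or(text.len(), |i| pos + i);
            let next = (line_end + 1).min(text.len());
            let line = &text[pos..line_end];
            if line.is_empty() {
                // A blank line ends the headers; the body starts after it.
                pos = next;
                break;
            }
            if let Some(colon) = line.find(": ") {
                headers.push((pos..pos + colon, pos + colon + 2..line_end));
            }
            pos = next;
        }
        let body = pos..text.len();
        Message { text, headers, body }
    }

    // Each accessor has to re-slice `text` to turn a range back into a `&str`.
    fn header(&self, index: usize) -> (&str, &str) {
        let (name, value) = &self.headers[index];
        (&self.text[name.clone()], &self.text[value.clone()])
    }

    fn body(&self) -> &str {
        &self.text[self.body.clone()]
    }
}

fn main() {
    let message = Message::parse("From: niko\nTo: you\n\nhello".to_string());
    println!("{:?}", message.header(0));
    // Ranges are plain offsets, so the value can move to another thread.
    let handle = std::thread::spawn(move || message.body().to_string());
    println!("{}", handle.join().unwrap());
}
```

<p>This works, but nothing in the types ties a range to the string it indexes, and every access has to go back through <code>text</code>.</p>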
<p>I&rsquo;ve been chipping away at a solution to this problem for some time. I won&rsquo;t be able to lay it out in full in this post, but I can sketch what I have in mind, and lay out more details in future posts (I have done some formalization of this, enough to convince myself it works).</p>
<p>As an example, imagine that we have some kind of <code>Message</code> struct consisting of a big string along with several references into that string. You could model that like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Message</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">text</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">headers</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;self</span><span class="p">.</span><span class="n">text</span><span class="w"> </span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;self</span><span class="p">.</span><span class="n">text</span><span class="w"> </span><span class="kt">str</span><span class="p">)</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">body</span>: <span class="kp">&amp;</span><span class="na">&#39;self</span><span class="p">.</span><span class="n">text</span><span class="w"> </span><span class="kt">str</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This message would be constructed in the usual way:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">text</span>: <span class="nb">String</span> <span class="o">=</span><span class="w"> </span><span class="n">parse_text</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">headers</span><span class="p">,</span><span class="w"> </span><span class="n">body</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse_message</span><span class="p">(</span><span class="o">&amp;</span><span class="n">text</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">message</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Message</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">text</span><span class="p">,</span><span class="w"> </span><span class="n">headers</span><span class="p">,</span><span class="w"> </span><span class="n">body</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>where <code>parse_message</code> is some function like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">parse_message</span><span class="p">(</span><span class="n">text</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;text</span><span class="w"> </span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;text</span><span class="w"> </span><span class="kt">str</span><span class="p">)</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&amp;</span><span class="na">&#39;text</span><span class="w"> </span><span class="kt">str</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">headers</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">(</span><span class="n">headers</span><span class="p">,</span><span class="w"> </span><span class="n">body</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Note that <code>Message</code> doesn&rsquo;t have any lifetime parameters &ndash; it doesn&rsquo;t need any, because it doesn&rsquo;t borrow from anything outside of itself. In fact, <code>Message: 'static</code> is true, which means that I could send this <code>Message</code> to another thread:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// A channel of `Message` values:
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">tx</span><span class="p">,</span><span class="w"> </span><span class="n">rx</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span>::<span class="n">sync</span>::<span class="n">mpsc</span>::<span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// A thread to consume those values:
</span></span></span><span class="line"><span class="cl"><span class="n">std</span>::<span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">message</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">rx</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// `message` here has type `Message`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">process</span><span class="p">(</span><span class="n">message</span><span class="p">.</span><span class="n">body</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Produce them:
</span></span></span><span class="line"><span class="cl"><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">message</span>: <span class="nc">Message</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">next_message</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tx</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">message</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="how-far-along-are-each-of-these-ideas">How far along are each of these ideas?</h2>
<p>Roughly speaking&hellip;</p>
<ul>
<li>Polonius &ndash; &lsquo;just&rsquo; engineering</li>
<li>Syntax &ndash; &lsquo;just&rsquo; bikeshedding</li>
<li>View types &ndash; needs modeling, one or two open questions in my mind<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></li>
<li>Internal references &ndash; modeled in some detail for a simplified variant of Rust, have to port to Rust and explain the assumptions I made along the way<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></li>
</ul>
<p>&hellip;in other words, I&rsquo;ve done enough work to convince myself that these designs are practical, but plenty of work remains. :)</p>
<h2 id="how-do-we-prioritize-this-work">How do we prioritize this work?</h2>
<p>Whenever I think about investing in borrow checker ergonomics and usability, I feel a bit guilty. Surely something so fun to think about must be a bad use of my time.</p>
<p>Conversations at RustNL shifted my perspective. When I asked people about pain points, I kept hearing the same few themes arise, especially from people trying to build applications or GUIs.</p>
<p>I now think I had fallen victim to the dreaded “curse of knowledge”, forgetting how frustrating it can be to run into a limitation of the borrow checker and not know how to resolve it.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This post proposes four changes attacking some very long-standing problems:</p>
<ul>
<li><strong>Conditionally returned references</strong>, solved by <a href="https://blog.rust-lang.org/inside-rust/2023/10/06/polonius-update.html">Polonius</a></li>
<li><strong>No or awkward syntax for lifetimes</strong>, solved by an <a href="https://smallcultfollowing.com/babysteps/blog/2024/03/04/borrow-checking-without-lifetimes/">explicit lifetime syntax</a></li>
<li><strong>Helper methods whose body must be inlined</strong>, solved by <a href="https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/">view types</a></li>
<li><strong>Can&rsquo;t &ldquo;package up&rdquo; a value and references into that value</strong>, solved by internal references</li>
</ul>
<p>You may have noticed that these changes build on one another. Polonius remodels borrowing in terms of &ldquo;place expressions&rdquo; (variables, fields). This enables an explicit lifetime syntax, which in turn is a key building block for internal references. View types in turn let us expose helper methods that can operate on &lsquo;partially borrowed&rsquo; (or even partially initialized!) values.</p>
<h3 id="why-these-changes-wont-make-rust-more-complex-or-if-they-do-its-worth-it">Why these changes won&rsquo;t make Rust &ldquo;more complex&rdquo; (or, if they do, it&rsquo;s worth it)</h3>
<p>You might wonder about the impact of these changes on Rust&rsquo;s complexity. Certainly they grow the set of things the type system can express. But in my mind they, like <a href="https://rust-lang.github.io/rfcs/2094-nll.html">NLL</a> before them, fall into that category of changes that will actually make using Rust feel <em>simpler</em> overall.</p>
<p>To see why, put yourself in the shoes of a user today who has written any one of the &ldquo;obviously correct&rdquo; programs we&rsquo;ve seen in this post &ndash; for example, <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=c9f5902084a631a8af5b769c094b69b6">the <code>WidgetFactory</code> code we saw in view types</a>. Compiling this code today gives an error:</p>
<pre tabindex="0"><code>error[E0502]: cannot borrow `*self` as mutable
              because it is also borrowed as immutable
  --&gt; src/lib.rs:14:17
   |
12 | for widget in &amp;self.widgets {
   |               -------------
   |               |
   |               immutable borrow occurs here
   |               immutable borrow later used here
13 |     if widget.should_be_counted() {
14 |         self.increment_counter();
   |         ^^^^^^^^^^^^^^^^^^^^^^^^
   |         |
   |         mutable borrow occurs here
</code></pre><p>Despite all our efforts to render it well, this error is <strong>inherently confusing</strong>. It is not possible to explain why <code>WidgetFactory</code> doesn&rsquo;t work from an &ldquo;intuitive&rdquo; point-of-view because <strong>conceptually it <em>ought</em> to work</strong>; it just runs up against a limit of our type system.</p>
<p>The only way to understand why <code>WidgetFactory</code> doesn&rsquo;t compile is to dive deeper into the engineering details of how the Rust type system functions, and that is precisely the kind of thing people <em>don&rsquo;t</em> want to learn. Moreover, once you&rsquo;ve done that deep dive, what is your reward? At best you can devise an awkward workaround. Yay 🥳.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<p>Now imagine what happens with view types. You still get an error, but now that error can come with a suggestion:</p>
<pre tabindex="0"><code>help: consider declaring the fields
      accessed by `increment_counter` so that
      other functions can rely on that
 7 | fn increment_counter(&amp;mut self) {
   |                      ---------
   |                      |
   |      help: annotate with accessed fields: `&amp;mut {counter} self`
</code></pre><p>You now have two choices. First, you can apply the suggestion and move on &ndash; your code works! Next, at your leisure, you can dig in a bit deeper and understand what&rsquo;s going on. You can learn about the semver hazards that motivate an explicit declaration here.</p>
<p>Yes, you&rsquo;ve learned a new detail of the type system, but you did so <strong>on your schedule</strong> and, where extra annotations were required, they were well-motivated. Yay 🥳!<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></p>
<h3 id="reifying-the-borrow-checker-into-types">Reifying the borrow checker into types</h3>
<p>There is another theme running through here: moving the borrow checker analysis out from the compiler&rsquo;s mind and into types that can be expressed. Right now, all types always represent fully initialized, unborrowed values. There is no way to express a type that captures the state of being in the midst of iterating over something or having moved one or two fields but not all of them. These changes address that gap.<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup></p>
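<p>Here is a tiny sketch of that gap as it shows up in today&rsquo;s Rust, using a made-up <code>Pair</code> struct. The compiler tracks that one field has been moved out, but that state lives only inside the borrow checker: there is no type I could write for &ldquo;a <code>Pair</code> whose <code>left</code> field is gone&rdquo;.</p>

```rust
struct Pair {
    left: String,
    right: String,
}

fn total_len(pair: &Pair) -> usize {
    pair.left.len() + pair.right.len()
}

// Moves one field out, then keeps using the other one.
fn take_left(pair: Pair) -> (String, usize) {
    let left = pair.left; // `pair` is now partially moved
    let right_len = pair.right.len(); // the remaining field is still usable
    // total_len(&pair); // rejected: `pair` is partially moved, and no
    //                   // function signature can accept it in this state
    (left, right_len)
}

fn main() {
    let pair = Pair {
        left: "abc".to_string(),
        right: "de".to_string(),
    };
    println!("total length: {}", total_len(&pair));
    let (left, right_len) = take_left(pair);
    println!("took {left}, other field has length {right_len}");
}
```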
<h3 id="this-conclusion-is-too-long">This conclusion is too long</h3>
<p>I know, I&rsquo;m like Peter Jackson trying to end &ldquo;The Return of the King&rdquo;, I just can&rsquo;t do it! I keep coming up with more things to say. Well, I&rsquo;ll stop now. Have a nice weekend y&rsquo;all.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Well, every program written in Java <em>does</em> share data like crazy, but they do not all work fine. But you get what I mean.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>And I think learning how to work with <em>mutation xor sharing</em> is a big part of what it means to learn Rust.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>NLL as implemented, anyway. The original design was meant to cover conditionally returning references, but the proposed type system was not feasible to implement. Moreover, and I say this as the one who designed it, the formulation in the NLL RFC was not good. It was mind-bending and hard to comprehend. Polonius is much better.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>In fact, view types will also allow us to implement the &ldquo;disjoint closure capture&rdquo; rules from <a href="https://rust-lang.github.io/rfcs/2229-capture-disjoint-fields.html">RFC 2229</a> in a more efficient way. Currently a closure using <code>self.widgets</code> and <code>self.counter</code> will store 2 references, kind of an implicit &ldquo;view struct&rdquo;. Although <a href="https://rust-lang.zulipchat.com/#narrow/stream/189812-t-compiler.2Fwg-rfc-2229/topic/measure.20closure.20sizes">we found this doesn&rsquo;t really affect much code in practice</a>, it still bothers me. With view types they could store 1.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>To me, the biggest open question for view types is how to accommodate &ldquo;strong updates&rdquo; to types. I&rsquo;d like to be able to do <code>let mut wf: {} WidgetFactory = WidgetFactory {}</code> to create a <code>WidgetFactory</code> value that is completely uninitialized and then permit writing (for example) <code>wf.counter = 0</code>. This should update the type of <code>wf</code> to <code>{counter} WidgetFactory</code>. Basically I want to link the information found in types with the borrow checker&rsquo;s notion of what is initialized, but I haven&rsquo;t worked that out in detail.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>As an example, to make this work I&rsquo;m assuming some kind of &ldquo;true deref&rdquo; trait that indicates that <code>Deref</code> yields a reference that remains valid even as the value being deref&rsquo;d moves from place to place. We need a trait much like this for other reasons too.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>That&rsquo;s a sarcastic &ldquo;Yay 🥳&rdquo;, in case you couldn&rsquo;t tell.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>This &ldquo;Yay 🥳&rdquo; is genuine.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>I remember years ago presenting Rust at some academic conference and a friendly professor telling me, &ldquo;In my experience, you always want to get that state into the type system&rdquo;. I think that professor was right, though I don&rsquo;t regret not prioritizing it (always a million things to do, better to ask what is the right next step <em>now</em> than to worry about what step might&rsquo;ve been better in the past). Anyway, I wish I could remember <em>who</em> that was!&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/pinned/yes" term="yes" label="yes"/></entry><entry><title type="html">Unwind considered harmful?</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/05/02/unwind-considered-harmful/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/05/02/unwind-considered-harmful/</id><published>2024-05-02T00:00:00+00:00</published><updated>2024-05-02T12:39:00-04:00</updated><content type="html"><![CDATA[<p>I’ve been thinking a wild thought lately: we should deprecate <code>panic=unwind</code>. Most production users I know either already run with <code>panic=abort</code> or use unwinding in a very limited fashion, basically just to run to <em>cleanup</em>, not to truly <em>recover</em>. Removing unwinding from most cases meanwhile has a number of benefits, allowing us to extend the type system in interesting and potentially very impactful ways. It also removes a common source of subtle bugs. Note that I am not saying we should remove unwinding entirely: that’s not an option, both because of stability and because of Rust’s mission to “deeply integrate” with all kinds of languages and systems.</p>
<h2 id="unwinding-means-all-code-must-be-able-to-stop-at-every-point">Unwinding means all code must be able to stop at every point</h2>
<p>Unwinding puts a “non-local burden” on the language. The fundamental premise of unwinding is that it should be possible for all code to just <strong>stop</strong> execution at any point (or at least at any function call) and then be restarted. <strong>But this is not always possible</strong>. Sometimes code disturbs invariants which must be restored before execution can continue in a reasonable way.</p>
<h2 id="the-impact-of-unwinding-was-supposed-to-be-contained">The impact of unwinding was supposed to be contained</h2>
<p>In Graydon’s initial sketches for Rust’s design, he was very suspicious of unwinding.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> Unwinding introduces implicit control flow that is difficult to reason about. Worse, this control flow doesn’t surface during “normal execution”, it only shows up when things go wrong — this can tend to pile up, making a bad situation worse.</p>
<p>The initial idea was that unwinding would be allowed, but it would always unwind the entire active thread. Moreover, since in very early Rust threads couldn’t share state at all (it was more like Erlang), that limited the damage that a thread could do. It was reasonable to assume that programs could recover.</p>
<h2 id="but-it-escaped-its-bounds">But it escaped its bounds</h2>
<p>Over time, both of the invariants that limited unwinding’s scope proved untenable. Most importantly, we added shared-mutability with types like <code>Mutex</code>. This was necessary to cover the full range of use cases Rust aims to cover, but it meant that it was now possible for threads to leave data in a disturbed state. We added “lock poisoning” to account for that, but it’s an ergonomic annoyance and an imperfect solution, and so libraries like <code>parking_lot</code> have simply removed it.</p>
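<p>To make lock poisoning concrete, here is a minimal sketch using only <code>std</code>. The function name <code>poison_demo</code> and the data are made up for illustration, and it assumes the default <code>panic=unwind</code>. A worker thread panics while holding the lock, and the next reader has to decide what to do with the half-updated value:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::sync::{Arc, Mutex};
use std::thread;

/// Panics while holding the lock, then reports whether the mutex is poisoned
/// and what contents a later reader can still recover.
fn poison_demo() -&gt; (bool, Vec&lt;i32&gt;) {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    let worker = {
        let data = Arc::clone(&amp;data);
        thread::spawn(move || {
            let mut guard = data.lock().unwrap();
            guard.push(4); // the update has started...
            panic!("worker failed mid-update"); // ...but never finishes
        })
    };
    // The panic unwinds the worker thread; `join` reports it as an `Err`.
    assert!(worker.join().is_err());

    let poisoned = data.is_poisoned();
    // `lock` now returns `Err(PoisonError)`; `into_inner` hands back the
    // guard anyway, leaving it to the caller to judge the data.
    let contents = match data.lock() {
        Ok(guard) =&gt; guard.clone(),
        Err(err) =&gt; err.into_inner().clone(),
    };
    (poisoned, contents)
}

fn main() {
    let (poisoned, contents) = poison_demo();
    println!("poisoned = {poisoned}, contents = {contents:?}");
}
</code></pre></div>
<p>Note that poisoning only records that a panic happened while the lock was held; whether the data is still coherent is left entirely to the caller.</p>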
<p>We also added <code>catch_unwind</code>, allowing recovery within a thread. This was meant to be used in libraries like <code>rayon</code> that were simulating many logical threads with one OS thread, but it of course opened the door to “catching” exceptions in other scenarios. We added the idea of <code>UnwindSafe</code> to try and discourage abuse, but (in a familiar theme) it’s an ergonomic annoyance and an imperfect solution, and so many folks would prefer to just remove it.</p>
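<p>As a sketch of that friction, assuming the default <code>panic=unwind</code> and hypothetical helper names (<code>run_step</code>, <code>half_done</code>): a closure that captures an <code>&amp;mut</code> is not <code>UnwindSafe</code>, so callers typically reach for <code>AssertUnwindSafe</code>, and the caught panic can still leave visible state behind.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::panic::{self, AssertUnwindSafe};

/// Runs `step` on `log`, catching any panic. Returns `true` if it completed.
fn run_step(log: &amp;mut Vec&lt;&amp;'static str&gt;, step: fn(&amp;mut Vec&lt;&amp;'static str&gt;)) -&gt; bool {
    // A closure capturing `&amp;mut Vec&lt;_&gt;` is not `UnwindSafe`, so passing it to
    // `catch_unwind` directly is rejected. `AssertUnwindSafe` is the
    // "trust me" wrapper that silences the check.
    panic::catch_unwind(AssertUnwindSafe(|| step(log))).is_ok()
}

/// A step that panics halfway through its update.
fn half_done(log: &amp;mut Vec&lt;&amp;'static str&gt;) {
    log.push("begin");
    panic!("failed before pushing the matching end marker");
}

fn main() {
    let mut log = Vec::new();
    let completed = run_step(&amp;mut log, half_done);
    // The panic was caught, but the half-finished update is still visible.
    println!("completed = {completed}, log = {log:?}");
}
</code></pre></div>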
<h2 id="unwinding-increases-binary-size-and-reduces-optimization-potential">Unwinding increases binary size and reduces optimization potential</h2>
<p>Unwinding is supposed to be a “zero-cost abstraction”, but it’s not really. To start, it requires inserting “landing pads” — basically, the code that will execute when unwinding occurs — which can take up quite a large amount of space in your binary. Folks like Fuchsia have measured binary size improvements of up to 10% by removing unwinding. Second, the need to account for unwinding limits optimizations, because the compiler has to account for more control-flow paths. I don’t have a number for how high of an impact this is, but it’s clearly not zero.</p>
<h2 id="unwinding-puts-limits-on-the-borrow-checker">Unwinding puts limits on the borrow checker</h2>
<p>Accounting for unwinding also requires the borrow checker to be more conservative. Consider for example the function <code>std::mem::swap</code>. It’d be nice if one could write this in safe code:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">swap</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">b</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">tmp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">a</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">b</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">b</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tmp</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This code won’t compile today, because <code>let tmp = *a</code> requires moving out of <code>*a</code>, and <code>a</code> is an <code>&amp;mut</code> reference. That would leave the reference in an “incomplete” state, so we don’t allow it. But is that constraint truly needed? After all, the reference is going to be restored a few lines below…?</p>
<p>The reason the borrow checker does not accept code like the above is due to unwinding. In general, if you move out of an <code>&amp;mut</code>, you leave a hole behind that <strong>MUST</strong> be filled before the function returns. In the function above, it is in fact guaranteed that the hole will be filled before <code>swap</code> returns. But in general there is a very narrow range of code that can safely execute, since any function call (and many other operations besides) can initiate a <code>panic!</code>. And if unwinding occurred, then the code that restores the <code>&amp;mut</code> value would never execute. For this reason, we deemed it not worth the complexity to support moving out of <code>&amp;mut</code> references.</p>
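<p>For contrast, this is roughly how safe code works around the restriction today: never leave a hole. A minimal sketch, where the name <code>swap_with_placeholder</code> and the <code>T: Default</code> bound are assumptions of the example (the real <code>std::mem::swap</code> has no such bound and relies on unsafe code internally):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem;

/// A safe `swap` that never leaves a hole behind `a` or `b`: every move out
/// is paired with a value moved in during the same call.
fn swap_with_placeholder&lt;T: Default&gt;(a: &amp;mut T, b: &amp;mut T) {
    // `mem::take(a)` is shorthand for `mem::replace(a, T::default())`.
    let tmp = mem::take(a);
    // If something unwound at this point, `*a` would hold the placeholder,
    // which is a valid `T`, rather than moved-out memory.
    *a = mem::replace(b, tmp);
}

fn main() {
    let mut x = String::from("left");
    let mut y = String::from("right");
    swap_with_placeholder(&amp;mut x, &amp;mut y);
    println!("x = {x}, y = {y}");
}
</code></pre></div>
<p>The cost is the extra placeholder value and the extra bound, which is exactly what a more permissive borrow checker could avoid.</p>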
<h2 id="unwinding-prevents-code-from-running-to-completion">Unwinding prevents code from running to completion</h2>
<p>If the only costs of unwinding were the ban on moving out of <code>&amp;mut</code> and inflated binary sizes, I would think that it’s probably worth it to keep it. But over time it’s become clear to me that the <code>&amp;mut</code> restriction is just one special case of a more general challenge with unwinding, which is that functions simply cannot rely on running to completion. This creates challenges in a number of areas.</p>
<h3 id="unwinding-makes-unsafe-code-really-hard-to-write">Unwinding makes unsafe code really hard to write</h3>
<p>If you are writing unsafe code, you have to be very careful to account for possible unwinding. And it can occur in a lot of places! Some of them are obvious, such as when the user gives you a closure and you call it. Others are less obvious, such as when you call a trait method like <code>x.clone()</code> where <code>x</code> has some unknown type <code>T: Clone</code>. Others are downright obscure, such as when you execute <code>vec[i] = new_value</code> and <code>vec</code> is a <code>Vec&lt;T&gt;</code> for some unknown type <code>T</code>. That last one will run the destructor on <code>vec[i]</code>, which can panic, and hence can unwind (at least until <a href="https://github.com/rust-lang/rfcs/pull/3288">RFC #3288</a> is accepted). When developing Rayon, I found I could not feasibly track all the places that unwinding could occur, and thus gave up and just added <a href="https://github.com/rayon-rs/rayon/blob/0e8d45dd3e5b62a9ef86fdc754a9b9e3b4f048a8/rayon-core/src/unwind.rs#L24">code to abort if unwinding occurs when I don’t expect it</a>.</p>
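<p>The “abort if unwinding occurs when I don’t expect it” pattern can be sketched with a drop guard. This is an illustration in the spirit of the linked Rayon code, not a copy of it; <code>AbortOnDrop</code> and <code>no_unwind</code> are hypothetical names:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem;
use std::process;

/// Hypothetical guard type: if it is ever dropped, the process aborts.
struct AbortOnDrop;

impl Drop for AbortOnDrop {
    fn drop(&amp;mut self) {
        eprintln!("unexpected unwinding through a critical section; aborting");
        process::abort();
    }
}

/// Runs `f` and aborts the process if `f` panics instead of returning.
fn no_unwind&lt;R&gt;(f: impl FnOnce() -&gt; R) -&gt; R {
    let guard = AbortOnDrop;
    // If `f` panics, unwinding drops `guard`, which aborts.
    let result = f();
    // Normal return: defuse the guard so its destructor never runs.
    mem::forget(guard);
    result
}

fn main() {
    let mut v = vec![1, 2, 3];
    let total: i32 = no_unwind(|| {
        v.push(4);
        v.iter().sum()
    });
    println!("total = {total}");
}
</code></pre></div>
<p>Inside <code>no_unwind</code>, unsafe code can assume it runs to completion, at the price of turning any unexpected panic into a hard abort.</p>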
<h3 id="unwinding-makes-must-move-types-untenable">Unwinding makes <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/16/must-move-types/">Must Move types</a> untenable</h3>
<p>In a previous blog post I wrote about the idea of <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/16/must-move-types/">must move types</a>. I am not sure if this idea is worth it on balance (although I think it might be, it addresses an awful lot of scenarios) but I think it will not be workable with unwinding. And the reason is the same as everything else: the point of a “must move” type is that it must be moved before the fn ends. This effectively means there is some kind of action you must take. But unwinding assumes you can stop the function at any point, so you can never guarantee that this action gets taken (at least, not in a practical sense; in principle you could set up destructors to take the action, but it would be unworkable I think).</p>
<h2 id="unwinding-is-of-course-useful">Unwinding is of course useful</h2>
<p>I’ve been dunking on unwinding, but it is of course useful (although I <em>suspect</em> less broadly than is commonly believed). The most obvious use case is recovering in an “event-driven” sort of process, like a webserver or perhaps a GUI. We’ve all been to websites that dump a stack trace on our screen. Unwinding is one way that you could implement this sort of recovery in Rust. It’s not, however, the <em>only</em> way. We could look into constructs that leverage process-based recovery, for example. And of course unwinding-based recovery is a bit risky, if there is shared state. Plus, in practice, a good many things that become exceptions in Java are <code>Result</code>-return values in Rust.</p>
<p>For me, the key thing here is that virtually every network service I know of ships either with panic=abort or without really leveraging unwinding to <em>recover</em>, just to take cleanup actions and then exit. This could be done with panic=abort and exit handlers.</p>
<p>One other place that uses unwinding is the salsa framework, which uses it to abort cancelled operations in IDEs. It’s useful there because all the code is side-effect free, so we really can unwind without any impact. But we could always find another solution to the problem.</p>
<h2 id="unwinding-is-in-fact-requiredbut-only-in-narrow-places">Unwinding is in fact required…but only in narrow places</h2>
<p>I don’t really think Rust should remove support for unwinding, of course. For one thing, there is backwards compatibility to consider. But for another, I think that Rust ought to have the goal that it ultimately supports any low-level thing you might want to do. There are C++ systems that use exceptions, and Rust ought to interoperate with them. But I don’t think that means the default across all of Rust should be unwinding: it’s more like “something you need in a narrow part of your codebase so you can convert to <code>Result</code>”.</p>
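<p>As a sketch of what such a narrow conversion point might look like for Rust panics (foreign exceptions are a separate matter), here is a hypothetical helper, <code>panic_to_result</code>, that catches a panic at a boundary and hands the caller a <code>Result</code>. It assumes the crate is built with unwinding enabled:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::panic::{self, UnwindSafe};

/// Hypothetical boundary helper: runs `f` and turns a panic into an `Err`.
fn panic_to_result&lt;T, F&gt;(f: F) -&gt; Result&lt;T, String&gt;
where
    F: FnOnce() -&gt; T + UnwindSafe,
{
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually a `&amp;'static str` or a `String`.
        if let Some(msg) = payload.downcast_ref::&lt;&amp;'static str&gt;() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::&lt;String&gt;() {
            msg.clone()
        } else {
            String::from("panic with a non-string payload")
        }
    })
}

fn main() {
    let ok = panic_to_result(|| 21 * 2);
    let err = panic_to_result(|| -&gt; i32 { panic!("boom") });
    println!("{ok:?} {err:?}");
}
</code></pre></div>
<p>Everything above the boundary then deals in ordinary <code>Result</code> values and never needs to think about unwinding.</p>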
<h2 id="conclusion">Conclusion</h2>
<p>I think the argument for deprecating unwinding boils down to this: unwinding purports to make cheap recovery tenable, but it’s not really reliable in the face of shared state. Meanwhile, it puts limits on what we can do in the language, ultimately decreasing reliability (because we can’t guarantee cleanup is done) and ease of use (borrow checker is stricter, APIs that would require cleanup can’t be written).</p>
<p>How could we deprecate it, though? It would basically become part of the ABI, much like C vs C-unwind. It’d be possible to opt-in on a finer-grained basis. In functions that are guaranteed not to have unwinding, the borrow checker could be more permissive, and must-move types could be supported.</p>
<p>I’m definitely tempted to sketch out what deprecating unwinding might look like in more detail. I’d be curious to hear from folks that rely on unwinding to better understand where it is useful, and whether we can find alternatives that meet the need in a more narrowly tailored way!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>For a time, we were exploring an alternative approach to panics called <em>signals</em> that didn&rsquo;t use unwinding at all &ndash; the idea was that, for each error condition, you would expose a hook point (a &ldquo;signal&rdquo;) that users could customize to control what to do in the case of error. This proved a bit too unfamiliar and kind of a pain in practice, and we wound up backing away from it. Today&rsquo;s <a href="https://doc.rust-lang.org/std/panic/fn.set_hook.html">panic hook</a> is sort of a simpler version of that (it doesn&rsquo;t support in-place recovery, but it does enable in-place cleanup).&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Sized, DynSized, and Unsized</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/04/23/dynsized-unsized/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/04/23/dynsized-unsized/</id><published>2024-04-23T00:00:00+00:00</published><updated>2024-04-23T16:51:54-04:00</updated><content type="html"><![CDATA[<p><a href="https://rust-lang.github.io/rfcs/1861-extern-types.html">Extern types</a> have been blocked for an unreasonably long time on a fairly narrow, specialized question: Rust today divides all types into two categories — <em>sized</em>, whose size can be statically computed, and <em>unsized</em>, whose size can only be computed at runtime. But for external types what we really want is a <em>third category</em>, types whose size can never be known, even at runtime (in C, you can model this by defining structs with an unknown set of fields). The problem is that Rust’s <code>?Sized</code> notation does not naturally scale to this third case. I think it’s time we fixed this. At some point I read a proposal — I no longer remember where — that seems like the obvious way forward and which I think is a win on several levels. So I thought I would take a bit of time to float the idea again, explain the tradeoffs I see with it, and explain why I think the idea is a good change.</p>
<h2 id="tldr-write-t-unsized-in-place-of-t-sized-and-sometimes-t-dynsized">TL;DR: write <code>T: Unsized</code> in place of <code>T: ?Sized</code> (and sometimes <code>T: DynSized</code>)</h2>
<p>The basic idea is to deprecate the <code>?Sized</code> notation and instead have a family of <code>Sized</code> supertraits. As today, the default is that every type parameter <code>T</code> gets a <code>T: Sized</code> bound unless the user explicitly chooses one of the other supertraits:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="sd">/// Types whose size is known at compilation time (statically).
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Implemented by (e.g.) `u32`. References to `Sized` types
</span></span></span><span class="line"><span class="cl"><span class="sd">/// are &#34;thin pointers&#34; -- just a pointer.
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Sized</span>: <span class="nc">DynSized</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Types whose size can be computed at runtime (dynamically).
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Implemented by (e.g.) `[u32]` or `dyn Trait`.
</span></span></span><span class="line"><span class="cl"><span class="sd">/// References to these types are &#34;wide pointers&#34;,
</span></span></span><span class="line"><span class="cl"><span class="sd">/// with the extra metadata making it possible to compute the size
</span></span></span><span class="line"><span class="cl"><span class="sd">/// at runtime.
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">DynSized</span>: <span class="nc">Unsized</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Types that may not have a knowable size at all (either statically or dynamically).
</span></span></span><span class="line"><span class="cl"><span class="sd">/// All types implement this, but extern types **only** implement this.
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Unsized</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under this proposal, <code>T: ?Sized</code> notation could be converted to <code>T: DynSized</code> or <code>T: Unsized</code>. <code>T: DynSized</code> matches the current semantics precisely, but <code>T: Unsized</code> is probably what most uses actually want. This is because most users of <code>T: ?Sized</code> never compute the size of <code>T</code> but rather just refer to existing values of <code>T</code> by pointer.</p>
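<p>To ground the thin/wide pointer distinction in today’s Rust, here is a small sketch. <code>runtime_size</code> and <code>pointer_widths</code> are hypothetical helpers. Under the proposal, the bound on <code>runtime_size</code> would presumably be spelled <code>T: DynSized</code>, since it really does compute a size:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem::{size_of, size_of_val};

/// Computes the size of the referent at runtime. With `T = [u32]` the
/// length stored in the wide pointer is what makes this possible.
fn runtime_size&lt;T: ?Sized&gt;(value: &amp;T) -&gt; usize {
    size_of_val(value)
}

/// Returns the size in bytes of a thin reference and of a wide reference.
fn pointer_widths() -&gt; (usize, usize) {
    (size_of::&lt;&amp;u32&gt;(), size_of::&lt;&amp;[u32]&gt;())
}

fn main() {
    let array = [1u32, 2, 3];
    let slice: &amp;[u32] = &amp;array;
    println!("slice occupies {} bytes", runtime_size(slice));
    println!("str occupies {} bytes", runtime_size("hello"));

    // On current compilers the wide reference is a pointer plus a length.
    let (thin, wide) = pointer_widths();
    println!("thin = {thin} bytes, wide = {wide} bytes");
}
</code></pre></div>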
<h3 id="credit-where-credit-is-due">Credit where credit is due?</h3>
<p>For the record, this design is not my idea, but I&rsquo;m not sure where I saw it. I would appreciate a link so I can properly give credit.</p>
<h2 id="why-do-we-have-a-default-t-sized-bound-in-the-first-place">Why do we have a default <code>T: Sized</code> bound in the first place?</h2>
<p>It’s natural to wonder why we have this <code>T: Sized</code> default in the first place. The short version is that Rust would be very annoying to use without it. If the compiler doesn’t know the size of a value at compilation time, it cannot (at least, cannot easily) generate code to do a number of common things, such as store a value of type <code>T</code> on the stack or have structs with fields of type <code>T</code>. This means that a very large fraction of generic type parameters would wind up with <code>T: Sized</code>.</p>
<h2 id="so-why-the-sized-notation">So why the <code>?Sized</code> notation?</h2>
<p>The <code>?Sized</code> notation was the result of a lot of discussion. It satisfied a number of criteria.</p>
<h3 id="-signals-that-the-bound-operates-in-reverse"><code>?</code> signals that the bound operates in reverse</h3>
<p>The <code>?</code> is meant to signal that a bound like <code>?Sized</code> actually works in <strong>reverse</strong> from a normal bound. When you have <code>T: Clone</code>, you are saying “type <code>T</code> <strong>must</strong> implement <code>Clone</code>”. So you are <strong>narrowing</strong> the set of types that <code>T</code> could be: before, it could have been both types that implement <code>Clone</code> and those that do not. After, it can <em>only</em> be types that implement <code>Clone</code>. <code>T: ?Sized</code> does the reverse: before, it can <strong>only</strong> be types that implement <code>Sized</code> (like <code>u32</code>), but after, it can <strong>also</strong> be types that do not (like <code>[u32]</code> or <code>dyn Debug</code>). Hence the <code>?</code>, which can be read as “maybe” — i.e., <code>T</code> is “maybe” Sized.</p>
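<p>A small sketch of this “reverse” behavior, with hypothetical function names: the only difference between the two functions is the <code>?Sized</code>, and it makes the second one accept <em>more</em> types than the first.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::fmt::Debug;

/// Default bound: `T: Sized` is implied, so `T` cannot be `[u32]` or `str`.
fn describe_sized&lt;T: Debug&gt;(value: &amp;T) -&gt; String {
    format!("{value:?}")
}

/// `?Sized` removes the implied bound, so unsized types are accepted too.
fn describe_any&lt;T: ?Sized + Debug&gt;(value: &amp;T) -&gt; String {
    format!("{value:?}")
}

fn main() {
    let numbers = [1u32, 2, 3];
    println!("{}", describe_sized(&amp;numbers)); // T = [u32; 3]
    println!("{}", describe_any(&amp;numbers[..])); // T = [u32]
    println!("{}", describe_any("hi")); // T = str
    // `describe_sized(&amp;numbers[..])` is rejected: `[u32]` is not `Sized`.
}
</code></pre></div>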
<h3 id="-can-be-extended-to-other-default-bounds"><code>?</code> can be extended to other default bounds</h3>
<p>The <code>?</code> notation also scales to other default traits. Although we’ve been reluctant to exercise this ability, we wanted to leave room to add a new default bound. This power will be needed if we ever adopt <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/16/must-move-types/">“must move” types</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> or add a bound like <code>?Leak</code> to signal a value that cannot be leaked.</p>
<h2 id="but--doesnt-scale-well-to-differences-in-degree">But <code>?</code> doesn’t scale well to “differences in degree”</h2>
<p>When we debated the <code>?</code> notation, we thought a lot about extensibility to other <em>orthogonal</em> defaults (like <code>?Leak</code>), but we didn’t consider extending a single dimension (like <code>Sized</code>) to multiple levels. There is no theoretical challenge. In principle we could say…</p>
<ul>
<li><code>T</code> means <code>T: Sized + DynSized</code></li>
<li><code>T: ?Sized</code> drops the <code>Sized</code> default, leaving <code>T: DynSized</code></li>
<li><code>T: ?DynSized</code> drops both, leaving any type <code>T</code></li>
</ul>
<p>…but I personally find that very confusing. To me, saying something “might be statically sized” does not signify that it <em>is</em> dynamically sized.</p>
<h2 id="and--looks-more-magical-than-it-needs-to">And <code>?</code> looks “more magical” than it needs to</h2>
<p>Despite knowing that <code>T: ?Sized</code> operates in reverse, I find that in practice it still <em>feels</em> very much like other bounds. Just like <code>T: Debug</code> gives the function the extra capability of generating debug info, <code>T: ?Sized</code> feels to me like it gives the function an extra capability: the ability to be used on unsized types. This logic is specious, these are different kinds of capabilities, but, as I said, it’s how I find myself thinking about it.</p>
<p>Moreover, even though I know that <code>T: ?Sized</code> “most properly” means “a type that may or may not be Sized”, I find I wind up <em>thinking</em> about it as “a type that is unsized”, just as I think about <code>T: Debug</code> as a “type that is <code>Debug</code>”. Why is that? Well, because <code>?Sized</code> types <em>may</em> be unsized, I have to treat them as if they <em>are</em> unsized &ndash; i.e., refer to them only by pointer. So the fact that they <em>might</em> also be sized isn’t very relevant.</p>
<h2 id="how-would-we-use-these-new-traits">How would we use these new traits?</h2>
<p>So if we adopted the “family of sized traits” proposal, how would we use it? Well, for starters, the <code>size_of</code> methods would no longer be defined as <code>T</code> and <code>T: ?Sized</code>…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">size_of</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">size_of_val</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>… but instead as <code>T</code> and <code>T: DynSized</code> …</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">size_of</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">size_of_val</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">DynSized</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>That said, most uses of <code>?Sized</code> today do not need to compute the size of the value, and would be better translated to <code>Unsized</code>…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Unsized</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">fmt</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">f</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">std</span>::<span class="n">fmt</span>::<span class="n">Formatter</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="option-defaults-could-also-be-disabled-by-supertraits">Option: Defaults could also be disabled by supertraits?</h2>
<p>As an interesting extension to today’s system, we could say that every type parameter <code>T</code> gets an implicit <code>Sized</code> bound unless either…</p>
<ol>
<li>There is an explicit weaker alternative (like <code>T: DynSized</code> or <code>T: Unsized</code>);</li>
<li>Or some other bound <code>T: Trait</code> has an explicit supertrait <code>DynSized</code> or <code>Unsized</code>.</li>
</ol>
<p>This would clarify that trait aliases can be used to disable the <code>Sized</code> default. For example, today, one might create a <code>Value</code> trait that is equivalent to <code>Debug + Hash + Ord</code>, roughly like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Value</span>: <span class="nc">Debug</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Hash</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Ord</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Note that `Self` is the *only* type parameter that does NOT get `Sized` by default
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Hash</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Ord</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Value</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>But what if, in your particular data structure, all values are boxed and hence can be unsized? Today, you have to repeat <code>?Sized</code> everywhere:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Tree</span><span class="o">&lt;</span><span class="n">V</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Value</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">children</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Tree</span><span class="o">&lt;</span><span class="n">V</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">V</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Value</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Tree</span><span class="o">&lt;</span><span class="n">V</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>With this proposal, the <em>explicit</em> <code>Unsized</code> bound could be signaled on the trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Value</span>: <span class="nc">Debug</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Hash</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Ord</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Unsized</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Note that `Self` is the *only* type parameter that does NOT get `Sized` by default
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Unsized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Hash</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Ord</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Value</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>which would mean that</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Tree</span><span class="o">&lt;</span><span class="n">V</span>: <span class="nc">Value</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>would imply <code>V: Unsized</code>.</p>
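<p>For comparison, here is a minimal runnable sketch of the <em>current</em> state of affairs, in which <code>?Sized</code> is repeated on the blanket impl, the struct, and the impl block so that <code>Tree&lt;str&gt;</code> is accepted. The <code>leaf</code>, <code>node_count</code>, and <code>sample_tree</code> names are hypothetical helpers invented for this example; the proposed <code>Unsized</code> trait does not exist in Rust today, so it is not used here.</p>

```rust
use std::fmt::Debug;
use std::hash::Hash;

// Today's alias-style trait: `Self` gets no implicit `Sized` bound.
trait Value: Debug + Hash + Ord {}

// The blanket impl has to opt out of `Sized` explicitly.
impl<T: ?Sized + Debug + Hash + Ord> Value for T {}

// `?Sized` must be repeated wherever `V` is declared.
struct Tree<V: ?Sized + Value> {
    value: Box<V>,
    children: Vec<Tree<V>>,
}

impl<V: ?Sized + Value> Tree<V> {
    // Hypothetical constructor for a node without children.
    fn leaf(value: Box<V>) -> Self {
        Tree { value, children: Vec::new() }
    }

    // Counts this node plus all of its descendants.
    fn node_count(&self) -> usize {
        1 + self.children.iter().map(|c| c.node_count()).sum::<usize>()
    }
}

// Hypothetical helper: a two-node tree whose values are unsized `str`s.
fn sample_tree() -> Tree<str> {
    let mut root: Tree<str> = Tree::leaf(Box::from("root"));
    root.children.push(Tree::leaf(Box::from("child")));
    root
}

fn main() {
    let tree = sample_tree();
    println!("{:?} has {} nodes", tree.value, tree.node_count());
}
```

<p>Removing any one of the three <code>?Sized</code> annotations makes <code>Tree&lt;str&gt;</code> stop compiling, which is the repetition the supertrait option is meant to remove.</p>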
<h2 id="alternatives">Alternatives</h2>
<h3 id="different-names">Different names</h3>
<p>The name of the <code>Unsized</code> trait in particular is a bit odd. It means “you can treat this type as unsized”, which is true of all types, but it <em>sounds</em> like the type is <em>definitely</em> unsized. I’m open to alternative names, but I haven’t come up with one I like yet. Here are some alternatives and the problems with them I see:</p>
<ul>
<li><code>Unsizeable</code> — doesn’t meet our typical name conventions, has overlap with the <code>Unsize</code> trait</li>
<li><code>NoSize</code>, <code>UnknownSize</code> — same general problem as <code>Unsize</code></li>
<li><code>ByPointer</code> — in some ways, I kind of like this, because it says “you can work with this type by pointer”, which is clearly true of all types. But it doesn’t align well with the existing <code>Sized</code> trait — what would we call that, <code>ByValue</code>? And it seems too tied to today’s limitations: there are, after all, ways that we can make <code>DynSized</code> types work by value, at least in some places.</li>
<li><code>MaybeSized</code> — just seems awkward, and should it be <code>MaybeDynSized</code>?</li>
</ul>
<p>All told, I think <code>Unsized</code> is the best name. It’s a <em>bit</em> wrong, but I think you can understand it, and to me it fits the intuition I have, which is that I mark type parameters as <code>Unsized</code> and then I tend to just think of them as being unsized (since I have to).</p>
<h3 id="some-sigil">Some sigil</h3>
<p>Under this proposal, the <code>DynSized</code> and <code>Unsized</code> traits are “magic” in that explicitly declaring them as a bound has the impact of disabling a default <code>T: Sized</code> bound. We could signify that in their names by having their name be prefixed with some sort of sigil. I’m not really sure what that sigil would be — <code>T: %Unsized</code>? <code>T: ?Unsized</code>? It all seems unnecessary.</p>
<h3 id="drop-the-implicit-bound-altogether">Drop the implicit bound altogether</h3>
<p>The purist in me is tempted to question whether we need the default bound. Maybe in Rust 2027 we should try to drop it altogether. Then people could write</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">size_of</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Sized</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">size_of_val</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">DynSized</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>and</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">fmt</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">f</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">std</span>::<span class="n">fmt</span>::<span class="n">Formatter</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Of course, it would also mean a lot of <code>Sized</code> bounds cropping up in surprising places. Beyond random functions, consider that every associated type today has a default <code>Sized</code> bound, so you would need</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span>: <span class="nb">Sized</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Overall, I doubt this idea is worth it. Not surprising: it was deemed too annoying before, and now it has the added problem of being hugely disruptive.</p>
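<p>As a point of reference, today&rsquo;s standard library already draws the same line between the two functions, just with the implicit <code>Sized</code> default and a <code>?Sized</code> opt-out. The sketch below uses the real <code>std::mem::size_of</code> and <code>std::mem::size_of_val</code>; <code>bytes_of</code> is a hypothetical wrapper added only for illustration.</p>

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical helper: size of any value behind a reference, sized or not.
// The `?Sized` opt-out is what lets `T` be `str` or `[u32]`.
fn bytes_of<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    // `size_of` relies on the implicit `T: Sized` bound.
    println!("u32: {} bytes", size_of::<u32>());
    // `size_of_val` computes the size at runtime from the reference.
    println!("str: {} bytes", bytes_of("hello"));
    println!("[u32]: {} bytes", bytes_of(&[1_u32, 2, 3][..]));
}
```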
<h2 id="conclusion">Conclusion</h2>
<p>I’ve covered a design to move away from <code>?Sized</code> bounds and towards specialized traits. There are various “pros and cons” to this proposal, but one aspect in particular feels common to this question and many others: when do you make two “similar but different” concepts feel very different — e.g., via special syntax like <code>T: ?Sized</code> — and when do you make them feel very similar — e.g., via the idea of “special traits” where a bound like <code>T: Unsized</code> has extra meaning (disabling defaults)?</p>
<p>There is a definite trade-off here. Distinct syntax helps avoid potential confusion, but it forces people to recognize that something special is going on even when that may not be relevant or important to them. This can deter folks early on, when they are most “deter-able”. I think it can also contribute to a general sense of “big-ness” that makes it feel like understanding the entire language is harder.</p>
<p>Over time, I’ve started to believe that it’s generally better to make things feel similar, letting people push off the time at which they have to learn a new concept. In this case, this lessens my fears around the idea that <code>Unsized</code> and <code>DynSized</code> traits would be confusing because they behave differently than other traits. In this particular case, I also feel that <code>?Sized</code> doesn&rsquo;t &ldquo;scale well&rdquo; to default bounds where you want to pick from one of many options, so it&rsquo;s kind of the worst of both worlds &ndash; distinct syntax that shouts at you but which <em>also</em> fails to add clarity.</p>
<p>Ultimately, though, I’m not wedded to this idea, but I am interested in kicking off a discussion of how we can unblock <a href="https://rust-lang.github.io/rfcs/1861-extern-types.html">extern types</a>. I think by now we&rsquo;ve no doubt covered the space pretty well and we should pick a direction and go for it (or else just give up on <a href="https://rust-lang.github.io/rfcs/1861-extern-types.html">extern types</a>).</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I still think <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/16/must-move-types/">“must move” types</a> are a good idea — but that’s a topic for another post.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Ownership in Rust</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/04/05/ownership-in-rust/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/04/05/ownership-in-rust/</id><published>2024-04-05T00:00:00+00:00</published><updated>2024-04-05T12:22:59-04:00</updated><content type="html"><![CDATA[<p>Ownership is an important concept in Rust — but I’m not talking about the type system. I’m talking about in our open source project. One of the big failure modes I’ve seen in the Rust community, especially lately, is the feeling that it’s unclear who is entitled to make decisions. Over the last six months or so, I’ve been developing a <a href="https://hackmd.io/@nikomatsakis/ByFkzn_10">project goals proposal</a>, which is an attempt to reinvigorate Rust’s roadmap process — and a key part of this is the idea of giving each goal an <strong>owner</strong>. I wanted to write a post just exploring this idea of being an owner: what it means and what it doesn’t.</p>
<h2 id="every-goal-needs-an-owner">Every goal needs an owner</h2>
<p>Under my proposal, the project will identify its top priority goals, and every goal will have a designated <strong>owner</strong>. This is ideally a single, concrete person, though it <em>can</em> be a small group. Owners are the ones who, well, own the design being proposed. Just like in Rust, when they own something, they have the power to change it.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>Just because owners own the design does not mean they work alone. Like any good Rustacean, they should <a href="https://lang-team.rust-lang.org/decision_process.html?highlight=treasure%20dissent#prioritized-principles-of-rust-team-consensus-decision-making">treasure dissent</a>, making sure that when a concern is raised, the owner fully understands it and does what they can to mitigate or address it. But there always comes a point where the tradeoffs have been laid on the table, <a href="https://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/">the space has been mapped</a>, and somebody just has to make a call about what to do. This is where the owner comes in. Under project goals, the owner is the one we’ve chosen to do that job, and they should feel free to make decisions in order to keep things moving.</p>
<h2 id="teams-make-the-final-decision">Teams make the final decision</h2>
<p>Owners own the <strong>proposal</strong>, but they don’t decide whether the proposal gets accepted. That is the job of the <strong>team</strong>. So, if e.g. the goal in question requires making a change to the language, the language design team is the one that ultimately decides whether to accept the proposal.</p>
<p>Teams can ultimately overrule an owner: they can ask the owner to come back with a modified proposal that weighs the tradeoffs differently. This is right and appropriate, because teams are the ones we recognize as having the best broad understanding of the domain they maintain.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> But teams should use their power judiciously, because the owner is typically the one who understands the tradeoffs for this particular goal most deeply.</p>
<h2 id="ownership-is-empowerment">Ownership is empowerment</h2>
<p>Rust’s primary goal is <em>empowerment</em> — and that is as true for the open-source org as it is for the language itself. Our goal should be to <strong>empower people to improve Rust</strong>. That does not mean giving them unfettered ability to make changes — that would result in chaos, not an improved version of Rust — but when their vision is aligned with Rust’s values, we should ensure they have the capability and support they need to realize it.</p>
<h2 id="ownership-requires-trust">Ownership requires trust</h2>
<p>There is an interesting tension around ownership. Giving someone ownership of a goal is an act of faith — it means that we consider them to be an individual of high judgment who understands Rust and its values and will act accordingly. This implies to me that we are unlikely to take a goal if the owner is not known to the project. They don’t necessarily have to have worked on Rust, but they have to have enough of a reputation that we can evaluate whether they’re going to do a good job.</p>
<p>The design of project goal proposals includes steps designed to increase trust. Each goal includes a set of <strong>design axioms</strong> identifying the key tradeoffs that are expected and how they will be weighed against one another. The goal also identifies <strong>milestones</strong>, which show that the author has thought about how to break up and approach the work incrementally.</p>
<p>It’s also worth highlighting that while the project has to trust the owner, the reverse is also true: the project hasn’t always done a good job of making good on its commitments. Sometimes we’ve asked for a proposal on a given feature and then not responded when it arrives.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> Or we set up <a href="https://smallcultfollowing.com/babysteps/blog/2019/07/10/aic-unbounded-queues-and-lang-design/">unbounded queues</a> that wind up getting overfull, resulting in long delays.</p>
<p>The project goal system has steps to build that kind of trust too: the owner identifies exactly the kind of support they expect to require from the team, and the team commits to provide it. Moreover, the general expectation is that any project goal represents an important priority, and so teams should prioritize nominated issues and the like that are related.</p>
<h2 id="trust-requires-accountability">Trust requires accountability</h2>
<p>Trust is something that has to be maintained over time. The primary mechanism for that in the project goal system is <strong>regular reporting</strong>. The idea is that, once we’ve identified a goal, we will create a tracking issue. Bots will prompt owners to give regular status updates on the issue. Then, periodically, we will post a blog post that aggregates these status updates. This gives us a chance to identify goals that haven’t been moving — or at least where no status update has been provided — and take a look to see why.</p>
<p>In my view, it’s <strong>expected and normal that we will not make all our goals</strong>. Things happen. Sometimes owners get busy with other things. Other times, priorities change and what was once a goal no longer seems relevant. That’s fine, but we do want to be explicit about noticing it has happened. The problem is when we let things live in the dark, so that if you want to really know what’s going on, you have to conduct an exhaustive archaeological expedition through github comments, zulip threads, emails, and sometimes random chats and minutes.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Rust has strong values of being an open, participatory language. This is a good thing and a key part of how Rust has gotten as good as it is. <a href="https://nikomatsakis.github.io/rust-latam-2019/#98">Rust’s design does not belong to any one person.</a> A key part of how we enforce that is by making decisions by <strong>consensus</strong>.</p>
<p>But people sometimes get confused and think consensus means that everyone has to agree. This is wrong on two levels:</p>
<ul>
<li><strong>The team must be in consensus, not the RFC thread</strong>: in Rust’s system, it’s the teams that ultimately make the decision. There have been plenty of RFCs that the team decided to accept despite strong opposition from the RFC thread (e.g., the <code>?</code> operator comes to mind). This is right and good. The team has the most context, but the team also gets input from many other sources beyond the people that come to participate in the RFC thread.</li>
<li><strong><a href="https://lang-team.rust-lang.org/decision_process.html?highlight=consensus%20doesn%27t%20mean%20unanimity#prioritized-principles-of-rust-team-consensus-decision-making">Consensus doesn&rsquo;t mean unanimity:</a></strong> Being in consensus means that a majority agrees with the proposal and nobody thinks that it is definitely wrong. Plenty of proposals are decided where team members have significant, even grave, doubts. But ultimately tradeoffs must be made, and the team members trust one another’s judgment, so sometimes proposals go forward that aren’t made the way you would do it.</li>
</ul>
<p>The reality is that every good thing that ever got done in Rust had an owner &ndash; somebody driving the work to completion. But we&rsquo;ve never named those owners explicitly or given them a formal place in our structure. I think it&rsquo;s time we fixed that!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Hat tip to Jack Huey for this turn of phrase. Clever guy.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>There is a common misunderstanding that being on a Rust team for a project X means you are the one authoring code for X. That’s not the role of a team member. Team members hold the overall design of X in their heads. They review changes and mentor contributors who are looking to make a change. Of course, team members do sometimes write code, too, but in that case they are playing the role of a (particularly knowledgeable) contributor.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://github.com/rust-lang/rfcs/pull/2393#issuecomment-810421388">I still feel bad about delegation.</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Borrow checking without lifetimes</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/03/04/borrow-checking-without-lifetimes/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/03/04/borrow-checking-without-lifetimes/</id><published>2024-03-04T00:00:00+00:00</published><updated>2024-03-04T13:29:34-05:00</updated><content type="html"><![CDATA[<p>This blog post explores an alternative formulation of Rust&rsquo;s type system that eschews <em>lifetimes</em> in favor of <em>places</em>. The TL;DR is that instead of having <code>'a</code> represent a <em>lifetime</em> in the code, it can represent a set of <em>loans</em>, like <code>shared(a.b.c)</code> or <code>mut(x)</code>. If this sounds familiar, it should, it&rsquo;s the basis for <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/">polonius</a>, but reformulated as a type system instead of a static analysis. This blog post is just going to give the high-level ideas. In follow-up posts I&rsquo;ll dig into how we can use this to support interior references and other advanced borrowing patterns. In terms of implementation, I&rsquo;ve mocked this up a bit, but I intend to start extending <a href="https://github.com/rust-lang/a-mir-formality">a-mir-formality</a> to include this analysis.</p>
<h2 id="why-would-you-want-to-replace-lifetimes">Why would you want to replace lifetimes?</h2>
<p>Lifetimes are the best and worst part of Rust. The best in that they let you express very cool patterns, like returning a pointer into some data in the middle of your data structure. But they&rsquo;ve got some serious issues. For one, the idea of a lifetime is rather abstract, and hard for people to grasp (&ldquo;what does <code>'a</code> actually represent?&rdquo;). But also Rust is not able to express some important patterns, most notably interior references, where one field of a struct refers to data owned by another field.</p>
<h2 id="so-what-is-a-lifetime-exactly">So what <em>is</em> a lifetime exactly?</h2>
<p>Here is the definition of a lifetime from the RFC on non-lexical lifetimes:</p>
<blockquote>
<p>Whenever you create a borrow, the compiler assigns the resulting reference a lifetime. This lifetime corresponds to the span of the code where the reference may be used. The compiler will infer this lifetime to be the smallest lifetime that it can have that still encompasses all the uses of the reference.</p>
</blockquote>
<p><a href="https://rust-lang.github.io/rfcs/2094-nll.html#what-is-a-lifetime">Read the RFC for more details.</a></p>
<h2 id="replacing-a-lifetime-with-an-origin">Replacing a <em>lifetime</em> with an <em>origin</em></h2>
<p>Under this formulation, <code>'a</code> no longer represents a <em>lifetime</em> but rather an <strong>origin</strong> &ndash; i.e., it explains where the reference may have come from. We define an origin as a <strong>set of loans</strong>. Each loan captures some <strong>place expression</strong> (e.g. <code>a</code> or <code>a.b.c</code>), that has been borrowed along with the mode in which it was borrowed (<code>shared</code> or <code>mut</code>).</p>
<pre tabindex="0"><code>Origin = { Loan }

Loan = shared(Place)
     | mut(Place)

Place = variable(.field)*  // e.g., a.b.c
</code></pre><h2 id="defining-types">Defining types</h2>
<p>Using origins, we can define Rust types roughly like this (obviously I&rsquo;m ignoring a bunch of complexity here&hellip;):</p>
<pre tabindex="0"><code>Type = TypeName &lt; Generic* &gt;
     | &amp; Origin Type
     | &amp; Origin mut Type
     
TypeName = u32 (for now I&#39;ll ignore the rest of the scalars)
         | ()  (unit type, don&#39;t worry about tuples)
         | StructName
         | EnumName
         | UnionName

Generic = Type | Origin
</code></pre><p>Here is the first interesting thing to note: there is no <code>'a</code> notation here! This is because I&rsquo;ve not introduced generics yet. Unlike Rust proper, this formulation of the type system has a concrete syntax (<code>Origin</code>) for what <code>'a</code> represents.</p>
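<p>To make the two grammars above concrete, here is one way they might be modeled as plain Rust data types. This is only a sketch: the struct and enum layouts, the <code>Place::parse</code> helper, and <code>type_of_p</code> are assumptions made for illustration, not the representation used by polonius or a-mir-formality.</p>

```rust
use std::collections::BTreeSet;

// Place = variable(.field)*, e.g. `a.b.c`
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Place {
    variable: String,
    fields: Vec<String>,
}

// Loan = shared(Place) | mut(Place)
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Loan {
    Shared(Place),
    Mut(Place),
}

// Origin = { Loan }
type Origin = BTreeSet<Loan>;

// Type = TypeName<Generic*> | & Origin Type | & Origin mut Type
#[derive(Clone, Debug, PartialEq, Eq)]
enum Type {
    Named { name: String, generics: Vec<Generic> },
    Ref(Origin, Box<Type>),
    RefMut(Origin, Box<Type>),
}

// Generic = Type | Origin
#[derive(Clone, Debug, PartialEq, Eq)]
enum Generic {
    Type(Type),
    Origin(Origin),
}

impl Place {
    // Hypothetical helper: splits a dotted path such as "a.b.c".
    // No validation is performed; this is for illustration only.
    fn parse(path: &str) -> Place {
        let mut parts = path.split('.').map(String::from);
        let variable = parts.next().unwrap_or_default();
        Place { variable, fields: parts.collect() }
    }
}

// Builds the type `& {shared(counter)} u32`, a shared reference
// whose origin is the single loan `shared(counter)`.
fn type_of_p() -> Type {
    let origin = Origin::from([Loan::Shared(Place::parse("counter"))]);
    let u32_ty = Type::Named { name: "u32".to_string(), generics: vec![] };
    Type::Ref(origin, Box::new(u32_ty))
}

fn main() {
    println!("{:?}", type_of_p());
}
```

<p>Note that the origin is an ordinary value here (a set of loans), which mirrors the point that this formulation has a concrete syntax for what <code>'a</code> stands for.</p>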
<h2 id="explicit-types-for-a-simple-program">Explicit types for a simple program</h2>
<p>Having a fully explicit type system also means we can easily write out example programs where all types are fully specified. This used to be rather challenging because we had no notation for lifetimes. Let&rsquo;s look at a simple example, a program that ought to get an error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">counter</span>: <span class="kt">u32</span> <span class="o">=</span><span class="w"> </span><span class="mi">22_</span><span class="k">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span> <span class="cm">/*{shared(counter)}*/</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">counter</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//       ---------------------
</span></span></span><span class="line"><span class="cl"><span class="c1">//       no syntax for this today!
</span></span></span><span class="line"><span class="cl"><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// Error: cannot mutate `counter` while `p` is live
</span></span></span><span class="line"><span class="cl"><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{p}</span><span class="s">&#34;</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>Apart from the type of <code>p</code>, this is valid Rust. Of course, it won&rsquo;t compile, because we can&rsquo;t modify <code>counter</code> while there is a live shared reference <code>p</code> (<a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=1a05f0a4aad12c33345ca4adc1cd9bb2">playground</a>). As we continue, you will see how the new type system formulation arrives at the same conclusion.</p>
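<p>For contrast, here is a variant that today&rsquo;s compiler accepts. It uses the same statements, but the last use of <code>p</code> comes before the mutation, so <code>p</code> is no longer live when <code>counter</code> is modified. The <code>bump_after_read</code> function and the <code>seen</code> variable are hypothetical names introduced just for this sketch.</p>

```rust
// Hypothetical variant of the sample program: reading through `p` happens
// *before* the mutation, so the reference is dead by the time we mutate.
fn bump_after_read() -> (u32, u32) {
    let mut counter: u32 = 22_u32;
    let p: &u32 = &counter;
    let seen = *p; // last use of `p`
    counter += 1; // accepted: no live reference to `counter` remains
    (seen, counter)
}

fn main() {
    let (seen, counter) = bump_after_read();
    println!("{seen} {counter}");
}
```

<p>The only difference between the two programs is whether <code>p</code> is used again after the write, which is why liveness plays a central role in what follows.</p>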
<h2 id="basic-typing-judgments">Basic typing judgments</h2>
<p>Typing judgments are the standard way to describe a type system. We&rsquo;re going to phase in the typing judgments for our system iteratively. We&rsquo;ll start with a simple, fairly standard formulation that doesn&rsquo;t include borrow checking, and then show how we introduce borrow checking. For this first version, the typing judgment we are defining has the form</p>
<pre tabindex="0"><code>Env |- Expr : Type
</code></pre><p>This says, &ldquo;in the environment <code>Env</code>, the expression <code>Expr</code> is legal and has the type <code>Type</code>&rdquo;. The <em>environment</em> <code>Env</code> here defines the local variables in scope. The Rust expressions we are looking at for our <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=1a05f0a4aad12c33345ca4adc1cd9bb2">sample program</a> are pretty simple:</p>
<pre tabindex="0"><code>Expr = integer literal (e.g., 22_u32)
     | &amp; Place
     | Expr + Expr
     | Place (read the value of a place)
     | Place = Expr (overwrite the value of a place)
     | ...
</code></pre><p>Since we only support one scalar type (<code>u32</code>), the typing judgment for <code>Expr + Expr</code> is as simple as:</p>
<pre tabindex="0"><code>Env |- Expr1 : u32
Env |- Expr2 : u32
----------------------------------------- addition
Env |- Expr1 + Expr2 : u32
</code></pre><p>The rule for <code>Place = Expr</code> assignments is based on subtyping:</p>
<pre tabindex="0"><code>Env |- Expr : Type1
Env |- Place : Type2
Env |- Type1 &lt;: Type2
----------------------------------------- assignment
Env |- Place = Expr : ()
</code></pre><p>The rule for <code>&amp;Place</code> is somewhat more interesting:</p>
<pre tabindex="0"><code>Env |- Place : Type
----------------------------------------- shared references
Env |- &amp; Place : &amp; {shared(Place)} Type
</code></pre><p>The rule just says that we figure out the type of the place <code>Place</code> being borrowed (here, the place is <code>counter</code> and its type will be <code>u32</code>) and then we have a resulting reference to that type. The origin of that reference will be <code>{shared(Place)}</code>, indicating that the reference came from <code>Place</code>:</p>
<pre tabindex="0"><code>&amp;{shared(Place)} Type
</code></pre><h2 id="computing-liveness">Computing liveness</h2>
<p>To introduce borrow checking, we need to phase in the idea of <strong>liveness</strong>.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> If you&rsquo;re not familiar with the concept, the NLL RFC has a <a href="https://rust-lang.github.io/rfcs/2094-nll.html#liveness">nice introduction</a>:</p>
<blockquote>
<p>The term “liveness” derives from compiler analysis, but it’s fairly intuitive. We say that a variable is live if the current value that it holds may be used later.</p>
</blockquote>
<p>Unlike with NLL, where we just computed live <strong>variables</strong>, we&rsquo;re going to compute <strong>live places</strong>:</p>
<pre tabindex="0"><code>LivePlaces = { Place }
</code></pre><p>To compute the set of live places, we&rsquo;ll introduce a helper function <code>LiveBefore(Env, LivePlaces, Expr): LivePlaces</code>. <code>LiveBefore()</code> returns the set of places that are live before <code>Expr</code> is evaluated, given the environment <code>Env</code> and the set of places live after the expression. I won&rsquo;t define this function in detail, but it looks roughly like this:</p>
<pre tabindex="0"><code>// `&amp;Place` reads `Place`, so add it to `LivePlaces`
LiveBefore(Env, LivePlaces, &amp;Place) =
    LivePlaces ∪ {Place}

// `Place = Expr` overwrites `Place`, so remove it from `LivePlaces`
LiveBefore(Env, LivePlaces, Place = Expr) =
    LiveBefore(Env, (LivePlaces - {Place}), Expr)

// `Expr1` is evaluated first, then `Expr2`, so the set of places
// live after expr1 is the set that are live *before* expr2
LiveBefore(Env, LivePlaces, Expr1 + Expr2) =
    LiveBefore(Env, LiveBefore(Env, LivePlaces, Expr2), Expr1)
    
... etc ...
</code></pre><h2 id="integrating-liveness-into-our-typing-judgments">Integrating liveness into our typing judgments</h2>
<p>To detect borrow check errors, we need to adjust our typing judgment to include liveness. The result will be as follows:</p>
<pre tabindex="0"><code>(Env, LivePlaces) |- Expr : Type
</code></pre><p>This judgment says, &ldquo;in the environment <code>Env</code>, and given that the function will access <code>LivePlaces</code> in the future, <code>Expr</code> is valid and has type <code>Type</code>&rdquo;. Integrating liveness in this way gives us some idea of what accesses will happen in the future.</p>
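<p>To make the backwards flow concrete, here is a minimal Rust sketch of the <code>LiveBefore</code> helper described earlier. It rests on several simplifying assumptions that are not part of the formal system: places are plain variable names (no field paths), the <code>Env</code> argument is dropped because this cut-down grammar never consults it, and the names <code>Expr</code>, <code>LivePlaces</code>, and <code>live_before</code> are purely illustrative.</p>

```rust
use std::collections::BTreeSet;

// Assumption: a place is just a variable name; field paths are left out.
type Place = String;
type LivePlaces = BTreeSet<Place>;

// A cut-down version of the expression grammar (hypothetical names).
enum Expr {
    Literal(u32),
    Ref(Place),                // `&Place`
    Add(Box<Expr>, Box<Expr>), // `Expr1 + Expr2`
    Read(Place),               // `Place`
    Assign(Place, Box<Expr>),  // `Place = Expr`
}

// Given the places live *after* `expr`, return the places live *before* it.
fn live_before(live_after: &LivePlaces, expr: &Expr) -> LivePlaces {
    match expr {
        // A literal touches no places.
        Expr::Literal(_) => live_after.clone(),
        // Borrowing or reading a place makes it live.
        Expr::Ref(place) | Expr::Read(place) => {
            let mut live = live_after.clone();
            live.insert(place.clone());
            live
        }
        // Overwriting a place kills it, then the right-hand side is walked.
        Expr::Assign(place, value) => {
            let mut live = live_after.clone();
            live.remove(place);
            live_before(&live, value)
        }
        // `rhs` runs last, so what is live before `rhs` is live after `lhs`.
        Expr::Add(lhs, rhs) => {
            let live_after_lhs = live_before(live_after, rhs);
            live_before(&live_after_lhs, lhs)
        }
    }
}
```

<p>Calling <code>live_before</code> on <code>counter = counter + 1</code> with <code>p</code> live afterwards yields both <code>counter</code> and <code>p</code>: the assignment kills <code>counter</code>, but the read on the right-hand side revives it.</p>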
<p>For compound expressions, like <code>Expr1 + Expr2</code>, we have to adjust the set of live places to reflect control flow:</p>
<pre tabindex="0"><code>LiveAfter1 = LiveBefore(Env, LiveAfter2, Expr2)
(Env, LiveAfter1) |- Expr1 : u32
(Env, LiveAfter2) |- Expr2 : u32
----------------------------------------- addition
(Env, LiveAfter2) |- Expr1 + Expr2 : u32
</code></pre><p>We start out with <code>LiveAfter2</code>, i.e., the places that are live after the entire expression. These are also the same as the places live after expression 2 is evaluated, since this expression doesn&rsquo;t itself reference or overwrite any places. We then compute <code>LiveAfter1</code> &ndash; i.e., the places live after <code>Expr1</code> is evaluated &ndash; by looking at the places that are live <em>before</em> <code>Expr2</code>. This is a bit mind-bending and took me a bit of time to see. The tricky bit here is that liveness is computed <em>backwards</em>, but most of our typing rules (and intuition) tend to flow <em>forwards</em>. If it helps, think of the &ldquo;fully desugared&rdquo; version of <code>+</code>:</p>
<pre tabindex="0"><code>let tmp0 = &lt;Expr1&gt;
    // &lt;-- the set LiveAfter1 is live here (ignoring tmp0, tmp1)
let tmp1 = &lt;Expr2&gt;
    // &lt;-- the set LiveAfter2 is live here (ignoring tmp0, tmp1)
tmp0 + tmp1
    // &lt;-- the set LiveAfter2 is live here
</code></pre><h2 id="borrow-checking-with-liveness">Borrow checking with liveness</h2>
<p>Now that we know liveness information, we can use it to do borrow checking. We&rsquo;ll introduce a &ldquo;permits&rdquo; judgment:</p>
<pre tabindex="0"><code>(Env, LiveAfter) permits Loan
</code></pre><p>that indicates that &ldquo;taking the loan Loan would be allowed given the environment and the live places&rdquo;. Here is the rule for assignments, modified to include liveness and the new &ldquo;permits&rdquo; judgment:</p>
<pre tabindex="0"><code>(Env, LiveAfter - {Place}) |- Expr : Type1
(Env, LiveAfter) |- Place : Type2
(Env, LiveAfter) |- Type1 &lt;: Type2
(Env, LiveAfter) permits mut(Place)
----------------------------------------- assignment
(Env, LiveAfter) |- Place = Expr : ()
</code></pre><p>Before I dive into how we define &ldquo;permits&rdquo;, let&rsquo;s go back to our example and get an intuition for what is going on here. We want to declare an error on this assignment:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">counter</span>: <span class="kt">u32</span> <span class="o">=</span><span class="w"> </span><span class="mi">22_</span><span class="k">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="p">{</span><span class="n">shared</span><span class="p">(</span><span class="n">counter</span><span class="p">)}</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">counter</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- Error
</span></span></span><span class="line"><span class="cl"><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{p}</span><span class="s">&#34;</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;-- p is live
</span></span></span></code></pre></div><p>Note that, because of the <code>println!</code> on the next line, <code>p</code> will be in our <code>LiveAfter</code> set. Looking at the type of <code>p</code>, we see that it includes the loan <code>shared(counter)</code>. The idea then is that mutating counter is illegal because there is a live loan <code>shared(counter)</code>, which implies that <code>counter</code> must be immutable.</p>
<p>Restating that intuition:</p>
<blockquote>
<p>A set <code>Live</code> of live places <em>permits</em> a loan <code>Loan1</code> if, for every live place <code>Place</code> in <code>Live</code>, the loans in the type of <code>Place</code> are compatible with <code>Loan1</code>.</p>
</blockquote>
<p>Written more formally:</p>
<pre tabindex="0"><code>∀ Place ∈ Live {
    (Env, Live) |- Place : Type
    ∀ Loan2 ∈ Loans(Type) { Compatible(Loan1, Loan2) }
}
-----------------------------------------
(Env, Live) permits Loan1
</code></pre><p>This definition makes use of two helper functions:</p>
<ul>
<li><code>Loans(Type)</code> &ndash; the set of loans that appear in the type</li>
<li><code>Compatible(Loan1, Loan2)</code> &ndash; defines if two loans are compatible. Two shared loans are always compatible. A mutable loan is only compatible with another loan if the places are disjoint.</li>
</ul>
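<p>The &ldquo;permits&rdquo; rule and its two helpers can be sketched in Rust as follows. This is a rough illustration under stated assumptions, not the real definition: places are plain variable names, so &ldquo;disjoint&rdquo; just means &ldquo;different names&rdquo;, and instead of looking up each live place&rsquo;s type in <code>Env</code>, the caller passes the loans found in that type directly. The names <code>Loan</code>, <code>compatible</code>, and <code>permits</code> are hypothetical.</p>

```rust
// Assumption: a loan names the borrowed place by variable name only.
#[derive(Clone, Debug, PartialEq)]
enum Loan {
    Shared(String),
    Mut(String),
}

impl Loan {
    // The place this loan was taken from.
    fn place(&self) -> &str {
        match self {
            Loan::Shared(p) | Loan::Mut(p) => p,
        }
    }
}

// Two shared loans are always compatible; if either loan is mutable,
// the places must be disjoint (here: have different names).
fn compatible(a: &Loan, b: &Loan) -> bool {
    match (a, b) {
        (Loan::Shared(_), Loan::Shared(_)) => true,
        _ => a.place() != b.place(),
    }
}

// `live` pairs each live place with the loans appearing in its type,
// standing in for `Loans(Type)`. The new loan is permitted only if it is
// compatible with every loan held by every live place.
fn permits(live: &[(String, Vec<Loan>)], new_loan: &Loan) -> bool {
    live.iter()
        .all(|(_, loans)| loans.iter().all(|held| compatible(new_loan, held)))
}
```

<p>With <code>p</code> live and holding <code>shared(counter)</code>, this sketch rejects <code>mut(counter)</code>, which is the error we wanted on <code>counter += 1</code>. Once <code>p</code> is no longer live, the same loan is permitted.</p>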
<h2 id="conclusion">Conclusion</h2>
<p>The goal of this post was to give a high-level intuition. I wrote it from memory, so I&rsquo;ve probably overlooked a thing or two. In follow-up posts, though, I want to go deeper into how the system I&rsquo;ve been playing with works and what new things it can support. Some high-level examples:</p>
<ul>
<li>How to define subtyping, and in particular the role of liveness in subtyping</li>
<li>Important borrow patterns that we use today and how they work in the new system</li>
<li>Interior references that point at data owned by other struct fields and how it can be supported</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>If this is not obvious to you, don&rsquo;t worry, it wasn&rsquo;t obvious to me either. It turns out that using liveness in the rules is the key to making them simple. I&rsquo;ll try to write a follow-up about the alternatives I explored and why they don&rsquo;t work later on.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content></entry><entry><title type="html">What I'd like to see for Async Rust in 2024 🎄</title><link href="https://smallcultfollowing.com/babysteps/blog/2024/01/03/async-rust-2024/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2024/01/03/async-rust-2024/</id><published>2024-01-03T00:00:00+00:00</published><updated>2024-01-03T18:01:33-05:00</updated><content type="html"><![CDATA[<p>Well, it&rsquo;s that time of year, when thoughts turn to&hellip;well, Rust of course. I guess that&rsquo;s every time of year. This year was a pretty big year for Rust, though I think a lot of what happened was more in the vein of &ldquo;setting things up for success in 2024&rdquo;. So let&rsquo;s talk about 2024! I&rsquo;m going to publish a series of blog posts about different aspects of Rust I&rsquo;m excited about, and what I think we should be doing. To help make things concrete, I&rsquo;m going to frame the 2024 by using proposed <a href="https://smallcultfollowing.com/babysteps/blog/2023/11/28/project-goals/">project goals</a> &ndash; basically a specific piece of work I think we can get done this year. In this first post, I&rsquo;ll focus on <strong>async Rust</strong>.</p>
<h2 id="what-we-did-in-2023">What we did in 2023</h2>
<p>On Dec 28, with the <a href="https://blog.rust-lang.org/2023/12/28/Rust-1.75.0.html#async-fn-and-return-position-impl-trait-in-traits">release of Rust 1.75.0</a>, we <a href="https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html">stabilized async fn and impl trait in traits</a>. This is a <strong>really big deal</strong>. Async fn in traits has been <a href="https://smallcultfollowing.com/babysteps/blog/2019/10/26/async-fn-in-traits-are-hard/">&ldquo;considered hard&rdquo;</a> since 2019 and they&rsquo;re at the foundation of basically <em>everything</em> that we need to do to make async better.</p>
<p>Async Rust to me showcases the best and worst of Rust. It delivers on that Rust promise of &ldquo;high-level code, low-level performance&rdquo;. Building on the highly tuned <a href="https://tokio.rs/">Tokio runtime</a>, network services in Rust consistently have tighter tail latency and lower memory usage, which means you can service a lot more clients with a lot less resources. Alternatively, because Rust doesn&rsquo;t hardcode the runtime, you can write async Rust code that targets <a href="https://github.com/embassy-rs/embassy">embedded environments that don&rsquo;t even have an underlying operating system</a>, or anywhere in between.</p>
<p>And yet it continues to be true that, in the words of an Amazon engineer I talked to, &ldquo;Async Rust is Rust on hard mode&rdquo;. Truly closing this gap requires work in the language, standard library, and the ecosystem. We won&rsquo;t get all the way there in 2024, but I think we can make some big strides.</p>
<h2 id="proposed-goal-solve-the-send-bound-problem-in-q2"><em>Proposed goal:</em> Solve the <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">send bound problem</a> in Q2</h2>
<p>We made a lot of progress on async functions in traits last year, but we still can&rsquo;t <a href="https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html#async-fn-in-public-traits">cover the use case of generic traits that can be used either with a work-stealing executor or without one</a>. One very specific example of this is the <a href="https://docs.rs/tower/latest/tower/trait.Service.html"><code>Service</code> trait from <code>tower</code></a>. To handle this use case, we need a solution to the <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">send bound problem</a>. We have a bunch of ideas for what this might be, and we&rsquo;ve even got a prototype implementation for (a subset of) <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/">return type notation</a>, so we are well positioned for success. I think we should aim to finish this by the end of Q2 (summer, basically). This in turn would unblock a 1.0 release of the <a href="https://crates.io/crates/tower">tower</a> crate, letting us have a stable trait for middleware.</p>
<h2 id="proposed-goal-stabilize-an-mvp-for-async-closures-in-q3"><em>Proposed goal:</em> Stabilize an MVP for async closures in Q3</h2>
<p>The holy grail for async is that you should be able to easily make any synchronous function into an asynchronous one. The 2019 MVP supported only top-level functions and inherent methods. We&rsquo;ve now extended that to include trait methods. In 2024, we should take the next step and support async closures. This will allow people to define combinator methods like iterator map and so forth and avoid the convoluted workarounds currently required.</p>
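<p>For a sense of what those workarounds look like, here is a small sketch of one common shape: a combinator that takes a second type parameter just to name the future its callback returns. The names <code>apply_twice</code>, <code>add_one</code>, <code>NoopWake</code>, and <code>block_on</code> are made up for illustration, and the executor is a deliberately naive busy-polling loop built only on the standard library so the example stays self-contained.</p>

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Without async closures, the combinator needs an extra type parameter
// (`Fut`) to name the future that the callback returns.
async fn apply_twice<F, Fut>(f: F, x: u32) -> u32
where
    F: Fn(u32) -> Fut,
    Fut: Future<Output = u32>,
{
    let once = f(x).await;
    f(once).await
}

async fn add_one(x: u32) -> u32 {
    x + 1
}

// A waker that does nothing; enough for futures that never really wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Naive executor: poll in a loop until the future completes.
fn block_on<T>(fut: impl Future<Output = T>) -> T {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>Both <code>block_on(apply_twice(add_one, 5))</code> and a closure that returns an <code>async move</code> block work with this signature, but every combinator written this way has to carry the extra <code>Fut</code> parameter.</p>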
<p>For this first goal, I think we should be working to establish an <strong>MVP</strong>. Recently, <a href="https://github.com/compiler-errors">Errs</a> and I outlined an MVP we thought seemed quite doable. It began with creating <code>AsyncFn</code> traits that mirror the <code>Fn</code> trait hierarchy&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncFnOnce</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">call_once</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">args</span>: <span class="nc">A</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncFnMut</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span>: <span class="nc">AsyncFnOnce</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">call_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">args</span>: <span class="nc">A</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncFn</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span>: <span class="nc">AsyncFnMut</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">args</span>: <span class="nc">A</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;and the ability to write async closures like <code>async || &lt;expr&gt;</code>, as well as a bridge such that any function that returns a future also implements the appropriate <code>AsyncFn</code> traits. Async closures would unblock us from creating combinator traits, like a truly nice version of async iterators.</p>
<p>This MVP is not intended as the final state, but it is intended to be compatible with whatever final state we wind up with. There remains a really interesting question about how to integrate the <code>AsyncFn</code> traits with the regular <code>Fn</code> traits. Nonetheless, I think we can stabilize the above MVP in parallel with exploring that question.</p>
<h2 id="proposed-goal-author-an-rfc-for-maybe-async-in-q4-or-decide-not-to"><em>Proposed goal:</em> Author an RFC for &ldquo;maybe async&rdquo; in Q4 (or decide not to!)</h2>
<p>One of the big questions around async is whether we should be supporting some way to write &ldquo;maybe async&rdquo; code. This idea has gone through a lot of names. Yosh and Oli originally kicked off something they called <a href="https://blog.rust-lang.org/inside-rust/2022/07/27/keyword-generics.html">keyword generics</a> and later rebranded as <em>effect generics</em>. I prefer the framing of <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/03/trait-transformers-send-bounds-part-3/">trait transformers</a>, and I wrote a blog post about how <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/29/thoughts-on-async-closures/">trait transformers can make async closures fit nicely</a>.</p>
<p>There is significant skepticism about whether this is a good direction. There are <a href="https://smallcultfollowing.com/babysteps/blog/2023/05/09/giving-lending-and-async-closures/">other ways to think about async closures</a> (though <a href="https://github.com/compiler-errors">Errs</a> pointed out an issue with this that I hope to write about in a future post). Boats has written a number of blog posts with concerns, and members of the types team have expressed fear about what will be required to write code that is generic over effects. These concerns make a lot of sense to me!</p>
<p>Overall, I still believe that something like trait transformers could make Rust feel simpler <em>and</em> help us scale to future needs. But I think we have to prove our case! My goal for 2024 then is to do exactly that. The idea would be to author an RFC laying out a &ldquo;maybe async&rdquo; scheme and to get that RFC accepted. To address the concerns of the types team, I think that will require modeling &ldquo;maybe async&rdquo; formally as part of <a href="https://github.com/rust-lang/a-mir-formality">a-mir-formality</a>, so that everybody can understand how it will work.</p>
<p>Another possible outcome here is that we opt to abandon the idea. Maybe the complexity really is infeasible. Or maybe the lang design doesn&rsquo;t feel right. I&rsquo;m good with that too, but either way, I think we need to settle on a plan this year.</p>
<h2 id="stretch-goal-stabilize-generator-syntax"><em>Stretch goal:</em> stabilize generator syntax</h2>
<p>As a stretch goal, it would be really cool to land support for generator expressions &ndash; basically a way to write async iterators. <a href="https://github.com/compiler-errors">Errs</a> recently <a href="https://github.com/rust-lang/rust/pull/118420">opened a PR</a> adding nightly support for async and <a href="https://github.com/rust-lang/rfcs/pull/3513">RFC #3513</a> proposed reserving the <code>gen</code> keyword for Rust 2024. Really <em>stabilizing</em> generators however requires us to answer some interesting questions about the best design for the async iteration trait. Thanks to the stabilization of async fn in trait, we can now have this conversation &ndash; and we have certainly been having it! Over the last month or so there has also been a <a href="https://without.boats/blog/poll-next/">lot</a> of <a href="https://blog.yoshuawuyts.com/async-iterator-trait/">interesting</a> <a href="https://tmandry.gitlab.io/blog/posts/for-await-buffered-streams/">back</a> and <a href="https://without.boats/blog/poll-progress/">forth</a> about the best setup. I&rsquo;m still digesting all the posts, and I hope to put up some thoughts this month (no promises). Regardless, I think it&rsquo;s plausible that we could see async generators land in 2024, which would be great, as it would eliminate the major reason that people have to interact directly with <code>Pin</code>.</p>
<h2 id="conclusion-looking-past-2024">Conclusion: looking past 2024</h2>
<p>If we accomplish the goals I outlined above, async Rust by the end of 2024 will be much improved. But there will still be a few big items before we can really say that we&rsquo;ve laid out the pieces we need. Sadly, we can&rsquo;t do it all, so these items would have to wait until after 2024, though I think we will continue to experiment and discuss their design:</p>
<ul>
<li><strong>Async drop</strong>: Once we have async closures, there remains one place where you cannot write an async function &ndash; the <code>Drop</code> trait. Async drop has a bunch of interesting complications (<a href="https://sabrinajewson.org/blog/async-drop">Sabrina wrote a great blog post on this!</a>), but it is also a <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/alan_finds_database_drops_hard.html">major pain point for users</a>. We&rsquo;ll get to it!</li>
<li><strong>Dyn async trait</strong>: Besides <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">send bounds</a>, the <a href="https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html#dynamic-dispatch">other major limitation for async fn in trait</a> is that traits using them do not yet support dynamic dispatch. We should absolutely lift this, but to me it&rsquo;s lower in priority because there is an existing workaround of using a proc-macro to create a <code>DynAsyncTrait</code> type. It&rsquo;s not ideal, but it&rsquo;s not as fundamental a limitation as send bounds or the lack of async closures and async drop. (That said, the design work for this is largely done, so it is entirely possible that we land it this year as a drive-by piece of work.)</li>
<li><strong>Traits for being generic over runtimes</strong>: Async Rust&rsquo;s ability to support runtimes as varied as <a href="https://tokio.rs/">Tokio</a> and <a href="https://github.com/embassy-rs/embassy">Embassy</a> is one of its superpowers. But the fact that <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_wishes_for_easy_runtime_switch.html">switching runtimes</a> or <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_writes_a_runtime_agnostic_lib.html">writing code that is generic over what runtime it uses</a> is very hard to impossible is a key pain point, made even worse by the fact that <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/alan_started_trusting_the_rust_compiler_but_then_async.html">runtimes often don&rsquo;t play nice together</a>. We need to build out traits for interop, starting with [async read + write] but eventually covering [task spawning and timers].</li>
<li><strong>Better APIs</strong>: Many of the nastiest async Rust bugs come about when users are trying to manage nested tasks. Existing APIs like <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_battles_buffered_streams.html">FuturesUnordered</a> and <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_gets_burned_by_select.html">select</a> have a lot of rough edges and can <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/aws_engineer/solving_a_deadlock.html">easily lead to deadlock</a> &ndash; <a href="https://tmandry.gitlab.io/blog/posts/for-await-buffered-streams/">Tyler had a good post on this</a>. I would like to see us take a fresh look at the async APIs we offer Rust programmers and build up a powerful, easy to use library that helps steer people away from potential sources of deadlock. Ideally this API would not be specific to the underlying runtime, but instead let users switch between different runtimes, and hopefully cleanly support embedded systems (perhaps with limited functionality). I don&rsquo;t think we know how to do this yet, and I think that doing it will require us to have a lot more tools (things like send bounds, async closures, and quite possibly trait transformers or async drop).</li>
</ul>]]></content></entry><entry><title type="html">Being Rusty: Discovering Rust's design axioms</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/12/07/rust-design-axioms/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/12/07/rust-design-axioms/</id><published>2023-12-07T00:00:00+00:00</published><updated>2023-12-07T08:46:19-05:00</updated><content type="html"><![CDATA[<p>To your average Joe, being &ldquo;rusty&rdquo; is not seen as a good thing.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> But readers of this blog know that being <em>R</em>usty &ndash; with a capital <em>R</em>! &ndash; is, of course, something completely different! So what is it that makes Rust <em>Rust</em>? Our slogans articulate key parts of it, like <em>fearless concurrency</em>, <em>stability without stagnation</em>, or the epic <em>Hack without fear</em>. And there is of course Lindsey Kuper&rsquo;s <a href="https://www.youtube.com/watch?t=52&amp;v=DSR7EHeySlw&amp;feature=youtu.be">epic haiku</a>: &ldquo;A systems language / pursuing the trifecta: / fast, concurrent, safe&rdquo;. But I feel like we&rsquo;re still missing a unified set of axioms that we can refer back to over time and use to guide us as we make decisions. Some of you will remember the <a href="https://github.com/nikomatsakis/rustacean-principles">Rustacean Principles</a>, which was my first attempt at this. I&rsquo;ve been dissatisfied with them for a couple of reasons, so I decided to try again. The structure is really different, so I&rsquo;m calling it Rust&rsquo;s <em>design axioms</em>. This post documents the current state &ndash; I&rsquo;m quite a bit happier with it! But it&rsquo;s not quite there yet. 
So I&rsquo;ve also got a link to a <a href="https://github.com/nikomatsakis/rust-design-axioms">repository</a> where I&rsquo;m hoping people can help improve them by opening issues with examples, counter-examples, or other thoughts.</p>
<h2 id="axioms-capture-the-principles-you-use-in-your-decision-making-process">Axioms capture the principles you use in your decision-making process</h2>
<p>What I&rsquo;ve noticed is that when I am trying to make some decision &ndash; whether it&rsquo;s a question of language design or something else &ndash; I am implicitly bringing assumptions, intuitions, and hypotheses to bear. Oftentimes, those intuitions fly by very quickly in my mind, and I barely even notice them. <em>Ah yeah, we could do X, but if we did that, it would mean Y, and I don&rsquo;t want that, scratch that idea.</em> I&rsquo;m slowly learning to be attentive to these moments &ndash; whatever <em>Y</em> is right there, it&rsquo;s related to one of my <strong>design axioms</strong> &mdash; something I&rsquo;m implicitly using to shape my thinking.</p>
<p>I&rsquo;ve found that if I can capture those axioms and write them out, they can help me down the line when I&rsquo;m facing future decisions. It can also help to bring alignment to a group of people by making those intuitions explicit (and giving people a chance to refute or sharpen them). Obviously I&rsquo;m not the first to observe this. I&rsquo;ve found Amazon&rsquo;s practice of using <a href="https://aws.amazon.com/blogs/enterprise-strategy/tenets-supercharging-decision-making/">tenets</a> to be quite useful<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, for example, and I&rsquo;ve also been inspired by things I&rsquo;ve read online about the importance of making your hypotheses explicit.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>In proof systems, your <em>axioms</em> are the things that you assert to be true and take on faith, and from which the rest of your argument follows. I choose to call these Rust&rsquo;s <em>design axioms</em> because that seemed like exactly what I was going for. What are the starting assumptions that, followed to their conclusion, lead you to Rust? The more clearly we can articulate those assumptions, the better we&rsquo;ll be able to ensure that we continue to follow them as we evolve Rust to meet future needs.</p>
<h2 id="axioms-have-a-hypothesis-and-a-consequence">Axioms have a hypothesis and a consequence</h2>
<p>I&rsquo;ve structured the axioms in a particular way. They begin by stating the <strong>axiom</strong> itself &ndash; the core belief that we assert to be true. That is followed by a <strong>consequence</strong>, which is something that we do as a result of that core belief. To show you what I mean, here is one of the Rust design axioms I&rsquo;ve drafted:</p>
<blockquote>
<p><strong>Rust users want to surface problems as early as possible,</strong> and so Rust is designed to be <strong>reliable</strong>. We make choices that help surface bugs earlier. We don&rsquo;t make guesses about what our users meant to do, we let them tell us, and we endeavor to make the meaning of code transparent to its reader. And we always, always guarantee memory safety and data-race freedom in safe Rust code.</p>
</blockquote>
<h2 id="axioms-have-an-ordering-and-earlier-things-take-priority">Axioms have an ordering and earlier things take priority</h2>
<p>Each axiom is useful on its own, but where things become interesting is when they come into conflict. Consider reliability: that is a core axiom of Rust, no doubt, but is it the most important? I would argue it is not. If it were, we wouldn&rsquo;t permit unsafe code, or at least not without a safety proof. I think our core axiom is actually that Rust is meant to be used, and used for building a particular kind of program. I articulated it like this:</p>
<blockquote>
<p><strong>Rust is meant to empower <em>everyone</em> to build reliable and efficient software,</strong> so above all else, Rust needs to be <strong>accessible</strong> to a broad audience. We avoid designs that will be too complex to be used in practice. We build supportive tooling that not only points out potential mistakes but helps users understand and fix them.</p>
</blockquote>
<p>When it comes to safety, I think Rust&rsquo;s approach is eminently practical. We&rsquo;ve designed a safe type system that we believe covers 90-95% of what people need to do, and we are always working to expand that scope. To get that last 5-10%, we fall back to unsafe code. Is this as safe and reliable as it could be? No. That would require 100% proofs of correctness. There are systems that do that, but they are maintained by a <a href="http://web1.cs.columbia.edu/~junfeng/09fa-e6998/papers/sel4.pdf">small handful of experts</a>, and that idea &ndash; that systems programming is just for &ldquo;wizards&rdquo; &ndash; is exactly what we are trying to get away from.</p>
<p>To express this in our axioms, we put <strong>accessible</strong> as the top-most axiom. It defines the mission overall. But we put <strong>reliability</strong> as the second in the list, since that takes precedence over everything else.</p>
<h2 id="the-design-axioms-i-really-like">The design axioms I really like</h2>
<p>Without further ado, here is my current list of design axioms. Well, part of it. These are the axioms that I feel pretty good about. The ordering also feels right to me.</p>
<blockquote>
<p>We believe that&hellip;</p>
<ul>
<li><strong>Rust is meant to empower <em>everyone</em> to build reliable and efficient software,</strong> so above all else, Rust needs to be <strong>accessible</strong> to a broad audience. We avoid designs that will be too complex to be used in practice. We build supportive tooling that not only points out potential mistakes but helps users understand and fix them.</li>
<li><strong>Rust users want to surface problems as early as possible,</strong> and so Rust is designed to be <strong>reliable</strong>. We make choices that help surface bugs earlier. We don&rsquo;t make guesses about what our users meant to do, we let them tell us, and we endeavor to make the meaning of code transparent to its reader. And we always, always guarantee memory safety and data-race freedom in safe Rust code.</li>
<li><strong>Rust users are just as obsessed with quality as we are,</strong> and so Rust is <strong>extensible</strong>. We empower our users to build their own abstractions. We prefer to let people build what they need than to try (and fail) to give them everything ourselves.</li>
<li><strong>Systems programmers need to know what is happening and where,</strong> and so system details and especially performance costs in Rust are <strong>transparent and tunable</strong>. When building systems, it&rsquo;s often important to know what&rsquo;s going on underneath the abstractions. Abstractions should still leave the programmer feeling like they&rsquo;re in control of the underlying system, such as by making it easy to notice (or avoid) certain types of operations.</li>
</ul>
<p>&hellip;where earlier things take precedence.</p>
</blockquote>
<h2 id="the-design-axioms-that-are-still-a-work-in-progress">The design axioms that are still a work-in-progress</h2>
<p>These axioms are things I am less sure of. It&rsquo;s not that I don&rsquo;t think they are true. It&rsquo;s that I don&rsquo;t know yet if they&rsquo;re worded correctly. Maybe they should be combined together? And where, exactly, do they fall in the ordering?</p>
<blockquote>
<ul>
<li><strong>Rust users want to focus on solving their problem, not the fiddly details,</strong> so Rust is <strong>productive</strong>. We favor APIs where the most convenient and high-level option is also the most efficient one. We support portability across operating systems and execution environments by default. We aren&rsquo;t explicit for the sake of being explicit, but rather to surface details we believe are needed.</li>
<li><strong>N✕M is bigger than N+M</strong>, and so we design for <strong>composability and orthogonality</strong>. We are looking for features that tackle independent problems and build on one another, giving rise to N✕M possibilities.</li>
<li><strong>It&rsquo;s nicer to use one language than two,</strong> so Rust is <strong>versatile</strong>. Rust can&rsquo;t be the best at everything, but we can make it decent for just about anything, whether that&rsquo;s low-level C code or high-level scripting.</li>
</ul>
</blockquote>
<p>Of these, I like the first one best. Also, it follows the axiom structure better, because it starts with a hypothesis about Rust users and what they want. The other two are a bit older and I hadn&rsquo;t adopted that convention yet.</p>
<h2 id="help-shape-the-axioms">Help shape the axioms!</h2>
<p>My ultimate goal is to author an RFC endorsing these axioms for Rust. But I need help to get there. Are these the right axioms? Am I missing things? Should we change the ordering?</p>
<p>I&rsquo;d love to know what you think! To aid in collaboration, I&rsquo;ve created a <a href="https://github.com/nikomatsakis/rust-design-axioms">nikomatsakis/rust-design-axioms</a> github repository. It <a href="https://nikomatsakis.github.io/rust-design-axioms/intro.html">hosts the current state of the axioms</a> and also has <a href="https://nikomatsakis.github.io/rust-design-axioms/contributing.html">suggested ways to contribute</a>.</p>
<p>I&rsquo;ve already opened <a href="https://github.com/nikomatsakis/rust-design-axioms/issues">issues</a> for some of the things I am wondering about, such as:</p>
<ul>
<li><a href="https://github.com/nikomatsakis/rust-design-axioms/issues/1">nikomatsakis/rust-design-axioms#1</a>: Maybe we need a &ldquo;performant&rdquo; axiom? Right now, the idea of &ldquo;zero-cost abstractions&rdquo; and &ldquo;the default thing is also the most efficient one&rdquo; feels a bit smeared across &ldquo;transparent and tunable&rdquo; and &ldquo;productive&rdquo;.</li>
<li><a href="https://github.com/nikomatsakis/rust-design-axioms/issues/2">nikomatsakis/rust-design-axioms#2</a>: Is &ldquo;portability&rdquo; sufficiently important to pull out from &ldquo;productivity&rdquo; into its own axiom?</li>
<li><a href="https://github.com/nikomatsakis/rust-design-axioms/issues/3">nikomatsakis/rust-design-axioms#3</a>: Are &ldquo;versatility&rdquo; and &ldquo;orthogonality&rdquo; really expressing something different from &ldquo;productivity&rdquo;?</li>
</ul>
<p><a href="https://nikomatsakis.github.io/rust-design-axioms/">Check it out!</a></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I have a Google alert for &ldquo;Rust&rdquo; and I cannot tell you how often it seems that some sports team or another shakes off Rust. I&rsquo;d never heard that expression before signing up for this Google alert.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I&rsquo;m perhaps a bit unusual in my love for things like Amazon&rsquo;s <a href="https://www.amazon.jobs/content/en/our-workplace/leadership-principles">Leadership Principles</a>. I can totally understand why, to many people, they seem like corporate nonsense. But if there&rsquo;s one theme I&rsquo;ve seen consistently over my time working on Rust, it&rsquo;s that <em>process and structure are essential</em>. Take a look at the <a href="https://youtu.be/J9OFQm8Qf1I?si=0L6jkbD501-_ACka">&ldquo;People Systems&rdquo; keynote that Aaron, Ashley, and I gave at RustConf 2018</a> and you will see that theme running throughout. So many of Rust&rsquo;s greatest practices &ndash; things like the teams or RFCs or public, rfcbot-based decision making &ndash; are an attempt to take some kind of informal, unstructured process and give it shape.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>I really like this <a href="http://learningforaction.com/articulate-the-hypothesis">Learning for Action page</a>, which I admit I found just by <a href="https://letmegooglethat.com/?q=strategy+articulate+a+hypothesis">googling for &ldquo;strategy articulate a hypotheses&rdquo;</a>. I&rsquo;m less into this <a href="https://www.linkedin.com/pulse/strategy-hypothesis-bryan-whitefield-1c">super corporate-sounding LinkedIn post</a>, but I have to admit I think it&rsquo;s right on the money.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content></entry><entry><title type="html">Project Goals</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/11/28/project-goals/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/11/28/project-goals/</id><published>2023-11-28T00:00:00+00:00</published><updated>2023-11-28T10:43:59-05:00</updated><content type="html"><![CDATA[<p>Lately I&rsquo;ve been iterating on an idea I call <strong>project goals</strong>. <strong>Project goals</strong> are a new kind of RFC that defines a specific goal that a specific group of people hope to achieve in a specific amount of time &ndash; for example, <em>&ldquo;Rusty Spoon Corp proposes to fund 2 engineers full time to stabilize collections that support custom memory allocations by the end of 2023&rdquo;</em>.</p>
<p>Project goals would also include asks from various teams that are needed to complete the goal. For example, <em>&ldquo;Achieving this goal requires a dedicated reviewer from the compiler team along with an agreement from the language design team to respond to RFCs or nominated issues within 2 weeks.&rdquo;</em> The decision of whether to accept a goal would be up to those teams who are being asked to support it. If those teams approve the RFC, it means they agree with the goal, and also that they agree to commit those resources.</p>
<p><strong>My belief is that project goals become a kind of incremental, rolling roadmap, declaring our intent to fix specific problems and then tracking our follow-through (or lack thereof).</strong> As I&rsquo;ll explain in the post, I believe that a mechanism like project goals will help our morale and help us to get shit done, but I also think it&rsquo;ll help with a bunch of other ancillary problems, such as providing a clearer path to get involved in Rust as well as getting more paid maintainers and contributors.</p>
<p>At the moment, project goals are just an idea. My plan is to author some sample goals to iron out the process and then an RFC to make it official.</p>
<h2 id="driving-a-goal-in-the-rust-project-is-an-uncertain-process">Driving a goal in the Rust project is an uncertain process</h2>
<p>Rust today has a lot of half-finished features waiting for people to invest time into them. But figuring out how to do so can be quite intimidating. You may have to trawl through github or Zulip threads to figure out what&rsquo;s going on. Once you&rsquo;ve done that, you&rsquo;ll likely have to work through some competing constraints to find a proposed solution. But that stuff isn&rsquo;t the real problem. The real problem is that, once you&rsquo;ve invested that time and done that work, <strong>you don&rsquo;t really know whether anyone will care enough about your work to approve it</strong>. There&rsquo;s a good chance you&rsquo;ll author an RFC, or a PR, and nobody will even respond to it.</p>
<p>Rust teams today often operate in a fairly reactive mode, without clear priorities. The official Rust procedures are almost exclusively &lsquo;push&rsquo;, and often based on evaluating <em>artifacts</em>, not intentions &ndash; people decide a problem they would like to see solved, and write an RFC or a PR to drive it forward; the teams decide whether to accept that work. But there is no established way to get feedback from the team on whether this is a problem &ndash; or an approach to the problem &ndash; that would be welcome. Or, even if the team does theoretically want the work, there is no real promise from the team that they&rsquo;ll respond or accountability when they do not.</p>
<p>We do try to be proactive and talk about our goals. Teams sometimes post lists of aspirations or roadmaps to Inside Rust, for example, and we used to publish annual roadmaps as a project. But these documents have never seemed very successful to me. <strong>There is a fundamental tension that is peculiar to open source: the teams are not the ones doing the work.</strong> Teams review and provide feedback. Contributors do the work, and ultimately they decide what they will work on (or if they will do work at all). It&rsquo;s hard to plan for the kinds of things you will do when you don&rsquo;t know what resources you have. A more reliable barometer of the Rust project&rsquo;s priorities has been to read the personal blogs of the people doing the work, where they talk about the goals they personally plan to drive.</p>
<h2 id="this-uncertainty-holds-back-investment">This uncertainty holds back investment</h2>
<p>The uncertainty involved in trying to push an idea forward in Rust is a major deterrent for companies thinking about investing in Rust. I hear about this gap from virtually every angle:</p>
<ul>
<li>Imagine you&rsquo;re a developer who wants to use paid time to work on open source. How do you convince your manager it makes sense? Right now, the best you can do is &quot;I think I can make progress, and besides, it&rsquo;s the right thing to do!&quot;</li>
<li>Imagine you&rsquo;re a contractor who wants to deliver for a client. They want to pay you to help drive a feature over the finish line &ndash; but you can&rsquo;t be sure if you&rsquo;re going to be able to deliver, since it will require consensus from a Rust team, and it&rsquo;s unclear whether it meets their priorities.</li>
<li>Imagine you&rsquo;re a CTO considering whether to adopt Rust for your company. You see that there are gaps in an area, but you don&rsquo;t know whether that is something the project is actively looking to close, or what.</li>
<li>Or maybe you&rsquo;re a CTO who has adopted Rust and is looking to &ldquo;give back&rdquo; to the community by contributing. You want to help deliver support for a feature you need and that you know a lot of people in the community would like, but you can&rsquo;t figure out how to get started, and you can&rsquo;t afford to have an engineer or two work on something for months without a return.</li>
</ul>
<h2 id="but-some-things-work-really-well-and-we-dont-want-to-lose-those">But some things work really well and we don&rsquo;t want to lose those</h2>
<p>Rust&rsquo;s development may be chaotic, but there&rsquo;s a beauty to it as well. As Mara&rsquo;s classic blog post put it, <a href="https://blog.m-ou.se/rust-is-not-a-company/">&ldquo;Rust is not a company&rdquo;</a>. Rust&rsquo;s current structure allows for a feature to make progress in fits and starts, which means we can accommodate many different interest levels and motivations. Someone who is motivated can author and contribute an RFC, and then disappear. Somebody else can pick up the ball and move the implementation forward. And yet a third person can drive the docs and stabilization over the finish line. This is not only cool to watch, it also means that some features get done that would never be &ldquo;top priority&rdquo;. Consider <code>let-else</code> &ndash; this is one of the most popular features from the last few years, and yet, compared against core enablers like &ldquo;async fn in trait&rdquo;, it clearly takes second place in the priority list. But that&rsquo;s fine, there are plenty of folks who don&rsquo;t have the time or expertise to work on async fn in trait, but they can move <code>let-else</code> forward. <strong>It&rsquo;s really important to me that we don&rsquo;t lose this.</strong></p>
<h2 id="proposal-project-goal-rfcs">Proposal: project goal RFCs</h2>
<p>So, top-down roadmaps are a poor fit for open-source. But working purely bottom-up has its own downsides. What can we do?</p>
<p>My proposal is to form roadmaps, but to do it bottom-up, via a new kind of RFC called a <strong>project goal RFC</strong>. A regular RFC proposes a solution to a problem. A project goal RFC proposes a <strong>plan to solve a particular problem in a particular timeframe</strong>. This could be specific, like <em>&ldquo;stabilize support for async closures in 2024&rdquo;</em>, or it could be more general, like <em>&ldquo;land nightly support for managing resource cleanup in async functions in 2024&rdquo;</em>. What it can&rsquo;t be is non-actionable, such as <em>&ldquo;simplify async programming in 2024&rdquo;</em> or <em>&ldquo;make async Rust nice in 2024&rdquo;</em>.</p>
<p>Project goal RFCs are opened by the <strong>goal owners</strong>, the people proposing to do the work. They are approved by the <strong>teams</strong> which will be responsible for approving that work.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> The RFC serves as a kind of <strong>contract</strong>: the owners will drive the work and the team will review that work and/or provide other kinds of support (such as mentorship).</p>
<h3 id="project-goal-rfcs-are-aimed-squarely-at-larger-projects">Project goal RFCs are aimed squarely at larger projects</h3>
<p>Project goal RFCs are not appropriate for all projects. In fact, they&rsquo;re not appropriate for <em>most</em> projects. They are meant for larger, flagship projects, the kind where you want to be sure that the project is aligned around the goals before you start investing heavily. Here are some examples where I think project goal RFCs would be useful&hellip;</p>
<ul>
<li>The async WG <a href="https://blog.rust-lang.org/inside-rust/2023/05/03/stabilizing-async-fn-in-trait.html">set an &ldquo;unofficial&rdquo; project goal of shipping async functions in traits this year</a> (<a href="https://github.com/rust-lang/rust/pull/115822">coming Dec 28!</a>). Honestly, setting a goal like this felt a bit uncomfortable, as we didn&rsquo;t have a means to make it &ldquo;official and blessed&rdquo;. I think that would have also helped during the push to stabilization, since we could reference this goal to help make the case for &ldquo;time to ship&rdquo;.</li>
<li>Goals might also take the shape of internal improvements. The types team is driving a flagship goal to ship a new trait solver. Authoring a project goal RFC would help bring visibility to this work and would also make it easier to make the case for funding work on this project.</li>
<li>I sometimes help to mentor collaborations with people in universities or with Master&rsquo;s students. Project goals would let us set expectations up front about what work we expect to do during that time.</li>
<li>I&rsquo;d like to drive consensus around the idea of <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/30/profiles/">easing tradeoffs with profiles</a> &ndash; but I don&rsquo;t want to start off with an RFC that is going to focus discussion on the details of how profiles are specified. I want to start off by getting alignment around whether to do something like profiles at all. Wearing my Amazon manager hat, having alignment there would also influence whether I allocated some of our team&rsquo;s bandwidth to work on that. A project goal could be perfect for that.</li>
<li>The Foundation has run several project grant programs, and one of the challenges has been trying to choose projects to fund which will be welcomed by the project. As I&rsquo;ve been saying, we don&rsquo;t really have a mechanism for making those sorts of decisions.</li>
<li>The embedded working group or the Rust For Linux folks have a bunch of pain points. I think it&rsquo;s been hard for us to manage cooperation between those really important efforts and the other Rust teams. Developing a joint project goal would be a way to highlight needs.</li>
<li>Someone who wants to work on Rust at their company could work with a team to develop an official goal that they can show to their manager to get authorized work time.</li>
<li>Companies that want to invest in Rust to close gaps could propose project goals. For example, I frequently get asked how a company can help move custom allocators forward. One candidate that comes up a lot is support for custom allocators and collections with fallible allocation. This same mechanism would also allow larger companies to propose goals that they&rsquo;d like to drive. For example, there was a <a href="https://rust-lang.github.io/rfcs/3191-debugger-visualizer.html">recent RFC on debugger visualization aimed at better support for debugging Rust in Windows</a>. I could imagine folks from Microsoft proposing some goals in that area.</li>
</ul>
<h2 id="anatomy-of-a-project-goal-rfc">Anatomy of a project goal RFC</h2>
<p>Project goal RFCs need to include enough detail that both the owners and the teams know what they are signing up for. I believe a project goal RFC should answer the following questions:</p>
<ul>
<li><strong>Why</strong> is this work important?</li>
<li><strong>What</strong> work will be done on what <strong>timeframe</strong>?
<ul>
<li>This should include&hellip;
<ul>
<li><strong>milestones</strong> you will meet along the way,</li>
<li><strong>specific use-cases</strong> you plan to address,</li>
<li>and <strong>guiding principles</strong> that will be used during design.</li>
</ul>
</li>
</ul>
</li>
<li><strong>Who</strong> will be doing the work, and how much time will they have?</li>
<li>What <strong>support</strong> is needed and from which Rust teams?</li>
</ul>
<p>The list above is intentionally somewhat detailed. <strong>Project goal RFCs are not meant to be used for everything.</strong> They are meant to be used for goals that are big enough that doing the planning is worthwhile. The planning also helps the owners and the teams set realistic timelines. (My assumption is that the first few project goals we set will be wildly optimistic, and over time we learn to temper our expectations.)</p>
<h3 id="why-is-this-work-important"><strong>Why</strong> is this work important?</h3>
<p>Naturally whenever we propose to do something, it is important to explain <strong>why</strong> this thing is worth doing. A quality project goal will lay out the context and motivation. The goal is for the owners to explain to the team why the team should dedicate their maintenance bandwidth to this feature. It&rsquo;s also a space for the owners to explain to the world why they feel it&rsquo;s worth their time to do the work to develop this feature.</p>
<h3 id="what-will-be-done-and-on-what-timeframe">What will be done and on what timeframe?</h3>
<p>The heart of the project goal is declaring what work is to be done and when it will be done by. It&rsquo;s important that this &ldquo;work to be done&rdquo; is specific enough to be evaluated. For example, <em>&ldquo;make async nice next year&rdquo;</em> is not a good goal. Something like <em>&ldquo;stabilize async closures in 2024&rdquo;</em> is good. It&rsquo;s also ok to just talk about the problem to be solved, if the best solution isn&rsquo;t known yet. For example, <em>&ldquo;deliver nightly support for managing resource cleanup in async programs in 2025&rdquo;</em> is a good goal that could be solved by &ldquo;async drop&rdquo; but also by some other means.</p>
<h4 id="scaling-work-with-timeframes-and-milestones">Scaling work with timeframes and milestones</h4>
<p>Goals should always include a <strong>specific timeframe</strong>, such as &ldquo;in 2024&rdquo; or &ldquo;in 2025&rdquo;. I think these timeframes will typically be about a year. If the time is too short, then the work is probably not significant enough to call it a goal. But if the timeframe is much longer than a year, then it&rsquo;s probably best to scale back the &ldquo;work to be done&rdquo; to something more intermediate.</p>
<p>Of course, many goals will be part of a bigger project. For example, if one took a goal to deliver nightly support for something in 2024, then the next year, one might propose a goal to stabilize that support.</p>
<p>Ideally, the goal will also include <strong>milestones</strong> along the way. For example, if the goal is to have something stable in 1 year, it might begin with an RFC after 3 months, then 3 months of impl, 3 months of gaining experience, and 3 months for stabilization.</p>
<h4 id="pinning-things-down-with-use-cases">Pinning things down with use-cases</h4>
<p>Unlike a feature RFC, a project goal RFC does not specify a precise design for the feature in question. Even if the project goal is something relatively specific, like &ldquo;add support for async functions in traits&rdquo;, there will still be a lot of ambiguity about what counts as success. For example, we decided to stabilize async functions in traits without support for <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">send bounds</a>. This means that some use cases, notably a crate like <a href="https://crates.io/crates/tower">tower</a>, aren&rsquo;t supported yet. Does this count as success? To help pin this down, the project goal should include a list of use cases that it is trying to address.</p>
<h4 id="establishing-guiding-principles-early">Establishing guiding principles early</h4>
<p>Finally, especially when goals involve a fair bit of design leeway, it is useful to lay down some of the guiding principles the goal owners expect to use. I think having discussion around these principles early will really help focus discussions later on. For example, when discussing how dynamic dispatch for async functions in traits should work, Tyler Mandry and I had an <a href="https://smallcultfollowing.com/babysteps/blog/2021/09/30/dyn-async-traits-part-1/">early goal that it should &ldquo;just work&rdquo; for simple cases</a> but give the ability to customize behavior. <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/">But we quickly found that ran smack into Josh&rsquo;s prioritization of allocation transparency.</a> This conflict was predictable and I think it would have been useful to have had the discussion around these tenets early as a lang team, rather than waiting.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<h3 id="who-will-be-doing-the-work-and-how-much-time-will-the-have"><strong>Who</strong> will be doing the work, and how much time will they have?</h3>
<p>Part of the goal is specifying who is going to be doing the work. For example, the goal might say &ldquo;two developers to work at 50% time&rdquo;. It might also say something more flexible, like &ldquo;one developer to create quest issues and then mentor a group of volunteers to drive most of the work&rdquo;. If possible, including specific names is useful too, particularly in more specialized areas. For example, &ldquo;Ralf Jung and one graduate student will pursue an official set of rules for stacked borrows&rdquo;.</p>
<h3 id="what-support-is-needed-and-from-which-rust-teams">What <strong>support</strong> is needed and from which Rust teams?</h3>
<p>This section is where the project goal owners make asks of the project. Here are some typical asks that I expect we will have:</p>
<ul>
<li>A dedicated reviewer for PRs to the compiler and an expected <a href="https://en.wikipedia.org/wiki/Service-level_agreement">SLA</a> of reviews within 3 days (or 1 week, or something).</li>
<li>An agreement from the lang team to review and provide feedback on RFCs.</li>
<li>Mentorship on some aspect or other.</li>
</ul>
<p>I think teams should suggest the expected shape of asks and track their resources. For example, the lang team can probably manage only a small number of &ldquo;prioritized RFCs&rdquo; at a time, so if there are more project goals, they may have to wait or accept a lower SLA.</p>
<h2 id="tracking-progress">Tracking progress</h2>
<p>One of the interesting things about project goals is that they give us an immediate roadmap. I would like to see the project author a quarterly report &ndash; which means every 12 weeks, or two release cycles. This report would include all the current project goals and updates on their progress. Did they make their declared milestones? If not, why not? Because project goals don&rsquo;t cover the entirety of the work we do, the report could also include other significant developments. This would be published on the main Rust blog and would let people follow along with Rust development and get a sense for our current trajectory.</p>
<p>One thing I&rsquo;ve learned, though: <strong>you can&rsquo;t require the goal owners to author that blog post</strong>. It would be much better to have a dedicated person or team authoring the blog posts and pinging the goal owners to get those status updates. Preparing an update so that it can be understood by a mass audience is its own sort of skill. Moreover, goal owners will be tempted to put it off, and the updates won&rsquo;t happen. I think it&rsquo;s quite important that these project updates happen every quarter, like clockwork, just as our Rust releases do. This is true even if the update has to ship without an update from some goals.</p>
<p>I envision this progress tracking as providing a measure of accountability. When somebody takes a goal, we&rsquo;ll be able to follow along with their progress. I&rsquo;ve seen at Amazon and elsewhere that having written down a goal and declared milestones, and then having to say whether you&rsquo;ve met them, helps to keep teams focused on getting the job done. I often find that I have a job about 95% done but then, in the week before I have to write an update about it, I&rsquo;m inspired to go and finish that last 5%.</p>
<h2 id="conclusion-next-steps">Conclusion: next steps</h2>
<p>My next step is that I am going to fashion an RFC making the case for project goals. This RFC will include a template. To try out the idea, I plan to also author an example project goal for &ldquo;async function in traits&rdquo; and perhaps some other ongoing or proposed efforts. In truth, I don&rsquo;t think we <em>need</em> an RFC to do project goals &ndash; nothing is stopping us from accepting whatever RFC we want &ndash; but I see some value in spelling out and legitimizing the process. I think this probably ought to be approved by the governance council, which is an interesting test for that new group.</p>
<p>There are some follow-up questions worth discussing. One of the ones I think is most interesting is how to manage the quarterly project updates. This deserves a post of its own. The short version of my opinion is that I think it&rsquo;d be great to have an open source &ldquo;reporting&rdquo; team that has the job of authoring this update and others of its ilk. I suspect that this team would work best if we had one or more people paid to participate and to bear the brunt of some of the organizational lift. I further suspect that the Foundation would be a good place for at least one of those people. But this is getting pretty speculative by now and I&rsquo;d have to make the case to the board and Rust community that it&rsquo;s a good use for the Foundation budget, which I certainly have not done.</p>
<p>It&rsquo;s worth noting that I see project goal RFCs as just one piece of a larger puzzle that is giving a bit more structure to our design effort. One thing I think went wrong in prior efforts was that we attempted to be too prescriptive and too &ldquo;one size fits all&rdquo;. These days I tend to think that the only thing we <em>must have</em> to add a new feature to stable is an FCP-binding decision from the relevant team(s). All the rest, whether it be authoring a feature RFC or creating a project goal RFC, are steps that make sense for projects of a certain magnitude, but not everything. Our job then should be to lay out the various kinds of RFCs one can write and when they are appropriate for use, and then let the teams judge how and when to request one.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In theory, anyway. In practice, I imagine that many team maintainers may keep some draft project goal RFCs in their pocket, looking for someone willing to do the work.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The question of how to make <code>dyn</code> async traits easy to use <em>and</em> transparent remains unresolved, which is partly why I&rsquo;m keen on something like <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/30/profiles/">profiles</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content></entry><entry><title type="html">Idea: "Using Rust", a living document</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/10/20/using-rust/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/10/20/using-rust/</id><published>2023-10-20T00:00:00+00:00</published><updated>2023-10-20T14:29:21-04:00</updated><content type="html"><![CDATA[<p>A few years back, the Async Wg tried something new. We collaboratively authored an <a href="https://rust-lang.github.io/wg-async/vision">Async Vision Doc</a>. The doc began by writing <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo.html">&ldquo;status quo&rdquo; stories</a>, written as narratives from our <a href="https://rust-lang.github.io/wg-async/vision/characters.html">cast of characters</a>, that described how people were experiencing Async Rust at that time and then went on to plan a <a href="https://rust-lang.github.io/wg-async/vision/shiny_future.html">&ldquo;shiny future&rdquo;</a>. This was a great experience. My impression was that authoring the &ldquo;status quo&rdquo; stories <em>in particular</em> was really helpful. Discussions at EuroRust recently got me wondering: <strong>can we adapt the &ldquo;status quo&rdquo; stories to something bigger?</strong> What if we could author a living document on the Rust user experience? One that captures what people are trying to do with Rust, where it is working really well for them, and where it could use improvement. I love this idea, and the more I thought about it, the more I saw opportunities to use it to improve other processes, such as planning, public communication, and RFCs. But I&rsquo;m getting ahead of myself! Let&rsquo;s dive in.</p>
<h2 id="tldr">TL;DR</h2>
<p>I think authoring a living document (working title: &ldquo;Using Rust&rdquo;) that collects <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo.html">&ldquo;status quo&rdquo; stories</a> could be a tremendous resource for the Rust community. I&rsquo;m curious to <a href="mailto:rust@nikomatsakis.com">hear from</a> folks who might like to be part of a group authoring such a document, especially (but not only) people with experience as product managers, developer advocates, or UX researchers.</p>
<h2 id="open-source-is-full-of-ideas-but-which-to-do">Open source is full of ideas, but which to do?</h2>
<p>The Rust open-source organization is a raucous, chaotic, and, at its best, joyful environment. People are bubbling with ideas on how to make things better (some better than others). There are also a ton of people who want to be involved, but don&rsquo;t know what to do. This sounds great, but it presents a real challenge: <strong>how do you decide which ideas to do?</strong></p>
<p>The vast majority of ideas for improvement tend to be incremental. They take some small problem and polish it. If I sound disparaging, I don&rsquo;t mean to be. This kind of polish is <strong>absolutely essential</strong>. It&rsquo;s kind of ironic: there&rsquo;s always been a perception that open source can&rsquo;t build a quality product, but my experience has often been the opposite. Open source means that people show up out of nowhere with PRs that remove sharp edges. Sometimes it&rsquo;s an edge you knew was there but didn&rsquo;t have time to fix; other times it&rsquo;s a problem you weren&rsquo;t aware of, perhaps because of the <a href="https://en.wikipedia.org/wiki/Curse_of_knowledge">Curse of Knowledge</a>.</p>
<p>But finding those <strong>revolutionary</strong> ideas is harder. To be clear, it&rsquo;s hard in any environment, but I think it&rsquo;s particularly hard in open source. A big part of the problem is that open source has always focused on <strong>coding</strong> as our basic currency. Discussions tend to orient around specific proposals &ndash; that could be as small as a PR or as large as an RFC. But finding a revolutionary idea doesn&rsquo;t start from coding or from a specific idea.</p>
<h2 id="it-all-starts-with-the-status-quo">It all starts with the &ldquo;status quo&rdquo;</h2>
<p>So how do we go about having more &ldquo;revolutionary ideas&rdquo;? My experience is that it begins by <strong>deeply understanding the present moment</strong>. It&rsquo;s amazing how often we take the &ldquo;status quo&rdquo; for granted. We assume that we know the problems people experience, and we assume that everybody else knows them too. In reality, we only know the problems that we <em>personally</em> experience &ndash; and most of the time we are not even fully aware of those!</p>
<p>One thing <a href="https://smallcultfollowing.com/babysteps/blog/2021/05/01/aic-vision-docs/#start-with-the-status-quo">I remember from authoring the async vision doc</a> is <strong>how hard it was to focus on the &ldquo;status quo&rdquo;</strong> &ndash; and how rewarding it was when we did! When you get people talking about the problems they experience, the temptation is to <em>immediately</em> jump to how to fix the problem. But if you resist that, and you force yourself to just document the current state, you&rsquo;ll find you have a much richer idea of the problem.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> And that richer understanding, in turn, gives rise to better ideas for how to fix it.</p>
<h2 id="idea-a-living-using-rust-document">Idea: a living &ldquo;Using Rust&rdquo; document</h2>
<p>So here is my idea: what if we created a living document, working title &ldquo;Using Rust&rdquo;, that aims to capture the &ldquo;status quo&rdquo; of Rust today:</p>
<ul>
<li>What are people building with Rust?</li>
<li>How are people&rsquo;s Rust experiences influenced by their background (e.g., prior programming experience, native language, etc)?</li>
<li>What is working well?</li>
<li>What challenges are they encountering?</li>
</ul>
<p>Just as with the Async Vision Doc, I imagine &ldquo;Using Rust&rdquo; would cover the whole gamut of experiences, including not just the language itself but tooling, libraries, etc. Unlike the vision doc, I wouldn&rsquo;t narrow it to async (though we might start by focusing on a particular domain to prove out the idea).</p>
<p>Like the vision doc, I imagine &ldquo;Using Rust&rdquo; would be composed of a series of vignettes, expressed in narrative form, using a similar <a href="https://rust-lang.github.io/wg-async/vision/characters.html">set of personas</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> to the Async Vision Doc (perhaps with variations, like Spanish-speaking Alano instead of Alan).</p>
<p>I personally found the narratives really helpful to get the emotional &ldquo;heft&rdquo; of some of the stories. For example, <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/alan_started_trusting_the_rust_compiler_but_then_async.html">&ldquo;Alan started trusting the Rust compiler, but then&hellip; async&rdquo;</a> helped drive home the importance of that &ldquo;if it compiles, it works&rdquo; feeling for Rust users, as well as the way that panics can undermine it. Even though these are narratives, they can still dive deep into technical details. Researching and writing <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_battles_buffered_streams.html">&ldquo;Barbara battles buffered streams&rdquo;</a>, for example, really helped me to appreciate the trickiness of async cancellation&rsquo;s semantics.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>I don&rsquo;t think &ldquo;Using Rust&rdquo; would ever be finished, nor would I narrow it to one domain. Rather, I imagine it being a living document, one that we continuously revise as Rust changes.</p>
<h2 id="improving-on-the-async-vision-doc">Improving on the async vision doc</h2>
<p>The async vision doc experience was great, but I learned a few things along the way that I would do differently now. One of them is that <strong>collecting stories is good, but synthesizing them is better</strong> (and harder). I also found that <strong>people telling you the stories are not always the right ones to author them</strong>. Last time, we had a lot of success with people authoring PRs, but many times people would tell a story, agree to author a PR, and then never follow up. This is pretty standard for open source but it also applies a sort of &ldquo;selection bias&rdquo; to the stories we got. <strong>I would address both of these problems by dividing up the roles.</strong> Rust users would just have to tell their stories. There would be a group of maintainers who would record those stories and then go try to author the PRs that integrate into &ldquo;Using Rust&rdquo;.</p>
<p>The other thing I learned is that trying to <strong>author a single shiny future does not work</strong>. It was meant to be a unifying vision for the group, but there are just too many variables at play to reach consensus on that. We should <strong>definitely</strong> be talking about where we will be in 5 years, but we don&rsquo;t have to be entirely aligned on it. We just have to agree on the right next steps. <strong>My new plan is to integrate the &ldquo;shiny future&rdquo; into RFCs, as I describe below.</strong></p>
<h2 id="maintaining-using-rust">Maintaining &ldquo;Using Rust&rdquo;</h2>
<p>In the fullness of time, and presuming it works out well, I think &ldquo;Using Rust&rdquo; should be a rust-lang project, owned and maintained by its own team. My working title for this team is the <em>User Research Team</em>, which has the charter of gathering up data on how people use Rust and putting that data into a form that makes it accessible to the rest of the Rust project. But I tend to think it&rsquo;s better to prove out ideas before creating the team, so I think I would start with an experimental project, and create the team once we demonstrate the concept is working.</p>
<h2 id="gathering-stories">Gathering stories</h2>
<p>So how would this team go about gathering data? There are so many ways. When doing the async vision doc, we got some stories submitted by PRs on the repo. We ran <a href="https://smallcultfollowing.com/babysteps/blog/2021/03/22/async-vision-doc-writing-sessions/">writing sessions</a> where people would come and tell us about their experiences.</p>
<p>I think it&rsquo;s very valuable to have people gather &ldquo;in depth&rdquo; data from within specific companies. For the Async Vision Doc, I also interviewed team members, culminating in the &ldquo;meta-story&rdquo; <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/aws_engineer.html">&ldquo;Alan extends an AWS service&rdquo;</a>. Tyler Mandry and I also met with members from Google, and I recall we had folks from Embark and a few other companies reach out to tell us about their experiences.</p>
<p>Another really cool idea that came from Pietro Albini: set up a booth at various Rust conferences where people can come up and tell you about their stories. Or perhaps we can run a workshop. So many possibilities!</p>
<h2 id="integrating-using-rust-with-the-rfc-process">Integrating &ldquo;Using Rust&rdquo; with the RFC process</h2>
<p>The purpose of an RFC, in my mind, is to lay out a problem and a specific solution to that problem. The RFC is not code. It doesn&rsquo;t have to be a complete description of the problem. But it should be complete enough that people can imagine how the problem is going to be solved.</p>
<p>Every RFC includes a motivation, but when I read those motivations, I am often a bit at a loss as to how to evaluate them. Clearly there is some kind of problem. But is it important? How does it rank with respect to other problems that users are encountering?</p>
<p>I imagine that the &ldquo;Using Rust&rdquo; doc would help greatly here. I&rsquo;d like to get to the point where the motivation for RFCs is primarily addressing particular stories or aspects of stories within the document. We would then be able to read over other related stories to get a sense for how this problem ranks compared to other problems for that audience, and thus how important the motivation is.</p>
<p>RFCs can also include a section that &ldquo;retells&rdquo; the story to explain how it would have played out had this feature been available. I&rsquo;ve often found that doing this helps me to identify obvious gaps. For example, maybe we are adding a nifty new syntax to address an issue, but how will users learn about it? Perhaps we can add a &ldquo;note&rdquo; to the diagnostic to guide them.</p>
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="will-this-help-us-in-cross-team-collaboration">Will this help us in cross-team collaboration?</h3>
<p>Like any organization, the Rust organization can easily wind up &ldquo;shipping its org chart&rdquo;. For example, if I see a problem, as a lang-team member, I may be inclined to ship a language-based solution for it; similarly, I&rsquo;ve seen that the embedded community works very hard to work within the confines of Rust as it is, whereas sometimes they could be a lot more productive if we added something to the language.</p>
<p>Although it is not a complete solution, I think having a &ldquo;Using Rust&rdquo; document will be helpful. Focusing on describing the problem means it can be presented to multiple teams and each can evaluate it to decide where the best solution lies.</p>
<h3 id="what-about-other-kinds-of-stories">What about other kinds of stories?</h3>
<p>I&rsquo;ve focused on stories about Rust users, but I think there are other kinds of stories we might want to include. For example, what about the trials and travails of <a href="https://rust-lang.github.io/wg-async/vision/characters.html">Alan, Barbara, Grace, and Niklaus</a> as they try to contribute to Rust?</p>
<h3 id="how-will-we-avoid-scenario-solving">How will we avoid &ldquo;scenario solving&rdquo;?</h3>
<p>Scenario solving refers to a pattern where a feature is made to target various specific examples rather than being generalized to address a pattern of problems. It&rsquo;s possible that if we write out user stories, people will design features to target <em>exactly</em> the problems that they read about, rather than observing that a whole host of problems can be addressed via a single solution. That is true, and I think teams will want to watch out for that. At the same time, I think that having access to a full range of stories will make it much easier to <em>see</em> those large patterns and to help identify the full value for a proposal.</p>
<h3 id="what-about-a-project-management-team">What about a project management team?</h3>
<p>From time to time there are proposals to create a &ldquo;project management&rdquo; team. There are many different shapes for what such a team would do, but the high-level motivation is to help provide &ldquo;overall guidance&rdquo; and ensure coherence between the Rust teams. I am skeptical about any idea that sounds like an &ldquo;overseer&rdquo; team. I trust the Rust teams to own and maintain their area. But I do think we can all benefit from getting more alignment on the sets of problems to be solved, which I think this &ldquo;Using Rust&rdquo; document would help to create. I can also imagine other interesting mechanisms that build on the doc, such as reviewing stories as a group online, or at &ldquo;unconferences&rdquo;.</p>
<h2 id="call-to-action-get-in-touch">Call to action: get in touch!</h2>
<p>I&rsquo;m feeling pretty excited about this project. I&rsquo;m contemplating how to go about organizing it. I&rsquo;m really interested to hear from people who would like to take part as authors and collators of user stories. If you think you&rsquo;d be interested to participate, please <a href="mailto:rust@nikomatsakis.com">send me an email</a>. I&rsquo;m particularly interested to hear from people with experience doing this sort of work (e.g., product managers, developer advocates, UX researchers).</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>If you&rsquo;re hearing resonance of the wisdom of the Buddha, it was not intentional when I wrote this, but you are not alone.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The personas/characters may look simple, but developing that cast of characters took a lot of work. Finding a set that is small enough to be memorable but which captures the essentials is hard work. One key insight was separating out the <a href="https://rust-lang.github.io/wg-async/vision/projects.html">projects people are building</a> from the characters building them, since otherwise you get a combinatorial explosion.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Async cancellation is an area I desperately want to return to! I still think we want some kind of structured-concurrency-like solution. My current thinking is roughly that we want something like <a href="https://github.com/nikomatsakis/moro/">moro</a> for task-based concurrency and something like Yosh&rsquo;s <a href="https://blog.yoshuawuyts.com/futures-concurrency-3/#concurrent-stream-processing-with-stream-merge">merged streams</a> for handling &ldquo;expect one of many possible messages&rdquo;-like scenarios.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Eurorust reflections</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/10/14/eurorust-reflections/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/10/14/eurorust-reflections/</id><published>2023-10-14T00:00:00+00:00</published><updated>2023-10-14T12:47:05-04:00</updated><content type="html"><![CDATA[<p>I’m on the plane back to the US from Belgium now and feeling grateful for having had the chance to speak at the <a href="https://eurorust.eu">EuroRust conference</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. EuroRust was the first Rust-focused conference that I’ve attended since COVID (though not the first conference overall). It was also the first Rust-focused conference that I’ve attended in Europe since…ever, from what I recall.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> Since many of us were going to be in attendance, the types team also organized an in-person meetup which took place for 3 days before the conference itself<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. Both the meetup and the conference were great in many ways, and sparked a lot of ideas. I think I’ll be writing blog posts about them for weeks to come, but I thought that to start, I’d write up something general about the conference itself, and some of my takeaways from the experience.</p>
<h3 id="its-great-to-talk-to-people-using-rust">It’s great to talk to people <strong>using</strong> Rust</h3>
<p>When I started on Rust, I figured the project was never going to go anywhere — I mean, come on, we were making a new programming language. What are the odds it’ll be a success? But it still seemed like fun. So I set myself a simple benchmark: I will consider the project a success the first time I see an announcement where somebody built something cool with it, and I didn’t know them beforehand. In those days, everybody using Rust was also hanging out on IRC or on the mailing list.</p>
<p>Well, that turned out to be a touch on the conservative side. These days, Rust has gotten big enough that the core project itself is just a small piece of the action. It’s just amazing to hear all the things people are using Rust for. Just looking at the conference sponsors alone, I loved meeting the <a href="https://www.shuttle.rs/">Shuttle</a> and <a href="https://tauri.app">Tauri</a>/<a href="https://crabnebula.dev/">CrabNebula</a> teams and I got excited about playing with both of them. I had a great time <a href="https://twitter.com/rustrover/status/1712461642666320369">talking to the RustRover team</a> about the possibilities for building custom diagnostics and the ways we could leverage their custom GUI to finally get past the limitations of the terminal when we present error messages. But one of my favorite parts happened on the tram ride home, when I randomly met the maintainer of <a href="https://pyo3.rs">PyO3</a>. Such a cool project, and definite inspiration for work I’ve been doing lately, like <a href="https://duchess-rs.github.io/duchess">duchess</a>.</p>
<h3 id="rust-teachers-everywhere">Rust teachers everywhere</h3>
<p>Speaking of <a href="https://www.shuttle.rs/">Shuttle</a> and <a href="https://tauri.app">Tauri</a>, both of them are interesting in a particular way: they are empowerment efforts in their own right, and so they attract people whose primary interest is not Rust itself, but rather achieving some other goal (e.g., cloud development, or building a GUI application). It&rsquo;s cool to see Rust empowering people to build other empowerment apps, but it&rsquo;s <em>also</em> a fascinating source of data. Both of those projects have started embarking on efforts to teach Rust precisely because that will help grow their userbase. The Shuttle blog has all kinds of interesting articles<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>; the Tauri folks told me about their efforts to build Rust articles specifically targeting JavaScript and TypeScript programmers, which required careful choice of terminology and concepts.</p>
<h3 id="the-whole-rustfest-idea-seems-to-have-really-worked">The whole RustFest idea seems to have really worked</h3>
<p>At some point, RustFest morphed from a particular conference into a kind of ‘meta conference’ organization, helping others to organize and run their own events. Looking over the calendar of Rust events in Europe, I have to say, that looks like it’s worked out pretty dang well. Hats off to y’all on that. There’s <a href="https://eurorust.eu">EuroRust</a>, <a href="https://rustlab.it/">RustLab in Italy</a>, <a href="https://www.rustnationuk.com/">Rust Nation</a> in the UK, and probably a bunch more that I’m not aware of.</p>
<p>I should also say that meeting the conference <em>organizers</em> at this conference was very nice. Both the EuroRust organizers (Marco and Sarah, from <a href="https://mainmatter.com/">Mainmatter</a>) were great to talk to, and I finally got to meet <a href="https://github.com/ernestkissiedu">Ernest</a> (now organizing Rust Nation in the UK), whom I’ve talked to on and off over the years but never met in person.</p>
<p>I do still miss the cozy chats at <a href="https://www.rust-belt-rust.com/">Rust Belt Rust</a> (RIP), but this new generation of Rust conferences (and their organizers) is pretty rad too. Plus I get to eat good cheese and drink beer outdoors, two things that for reasons unbeknownst to me are all too rare in the United States.</p>
<h3 id="the-kids-are-all-right">The kids are all right</h3>
<p>One of my favorite things about being involved in the Rust project has been watching it sustain and reinvent itself over the years. This year at the conference I got to see the “new generation” of Rust maintainers and contributors. Some of them, like @davidtwco, I had met before, but they have since gone from “wanna be” Rust contributor to driving core initiatives like the <a href="https://blog.rust-lang.org/inside-rust/2022/08/16/diagnostic-effort.html">diagnostic translation effort</a>. Others, like @bjorn3, @WaffleLapkin, @Nilstrieb, and even @MaraBos, I had never had a chance to meet before. I love that working on Rust lets you interact with people from all over the world, but there’s nothing like putting a name to a face, and getting to give someone a hug or shake their hand.</p>
<h3 id="but-yeah-theres-that-thing">But yeah, there’s that thing</h3>
<p>So, let me say up front, due to scheduling conflicts, I wasn’t able to attend RustConf this year (or last year, as it happens). But I read <a href="https://blog.adamchalmers.com/rustconf-2023-recap/">Adam Chalmers&rsquo;s blog post</a> that many people were talking about, and I saw this paragraph…</p>
<blockquote>
<p><strong>Rustconf definitely felt sadder and downbeat than my previous visit.</strong> Rustconf 2019 felt jubilant. The opening keynote celebrated the many exciting things that had happened over the last year. Non-lexical lifetimes had just shipped, which removed a ton of confusing borrow checker edge cases. Async/await was just a few short months away from being stabilized, unleashing a lot of high-performance, massively-scalable software. Eliza Weisman was presenting a new async tracing library which soon took over the Rust ecosystem. Lin Clark presented about how you could actually compile Rust into this niche thing called WebAssembly and get Rust to run on the frontend &ndash; awesome! <strong>It felt like Rust had a clear vision and was rapidly achieving its goals. I was super excited to be part of this revolution in software engineering.</strong></p>
</blockquote>
<p>…and it made me feel really sad.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> Rust’s mission has always been empowerment. I’ve always loved the “can do” spirit of Rust, the way we aim high and try to push boundaries in every way we can. <strong>To me, the open source org has always been an important part of how we empower.</strong></p>
<p>Developing a programming language, especially a compiled one, is often viewed as the work of “wizards”, just like systems programming. I think Rust proves that this “wizard-like” reputation has more to do with the limitations of the tools we were using than the task itself. But just like Rust has the goal of making systems programming more practical and accessible, I like to think <em>the Rust org</em> helps to open up language development to a wider audience. I’ve seen so many people come to Rust, full of enthusiasm but not so much experience, and use it to launch a new career.</p>
<p>But, if I’m honest, I’ve also seen a lot of people come into Rust full of enthusiasm and wind up burned out and frustrated. And sometimes I think that’s precisely <em>because</em> of our “sky’s the limit” attitude — sometimes we can get so ambitious, we set ourselves up to crash and burn.</p>
<h3 id="sometimes-thinking-big-means-getting-nowhere">Sometimes “thinking big” means getting nowhere</h3>
<p>Everybody wants to “think big”. And Rust has always prided itself on taking a “holistic view” of problems — we’ve tried to pay attention to the whole project, not just generating good code, but targeting the whole experience with quality diagnostics, a build system, an easy way to manage which Rust version you want, a package ecosystem, etc. But when we look at all the stuff we’ve built, it’s easy to forget how we got there: incrementally and painfully.</p>
<p>I mean, in Ye Olde Days of Rust, we didn’t even have a borrow checker. Soundness was an aspiration, not a reality. And once we got one, it sucked to use, because the design was still stuck in some ‘old style’ thinking. And even once we had INHTWAMA<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>, the error messages were pretty confounding. And once we <a href="https://blog.rust-lang.org/2016/08/10/Shape-of-errors-to-come.html">invented the idea of multiline errors</a>, it wasn’t until late 2018 that we had <a href="https://blog.rust-lang.org/2018/12/06/Rust-1.31-and-rust-2018.html#non-lexical-lifetimes">NLL</a>, which changed the game again. And that’s just the compiler! The story is pretty much the same for every other detail of the language. You used to have to build the compiler with a Makefile that was so complex, I wouldn’t be surprised if it were self-aware.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<p><strong>When I feel burned out, one of the biggest reasons is that I&rsquo;ve fallen into the trap of thinking too big, doing too much, and as a result I am spread too thin and everything seems impossible.</strong> Just look back three years ago: the async working group was driving this crazy project, <a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">the Async Vision Doc</a>, and it seemed like we were on top of the world. We recorded all these stories of how async Rust was hard, and we were thinking about how we could solve it. Not surprisingly, we found that these stories were sometimes language problems, but just as often they were library limitations, or gaps in the tooling, or the docs. And so we set out an <a href="https://rust-lang.github.io/wg-async/vision/roadmap.html">expansive vision, spawning out a ton of subprojects</a>. And all the time, there was a voice in my head saying, “is this really going to work?”</p>
<p>Well, I’d say the answer is “no”. I mean, we made a lot of progress. We are going to stabilize async functions in traits this year, and that is <strong>awesome</strong>. We made a bunch of improvements to async usability, most notably cjgillot’s fantastic PR that improves the accuracy of send bounds and futures, preventing a whole ton of false errors (though that work wasn’t really done in coordination with the async wg effort per se, it’s just because cjgillot is out there silently making huge refactors<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>).</p>
<p>And yet, there’s a lot we didn’t do. We don’t have generators. We didn’t yet find a way to make futures smaller. We didn’t really drive to ground the conversation on structured concurrency. We also took a lot <em>longer</em> to do stuff than I hoped. I thought async functions in traits would ship in 2021 — it’s shipping now, but it’s 2023.</p>
<h3 id="focus-focus-focus-iterate-iterate-iterate">Focus, focus, focus; iterate, iterate, iterate</h3>
<p>One lesson I take away from the async wg experience is focus, focus, focus and iterate, iterate, iterate. <strong>You can (almost) never start too small.</strong> I think we were absolutely right that “doing async right” demands addressing all of those concerns, but I think that we overestimated our ability to coordinate them up front, and as a result, things like shipping async fn in traits took longer than they needed to. We <em>are</em> going to get the async shiny future, but we’re going to get it one step at a time.</p>
<h3 id="also-were-a-lot-bigger-than-we-used-to">Also: we’re a lot bigger than we used to be</h3>
<p>Still, sometimes I find that when I float ideas, I encounter a reflexive bit of pushback: <em>“sounds great, who’s going to do it”</em>. On the one hand, that’s the voice of experience, coming back from one too many Think Big plans that didn’t work out. But on the other, sometimes it feels a bit like “old school” thinking to me. Rust is not the dinky little project it used to be, where we all knew everybody. Rust is used by <a href="https://www.slashdata.co/blog/state-of-the-developer-nation-23rd-edition-the-fall-of-web-frameworks-coding-languages-blockchain-and-more/">millions of developers</a> and is one of <a href="https://www.oreilly.com/radar/technology-trends-for-2023/">the fastest growing languages today</a>; it <a href="https://www.theregister.com/2022/09/20/rust_microsoft_c/">powers</a> <a href="https://aws.amazon.com/blogs/opensource/why-aws-loves-rust-and-how-wed-like-to-help/">the cloud</a> and it’s quite possibly in <a href="https://github.com/Rust-for-Linux">your</a> <a href="https://www.bleepingcomputer.com/news/microsoft/new-windows-11-build-ships-with-more-rust-based-kernel-features/">kernel</a>. In many ways, this growth hasn’t caught up with the open source org: I’d still like to see more companies hiring dedicated teams of Rust developers, or giving their employees paid time to work on Rust<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup>. But I think that growth is coming, especially if we work harder at harnessing it, and I am very excited about what that can mean.</p>
<h3 id="nothing-succeeds-like-success">Nothing succeeds like success</h3>
<p>Now I know that when we talk about burnout, we’re also talking about other kinds of drama. Maybe you think that things like ‘working iteratively’ and having more people or resources are not going to help when the problem is conflicts between people or organizations. And you’re not wrong, it’s not going to solve all conflict. But I also think that an awful lot of conflict ultimately comes out of zero-sum, scarcity-oriented thinking, or from feeling disempowered to achieve the goals you set out to do. To help with burnout, we need to do better at a number of things, including I think helping each other to <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/27/empathy-in-open-source/">practice empathy</a> and manage conflict more productively<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup>, but I think we also need to do better at shipping product.</p>
<h3 id="dont-be-afraid-to-fail--you-got-this">Don’t be afraid to fail — you got this</h3>
<p>One of my favorite conversations from the whole conference happened after the conference itself. I was in the midst of pitching Jack Huey on some of the organizational ideas that I’m really excited about right now, which I think can help bring the Rust project closer to being the empowering, inclusive open-source project it aspires to be. Jack wasn’t sure if they were going to work. “But”, he said, “what the heck, let’s try it! I mean, what have we got to lose? If it doesn’t work, we’ll learn something, and do something else.”<sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup> Hell yes.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>As I usually do, I’ve <a href="https://github.com/nikomatsakis/eurorust-2023">put my slides online</a>. If you’re curious, take a look! If you see a typo, maybe open a PR. The speaker notes have some of the “soundtrack”, though not all of it.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Somehow, I never made it to a RustFest.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>You can find the <a href="https://hackmd.io/cO1NJWTHTVihbE0UCWyRfg">agenda</a> here. It contains links to the briefing documents that we prepared in advance, along with loose notes that we took during the discussions. I expect we’ll author a blog post covering the key developments on the Inside Rust blog.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Including one I can&rsquo;t <em>wait</em> to read about <a href="https://www.shuttle.rs/blog/2023/08/30/using-oauth-with-axum">OAuth</a> &ndash; I tried to understand Github&rsquo;s docs on OAuth and just got completely lost.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Side note, but I think Rust 2024 is shaping up to be another hugely impactful edition. There&rsquo;s a very good chance we&rsquo;ll have <a href="https://blog.rust-lang.org/inside-rust/2023/05/03/stabilizing-async-fn-in-trait.html">async functions in traits</a>, <a href="https://rust-lang.github.io/impl-trait-initiative/explainer/tait.html">type alias impl trait</a>, and <a href="https://blog.rust-lang.org/inside-rust/2023/10/06/polonius-update.html">polonius</a>, each of which is a massive usability and expressiveness win. I&rsquo;m hoping we&rsquo;ll also get <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/15/temporary-lifetimes/">improved temporary lifetimes</a> in the new edition, eliminating the &ldquo;blocking bugs&rdquo; <a href="https://cseweb.ucsd.edu/~yiying/RustStudy-PLDI20.pdf">identified as among the most common in real-world Rust programs</a>. And of course the last few years have already seen let-else, scoped threads, cargo add, and a variety of other changes. Gonna be great!&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>INHTWAMA was the rather awkward (and inaccurate) acronym that we gave to the idea of “aliasing xor mutation” — i.e., the key principle underlying Rust’s borrow checker. The name comes from a blog post I wrote called <a href="https://smallcultfollowing.com/babysteps/blog/2012/11/18/imagine-never-hearing-the-phrase-aliasable/">“Imagine never hearing the phrase aliasable, mutable again”</a>, which @pcwalton incorrectly remembered as “Imagine never hearing the <em>words</em> aliasable, mutable again”, and hence shortened to INHTWAMA. I notice now though that this acronym was also frequently mutated to I<em>M</em>HTWAMA which just makes no sense at all.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>I learned a lot from reading Rust’s <code>Makefile</code> in the early days. I had no idea you could model function calls in <code>make</code> with macros. Brilliant. I’ve always deeply admired Graydon’s <code>Makefile</code> wizardry there, though it occurs to me now that I never checked the git logs &ndash; maybe it was somebody else! I&rsquo;ll have to go look later.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>Side note, but more often than not, I think cjgillot’s approaches are not going to work. And so far I’m 0 for 2 on this, he’s always been right. To <a href="https://twitter.com/BrendanEich/status/1456758350419480580">paraphrase Brendan Eich</a>, “always bet on cjgillot”.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>And I have some thoughts on how we can do better at encouraging them! More on that in some later posts.&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>One of the biggest lessons for me in my personal life has been realizing that not telling people when I feel upset is not necessarily being kind to them and certainly not kind to myself. It seems like avoiding conflict, but it can actually lead to much larger conflicts down the line.&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p>Full confession, this quote is made up out of thin air. I have no memory of what words he used. But this is what he meant!&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Easing tradeoffs with profiles</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/09/30/profiles/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/09/30/profiles/</id><published>2023-09-30T00:00:00+00:00</published><updated>2023-09-30T10:56:35-04:00</updated><content type="html"><![CDATA[<p>Rust helps you to build reliable programs. One of the ways it does that is by surfacing things to your attention that you really ought to care about. Think of the way we handle errors with <code>Result</code>: if some operation can fail, you can&rsquo;t, ahem, fail to recognize that, because you have to account for the error case. And yet often the kinds of things you care about depend on the kind of application you are building. A classic example is memory allocation, which for many Rust apps is No Big Deal, but for others is something to be done carefully, and for still others is completely verboten. But this pattern crops up a lot. I&rsquo;ve heard and like the framing of designing for &ldquo;what do you have to pay attention to&rdquo; &ndash; Rust currently aims for a balance that errs on the side of paying attention to more things, but tries to make them easy to manage. But this post is about a speculative idea of how we could do better than that by allowing programs to declare a <strong>profile</strong>.</p>
<h2 id="profiles-declare-what-you-want-to-pay-attention-to">Profiles declare what you want to pay attention to</h2>
<p>The core idea is pretty simple. A <strong>profile</strong> would be declared, I think, in the <code>Cargo.toml</code>. Profiles would <strong>never</strong> change the semantics of your Rust code. You could always copy and paste code between Rust projects with different profiles and things would work the same. But it <strong>would</strong> adjust lint settings and errors. So if you copy code from a more lenient profile into your more stringent project, you might find that it gets warnings or errors it didn&rsquo;t get before.</p>
<h2 id="primarily-this-means-lints">Primarily, this means lints</h2>
<p>In effect, a profile would be a lot like a lint group. So if we have a profile for kernel development, this would turn on various lints that help to detect things that kernel developers really care about &ndash; unexpected memory allocation, potential panics &ndash; but other projects don&rsquo;t. Much like Rust-for-Linux&rsquo;s existing <a href="https://github.com/Rust-for-Linux/klint">klint</a> project.</p>
<p>So why not just make it a lint group? Well, actually, maybe we should &ndash; but I thought <code>Cargo.toml</code> would be better because it would allow us to apply more stringent checks to what dependencies you use, which features they use, etc. For example, maybe dependencies could declare that some of their features are not well suited to certain profiles, and you would get a warning if your application winds up depending on them. I imagine you would select a profile when running <code>cargo new</code>.</p>
<h2 id="example-autoclone-for-rc-and-arc">Example: autoclone for <code>Rc</code> and <code>Arc</code></h2>
<p>Let&rsquo;s give an example of how this might work. In Rust today, if you want to have many handles to the same value, you can use a reference counted type like <code>Rc</code> or <code>Arc</code>. But whenever you want to get a new handle to that value, you have to explicitly <code>clone</code> it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">map</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="n">HashMap</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">create_map</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">map2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w"> </span><span class="c1">// 👈 Clone!
</span></span></span></code></pre></div><p>The idea of this <code>clone</code> is to call attention to the fact that custom code is executing here. This is not just a <code>memcpy</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. I&rsquo;ve been grateful for this some of the time. For example, when optimizing a concurrent data structure, I really like knowing exactly when one of my reference counts is going to change. But a lot of the time, these calls to clone are just noise, and I wish I could just write <code>let map2 = map</code> and be done with it.</p>
<p>So what if we modify the compiler as follows. Today, when you move out from a variable, you effectively get an error if that is not the &ldquo;last use&rdquo; of the variable:</p>
<pre tabindex="0"><code class="language-rust=" data-lang="rust=">let a = v; // move out from `v` here...
...
read(&amp;v); // 💥 ...so we get an error when we use `v`.
</code></pre><p>What if, instead, when you move out from a value and it is not the last use, we introduce an <em>auto-clone</em> operation. This may fail if the type is not auto-cloneable (e.g., a <code>Vec</code>), but for <code>Rc</code>, <code>Arc</code>, and other O(1) clone operations, it would be equivalent to <code>x.clone()</code>. We could designate which types can be auto-cloneable by extra marker traits, for example. This means that <code>let a = v</code> above would be equivalent to <code>let a = v.clone()</code>.</p>
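<p>To make the desugaring concrete, here is a minimal sketch in today&rsquo;s Rust. The <code>AutoClone</code> marker trait and the <code>auto_clone</code> helper are hypothetical names invented for illustration; nothing like them exists in the language or standard library, and a real design might look quite different. The helper stands in for the call the compiler would insert on a non-final move:</p>

```rust
use std::rc::Rc;

/// Hypothetical marker trait: implemented only for types whose `clone`
/// is cheap (O(1)), such as reference-counted handles.
trait AutoClone: Clone {}

// `Rc<T>` opts in; a type like `Vec<T>` deliberately would not.
impl<T> AutoClone for Rc<T> {}

/// Stand-in for the clone the compiler would insert when a variable is
/// moved from but is still used afterwards.
fn auto_clone<T: AutoClone>(value: &T) -> T {
    value.clone()
}

/// Returns a second handle plus the reference count observed while both
/// handles are alive.
fn share_handle(map: Rc<Vec<u32>>) -> (Rc<Vec<u32>>, usize) {
    // Under the proposal this would be written `let a = map;`.
    let a = auto_clone(&map);
    // `map` is still usable here because the "move" became a clone.
    let count = Rc::strong_count(&map);
    (a, count)
}
```

<p>Calling <code>share_handle(Rc::new(vec![1, 2, 3]))</code> observes a reference count of 2 inside the function, which is exactly the kind of hidden work that the lint described next is meant to surface in lower-level code.</p>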
<p>Now, here comes the interesting part. When we introduce an auto-clone, we would also introduce a lint: implicit clone operation. In the higher-level profile, this lint would be <code>allow</code>-by-default, but in the profile for lower-level code, it would be <code>deny</code>-by-default, with an auto-fix to insert <code>clone</code>. Now when I&rsquo;m editing my concurrent data structure, I still get to see the <code>clone</code> operations explicitly, but when I&rsquo;m writing my application code, I don&rsquo;t have to think about it.</p>
<h2 id="example-dynamic-dispatch-with-async-trait">Example: dynamic dispatch with async trait</h2>
<p>Here&rsquo;s another example. Last year we spent a while exploring the ways that we can enable dynamic dispatch for traits that use async functions. We landed on a design that seemed like it hit a sweet spot. Most users could just use traits with async functions like normal, but they might get some implicit allocations. Users who cared could use other allocation strategies by being more explicit about things. (<a href="https://hackmd.io/@nikomatsakis/SJ2-az7sc">You can read about the design here.</a>) But, as I described in my blog post <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/">The Soul of Rust</a>, this design had a crucial flaw: although it was still <em>possible</em> to avoid allocation, it was no longer <em>easy</em>. This seemed to push Rust over the line from its current position as a systems language that can claim to be a true C alternative into &ldquo;just another higher-level language that can be made low-level if you program with care&rdquo;.</p>
<p>But profiles seem to offer another alternative. We could go with our original design, but whenever the compiler inserted an adapter that might cause boxing to occur, it would issue a lint warning. In the higher-level profile, the warning would be <code>allow</code>-by-default, but in the lower-level profile, it would be <code>deny</code>-by-default.</p>
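<p>One way to picture how profiles would set defaults for the two lints discussed so far (implicit clone and implicit boxing) is a small lookup table. The sketch below is purely illustrative: the profile names, the lint names, and the <code>default_level</code> function are all hypothetical, and real lint levels would be configured through the compiler and Cargo rather than written like this.</p>

```rust
/// Hypothetical profiles; the post only distinguishes "higher-level"
/// from "lower-level" code.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Profile {
    HighLevel,
    LowLevel,
}

/// Hypothetical lints for the two implicit operations described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Lint {
    ImplicitClone,
    ImplicitBoxing,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Level {
    Allow,
    Deny,
}

/// Default level of each lint under each profile.
fn default_level(profile: Profile, lint: Lint) -> Level {
    match (profile, lint) {
        // Application-style code: implicit operations are fine.
        (Profile::HighLevel, _) => Level::Allow,
        // Low-level code: every implicit operation must be spelled out.
        (Profile::LowLevel, Lint::ImplicitClone | Lint::ImplicitBoxing) => Level::Deny,
    }
}
```

<p>Note that the function only selects a lint level. Nothing about how the code behaves at runtime depends on the profile, which is the property the rest of this post leans on.</p>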
<h2 id="example-panic-effects-or-other-capabilities">Example: panic effects or other capabilities</h2>
<p>If you really want to go crazy, we can use annotations to signal various kinds of effects. For example, as one way to achieve panic safety, we might allow functions to be annotated with <code>#[panics]</code>, signaling a function that <em>might</em> panic. Depending on the profile, this might require you to declare that the caller may panic (similar to how <code>unsafe</code> works now).</p>
<p>Depending on how far we want to go here, we would ultimately have to integrate these kinds of checks more deeply into the type system. For example, if you have a <code>fn</code>-pointer, or a <code>dyn Trait</code> call, we would have to introduce &ldquo;may panic&rdquo; effects into the type system to be able to track that information (but we could be conservative and just assume calls by pointer may panic, for example). But we could likely still use profiles to control how much you as the caller choose to care.</p>
<h2 id="changing-the-profile-for-a-module-or-a-function">Changing the profile for a module or a function</h2>
<p>Because profiles primarily address lints, we can also allow you to change the profile in a more narrow way. This could be done with lint groups (maybe each profile is a lint group), or perhaps with a <code>#![profile]</code> annotation.</p>
<h2 id="why-i-care-profiles-could-open-up-design-space">Why I care: profiles could open up design space</h2>
<p>So why am I writing about profiles? In short, I&rsquo;m looking for opportunities to do the classic Rust thing of trying to have our cake and eat it too. I want Rust to be versatile, suitable for projects up and down the stack. I know that many projects contain hot spots or core bits of the code where the details matter quite a bit, and then large swaths of code where they don&rsquo;t matter a jot. I&rsquo;d like to have a Rust that feels closer to Swift that I can use most of the time, and then the ability to &ldquo;dial up&rdquo; the detail level for the code where I do care.</p>
<h2 id="conclusion-the-core-principles">Conclusion: the core principles</h2>
<p>I do want to emphasize that this idea is <strong>speculation</strong>. As far as I know, nobody else on the lang team is into this idea &ndash; most of them haven&rsquo;t even heard about it!</p>
<p>I also am not hung up on the details. Maybe we can implement profiles with some well-named lint groups. Or maybe, as I proposed, it should go in <code>Cargo.toml</code>.</p>
<p>What I do care about are the core principles of what I am proposing:</p>
<ul>
<li>Defining some small set of <strong>profiles</strong> for Rust applications that define the <strong>kinds of things you want to care about</strong> in that code.
<ul>
<li>I think these should be global and not user-defined. This will allow profiles to work more smoothly across dependencies. Plus we can always allow user-defined profiles or something later if we want.</li>
</ul>
</li>
<li>Profiles <strong>never change</strong> what code will do when it runs, but they can make code <strong>get more warnings or errors</strong>.
<ul>
<li>You can always copy-and-paste code between applications without fear that it will behave differently (though it may not compile).</li>
<li>You can always understand what Rust code will do without knowing the profile or context it is running in.</li>
</ul>
</li>
<li>Profiles let us do more implicit things to <strong>ease ergonomics</strong> without making Rust inapplicable for other use cases.
<ul>
<li>Looking at Aaron Turon&rsquo;s classic post introducing the lang team&rsquo;s <a href="https://blog.rust-lang.org/2017/03/02/lang-ergonomics.html">Rust 2018 ergonomics initiative</a>, profiles let users dial down the <strong>context dependence</strong> and <strong>applicability</strong> of any particular change.</li>
</ul>
</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Back in the early days of Rust, we debated a lot about what ought to be the rule for when clone was required. I think the current rule of &ldquo;memcpy is quiet, everything else is not&rdquo; is pretty decent, but it&rsquo;s not ideal in a few ways. For example, an O(1) clone operation like incrementing a refcount is <em>not</em> the same as an O(n) operation like cloning a vector, and yet they look the same. Moreover, memcpy&rsquo;ing a giant array (or <code>Future</code>) can be a real performance footgun (not to mention blowing up your stack), and yet we let you do that quite quietly. This is a good example of where profiles could help, I believe.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content></entry><entry><title type="html">Polonius revisited, part 2</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/09/29/polonius-part-2/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/09/29/polonius-part-2/</id><published>2023-09-29T00:00:00+00:00</published><updated>2023-09-29T06:43:09-04:00</updated><content type="html"><![CDATA[<p>In the <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/">previous Polonius post</a>, we formulated the original borrow checker in a Polonius-like style. In this post, we are going to explore how we can extend that formulation to be flow-sensitive. In so doing, we will enable the original Polonius goals, but also overcome some of its shortcomings. I believe this formulation is also more amenable to efficient implementation. As I&rsquo;ll cover at the end, though, I do find myself wondering if there&rsquo;s still more room for improvement.</p>
<h2 id="running-example">Running example</h2>
<p>We will be working from the same Rust example as the original post, but focusing especially on the mutation in the <code>false</code> branch<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">44</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">y</span><span class="p">;</span><span class="w"> </span><span class="c1">// Borrow `y` here (L1)
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">something</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">q</span><span class="p">;</span><span class="w">  </span><span class="c1">// Store borrow into `p`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// Mutate `y` on `false` branch
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">read_value</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w"> </span><span class="c1">// May refer to `x` or `y`
</span></span></span></code></pre></div><p>There is no reason to have an error on the <code>y += 1</code> in the <code>false</code> branch. There <em>is</em> a borrow of <code>y</code>, but on the <code>false</code> branch that borrow is only stored in <code>q</code>, and <code>q</code> will never be read again. So there cannot be undefined behavior (UB).</p>
<h2 id="existing-borrow-checker-flags-an-error">Existing borrow checker flags an error</h2>
<p>The existing borrow checker, however, is not that smart. It sees <code>read_value(p)</code> at the end and, because that line could potentially read <code>x</code> or <code>y</code>, it flags the <code>y += 1</code> as an error. When expressed this way, maybe you can have some sympathy for the poor borrow checker &ndash; it&rsquo;s not an unreasonable conclusion! But it&rsquo;s wrong.</p>
<p>The core issue of the existing borrow check stems from its use of a <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/#compute-the-subset-graph"><em>flow insensitive</em> subset graph</a>. This in turn is related to how it does the type check. In Polonius today, each variable has a single type and hence a single origin (e.g., <code>q: &amp;'1 u32</code>). This causes us to conflate all the possible loans that the variable may refer to throughout execution. And yet as we have seen, this information is actually flow dependent.</p>
<p>The borrow checker today is based on a pretty standard style of type checker applied to <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/#construct-the-mir">the MIR</a>. Essentially there is an <strong>environment</strong> that maps each variable to a type.</p>
<pre tabindex="0"><code>Env  = { X -&gt; Type }
Type = scalar | &amp; &#39;Y T | ...
</code></pre><p>Then we have type-checking <a href="https://en.wikipedia.org/wiki/Rule_of_inference">inference rules</a> that thread this same environment everywhere. Conceptually the structure of the rules is as follows:</p>
<pre tabindex="0"><code>construct Env from local variable declarations
Env |- each basic block type checks
--------------------------
the MIR type checks
</code></pre><p>Type-checking a <a href="https://rustc-dev-guide.rust-lang.org/mir/index.html?highlight=places%3A#key-mir-vocabulary">place</a> then uses this <code>Env</code>, bottoming out in an inference rule like:</p>
<pre tabindex="0"><code>Env[X] = T
-------------
Env |- X : T
</code></pre><h2 id="flow-sensitive-type-check">Flow-sensitive type check</h2>
<p>The key thing that makes the borrow checker <em>flow insensitive</em> is that we use the same environment at all points. What if instead we had one environment <em>per program point</em>:</p>
<pre tabindex="0"><code>EnvAt = { Point -&gt; Env }
</code></pre><p>Whenever we type check a statement at program point <code>A</code>, we will use <code>EnvAt[A]</code> as its environment. When program point <code>A</code> flows into point <code>B</code>, then the environment at <code>A</code> must be a <em>subenvironment</em> of the environment at <code>B</code>, which we write as <code>EnvAt[A] &lt;: EnvAt[B]</code>.</p>
<p>The subenvironment relationship <code>Env1 &lt;: Env2</code> holds if</p>
<ul>
<li>for each variable <code>X</code> in <code>Env2</code>:
<ul>
<li><code>X</code> appears in <code>Env1</code></li>
<li><code>Env1[X] &lt;: Env2[X]</code></li>
</ul>
</li>
</ul>
<p>There are two interesting things here. The first is that the <strong>set of variables can change over time</strong>. The idea is that once a variable goes dead, you can drop it from the environment. The second is that <strong>the type of the variable can change according to the subtyping rules</strong>.</p>
<p>You can think of flow-sensitive typing as if, for each program variable like <code>q</code>, we have a separate copy per program point, so <code>q@A</code> for point <code>A</code> and <code>q@B</code> for point <code>B</code>. When we flow from one point to another, we assign from <code>q@A</code> to <code>q@B</code>. Like any assignment, this would require the type of <code>q@A</code> to be a subtype of the type of <code>q@B</code>.</p>
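<p>The subenvironment check can be sketched in a few lines of Rust. This is a toy model rather than anything from rustc or Polonius: the <code>Type</code>, <code>Env</code>, and <code>Constraints</code> names are made up for illustration, types are limited to scalars and shared references, and origins are plain strings. Instead of deciding whether <code>'a : 'b</code> holds, the check records each outlives relation it would need:</p>

```rust
use std::collections::{BTreeMap, BTreeSet};

type Origin = String;

/// Toy version of `Type = scalar | & 'Y T`.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Type {
    Scalar,
    Ref(Origin, Box<Type>),
}

/// An environment maps each live variable to its type.
type Env = BTreeMap<String, Type>;

/// Required outlives relations, stored as `(a, b)` meaning `a : b`.
type Constraints = BTreeSet<(Origin, Origin)>;

/// Convenience constructor for `&'origin i32` (the scalar stands in for `i32`).
fn shared_ref(origin: &str) -> Type {
    Type::Ref(origin.to_string(), Box::new(Type::Scalar))
}

/// `sub <: sup`: the shapes must match, and each pair of origins
/// contributes one required outlives relation.
fn subtype(sub: &Type, sup: &Type, out: &mut Constraints) -> bool {
    match (sub, sup) {
        (Type::Scalar, Type::Scalar) => true,
        (Type::Ref(o1, t1), Type::Ref(o2, t2)) => {
            out.insert((o1.clone(), o2.clone()));
            subtype(t1, t2, out)
        }
        _ => false,
    }
}

/// `env1 <: env2`: every variable of `env2` must appear in `env1` with a
/// subtype. Variables only in `env1` (now dead) are simply dropped.
/// On failure, `out` may hold constraints collected before the mismatch.
fn subenv(env1: &Env, env2: &Env, out: &mut Constraints) -> bool {
    env2.iter().all(|(var, ty2)| match env1.get(var) {
        Some(ty1) => subtype(ty1, ty2, out),
        None => false,
    })
}
```

<p>With <code>env1 = { p: &amp;'a i32, q: &amp;'b i32 }</code> and <code>env2 = { q: &amp;'c i32 }</code>, the check succeeds and records the single relation <code>'b : 'c</code>; swapping the two environments fails, because <code>p</code> does not appear in the smaller one.</p>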
<h2 id="flow-sensitive-typing-in-our-example">Flow-sensitive typing in our example</h2>
<p>Let&rsquo;s see how this idea of a flow-sensitive type check plays out for our example. First, recall the MIR for our example from the <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/">previous post</a>:</p>
<pre class="mermaid">flowchart TD
  Intro --> BB1
  Intro["let mut x: i32\nlet mut y: i32\nlet mut p: &'0 i32\nlet mut q: &'1 i32"]
  BB1["<b><u>BB1:</u></b>\np = &x;\ny = y + 1;\nq = &y;\nif something goto BB2 else BB3"]
  BB1 --> BB2
  BB1 --> BB3
  BB2["<b><u>BB2</u></b>\np = q;\nx = x + 1;\n"]
  BB3["<b><u>BB3</u></b>\ny = y + 1;"]
  BB2 --> BB4;
  BB3 --> BB4;
  BB4["<b><u>BB4</u></b>\ny = y + 1;\nread_value(p);\n"]

  classDef default text-align:left,fill-opacity:0;
  </pre>
<h3 id="one-environment-per-program-point">One environment per program point</h3>
<p>In the original, flow-insensitive type check, the first thing we did was to create origin variables (<code>'0</code>, <code>'1</code>) for each of the origins that appear in our types. You can see those variables in the chart above. So we effectively had an environment like</p>
<pre tabindex="0"><code>Env_flow_insensitive = {
    p: &amp;&#39;0 i32,
    q: &amp;&#39;1 i32,
}
</code></pre><p>But now we are going to have one environment per program point. There is one program point in between each MIR statement. So the point <code>BB1_0</code> would be the entry to basic block <code>BB1</code>, and <code>BB1_1</code> would be after the first statement. So we have <code>Env_BB1_0</code>, <code>Env_BB1_1</code>, etc. We are going to create distinct origin variables for each of them:</p>
<pre tabindex="0"><code>Env_BB1_0 = {
    p: &amp;&#39;0_BB1_0 i32,
    q: &amp;&#39;1_BB1_0 i32,
}

Env_BB1_1 = {
    p: &amp;&#39;0_BB1_1 i32,
    q: &amp;&#39;1_BB1_1 i32,
}

...
</code></pre><h3 id="type-checking-the-edge-from-bb1-to-bb2">Type-checking the edge from BB1 to BB2</h3>
<p>Let&rsquo;s look at point <code>BB1_3</code>, which is the final line in BB1, which in MIR-speak is called the <em>terminator</em>. It is an <em>if</em> terminator (<code>if something goto BB2 else BB3</code>). To type-check it, we will take the environment on entry (<code>Env_BB1_3</code>) and require that it is a sub-environment of the environment on entry to the true branch (<code>Env_BB2_0</code>) and on entry to the false branch (<code>Env_BB3_0</code>).</p>
<p>Let&rsquo;s start with the <em>true branch</em>. Here we have the environment <code>Env_BB2_0</code>:</p>
<pre tabindex="0"><code>Env_BB2_0 = {
    q: &amp;&#39;1_BB2_0 i32,
}
</code></pre><p>You should notice something curious here &ndash; why is there no entry for <code>p</code>? The reason is that the variable <code>p</code> is <strong>dead</strong> on entry to BB2, because its current value is about to be overwritten. The type checker knows not to include dead variables in the environment.</p>
<p>This means that&hellip;</p>
<ul>
<li><code>Env_BB1_3 &lt;: Env_BB2_0</code> if the type of <code>q</code> at <code>BB1_3</code> is a subtype of the type of <code>q</code> at <code>BB2_0</code>&hellip;</li>
<li>&hellip;so <code>&amp;'1_BB1_3 i32 &lt;: &amp;'1_BB2_0 i32</code> must hold&hellip;</li>
<li>&hellip;so <code>'1_BB1_3 : '1_BB2_0</code> must hold.</li>
</ul>
<p>What we just found then is that, because of the edge from BB1 to BB2, the version of <code>'1</code> on exit from BB1 flows into <code>'1</code> on entry to BB2.</p>
<h3 id="type-checking-the-p--q-assignment">Type-checking the <code>p = q</code> assignment</h3>
<p>Let&rsquo;s look at the assignment <code>p = q</code>. This occurs in statement <code>BB2_0</code>. We just saw the environment before it:</p>
<pre tabindex="0"><code>Env_BB2_0 = {
    q: &amp;&#39;1_BB2_0 i32,
}
</code></pre><p>For an assignment, we take the type of the left-hand side (<code>p</code>) from the environment <em>after</em>, because that is what we are storing into. The environment after is <code>Env_BB2_1</code>:</p>
<pre tabindex="0"><code>Env_BB2_1 = {
    p: &amp;&#39;0_BB2_1 i32,
}
</code></pre><p>And so to type check the statement, we get that <code>&amp;'1_BB2_0 i32 &lt;: &amp;'0_BB2_1 i32</code>, or <code>'1_BB2_0 : '0_BB2_1</code>.</p>
<p>In addition to this relation from the assignment, we also have to make the environment <code>Env_BB2_0</code> be a sub-environment of the environment after, <code>Env_BB2_1</code>. But since the sets of live variables are disjoint in this case, that doesn&rsquo;t add anything to the picture.</p>
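<p>The assignment rule can be sketched in the same style. Again, this is a rough illustration under the assumption that all types are <code>&amp;'origin i32</code>; <code>Env</code> and <code>check_assign</code> are hypothetical names, not part of any real compiler API.</p>

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified environment: live variable name -> origin in its type.
type Env = BTreeMap<&'static str, &'static str>;

/// Type-check `lhs = rhs` between the environment before and the environment after.
/// Returns outlives pairs `(a, b)` meaning `a : b`.
fn check_assign(
    lhs: &'static str,
    rhs: &'static str,
    before: &Env,
    after: &Env,
) -> Vec<(&'static str, &'static str)> {
    let mut edges = Vec::new();

    // The right-hand side is read in the environment *before*;
    // the left-hand side is written in the environment *after*.
    if let (Some(rhs_origin), Some(lhs_origin)) = (before.get(rhs), after.get(lhs)) {
        edges.push((*rhs_origin, *lhs_origin));
    }

    // Every other variable that is live on both sides flows through unchanged.
    for (var, after_origin) in after {
        if *var != lhs {
            if let Some(before_origin) = before.get(var) {
                edges.push((*before_origin, *after_origin));
            }
        }
    }
    edges
}
```

<p>For <code>p = q</code> with <code>Env_BB2_0</code> before and <code>Env_BB2_1</code> after, the only pair produced is <code>'1_BB2_0 : '0_BB2_1</code>; the second loop adds nothing because the live sets are disjoint.</p>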
<h3 id="type-checking-the-edge-from-bb1-to-bb3">Type-checking the edge from BB1 to BB3</h3>
<p>As the final example, let&rsquo;s look at the <em>false</em> edge from BB1 to BB3. On entry to BB3, the variable <code>q</code> is dead but <code>p</code> is not, so the environment looks like</p>
<pre tabindex="0"><code>Env_BB3_0 = {
    p: &amp;&#39;0_BB3_0 i32,
}
</code></pre><p>Following a similar process to before, we conclude that <code>'0_BB1_3 : '0_BB3_0</code>.</p>
<h2 id="building-the-flow-sensitive-subset-graph">Building the flow-sensitive subset graph</h2>
<p>We are now starting to see how we can build a <strong>flow-sensitive</strong> version of the flow graph. Instead of having one node in the graph per origin variable, we now have one node in the graph per origin variable per program point, and we create an edge <code>N1 -&gt; N2</code> between two nodes if the type check requires that <code>N1 : N2</code>, just as before. Basically the only difference is that we have a lot more nodes.</p>
<p>Putting together what we saw thus far, we can construct a subset graph for this program like the following. I&rsquo;ve excluded nodes that correspond to dead variables &ndash; so for example there is no node <code>'1_BB1_0</code>, because <code>'1</code> appears in the type of the variable <code>q</code>, and <code>q</code> is dead at the start of the program.</p>
<pre class="mermaid">flowchart TD
    subgraph "'0"
        N0_BB1_0["'0_BB1_0"]
        N0_BB1_1["'0_BB1_1"]
        N0_BB1_2["'0_BB1_2"]
        N0_BB1_3["'0_BB1_3"]
        N0_BB2_1["'0_BB2_1"]
        N0_BB3_0["'0_BB3_0"]
        N0_BB4_0["'0_BB4_0"]
        N0_BB4_1["'0_BB4_1"]
    end

    subgraph "'1"
        N1_BB1_2["'1_BB1_2"]
        N1_BB1_3["'1_BB1_3"]
        N1_BB2_0["'1_BB2_0"]
    end
    
    subgraph "Loans"
        L0["{L0} (&x)"]
        L1["{L1} (&y)"]
    end
    
    L0 --> N0_BB1_0
    L1 --> N1_BB1_2
    
    N0_BB1_0 --> N0_BB1_1 --> N0_BB1_2 --> N0_BB1_3
    N0_BB1_3 --> N0_BB3_0
    N0_BB3_0 --> N0_BB4_0 --> N0_BB4_1
    N0_BB2_1 --> N0_BB4_0

    N1_BB1_2 --> N1_BB1_3
    N1_BB1_3 --> N1_BB2_0
    
    N1_BB2_0 --> N0_BB2_1
  </pre>
<p>Just as before, we can trace back from the node for a particular origin O to find all the loans contained within O. Only this time, the origin O also indicates a program point.</p>
<p>In particular, compare <code>'0_BB3_0</code> (the data reachable from <code>p</code> on the <code>false</code> branch of the if) to <code>'0_BB4_0</code> (the data reachable after the if finishes). We can see that in the first case, the origin can only reference <code>L0</code>, but afterwards, it could reference <code>L1</code>.</p>
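<p>Here is a small Rust sketch of that trace-back, using the edges from the diagram above copied by hand. The function name <code>loans_in</code> and the convention that loan nodes are the ones whose names start with <code>L</code> are assumptions made for this example.</p>

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Edges `(from, to)` of the subset graph, transcribed from the diagram above.
const EDGES: &[(&str, &str)] = &[
    ("L0", "'0_BB1_0"),
    ("L1", "'1_BB1_2"),
    ("'0_BB1_0", "'0_BB1_1"),
    ("'0_BB1_1", "'0_BB1_2"),
    ("'0_BB1_2", "'0_BB1_3"),
    ("'0_BB1_3", "'0_BB3_0"),
    ("'0_BB3_0", "'0_BB4_0"),
    ("'0_BB4_0", "'0_BB4_1"),
    ("'0_BB2_1", "'0_BB4_0"),
    ("'1_BB1_2", "'1_BB1_3"),
    ("'1_BB1_3", "'1_BB2_0"),
    ("'1_BB2_0", "'0_BB2_1"),
];

/// Walk the edges backwards from `origin` and collect every loan that can reach it.
/// Assumption for this sketch: loan nodes are exactly those named `L...`.
fn loans_in(origin: &str) -> BTreeSet<&'static str> {
    // Index predecessors so that we can walk the graph in reverse.
    let mut preds: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for &(from, to) in EDGES {
        preds.entry(to).or_default().push(from);
    }

    let mut loans = BTreeSet::new();
    let mut seen: BTreeSet<&str> = BTreeSet::new();
    let mut stack: Vec<&str> = vec![origin];
    while let Some(node) = stack.pop() {
        if !seen.insert(node) {
            continue; // already visited
        }
        for &pred in preds.get(node).into_iter().flatten() {
            if pred.starts_with('L') {
                loans.insert(pred);
            } else {
                stack.push(pred);
            }
        }
    }
    loans
}
```

<p>With these edges, <code>loans_in("'0_BB3_0")</code> contains only <code>L0</code>, while <code>loans_in("'0_BB4_0")</code> contains both <code>L0</code> and <code>L1</code>.</p>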
<h2 id="active-loans">Active loans</h2>
<p>Just as described in the <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/#active-loans">previous post</a>, to complete the analysis we compute the <em>active loans</em>. Active loans are defined in almost exactly the same way, but with one twist. A loan <code>L</code> is <em>active</em> at a program point <code>P</code> if there is a path from the borrow that created <code>L</code> to <code>P</code> where, for each point along the path&hellip;</p>
<ul>
<li>there is some live variable <strong>whose type at <code>P</code></strong> may reference the loan; and,</li>
<li>the place expression that was borrowed by <code>L</code> (here, <code>x</code>) is not reassigned at <code>P</code>.</li>
</ul>
<p>See the bolded test? We are now taking into account the fact that the type of the variable can change along the path. In particular, it may reference distinct origins.</p>
<h3 id="implementing-using-dataflow">Implementing using dataflow</h3>
<p>Just as in the <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/#implementing-using-dataflow">previous post</a>, we can compute active loans using dataflow. In particular, we <strong>gen</strong> a loan when it is issued, and we <strong>kill</strong> a loan <code>L</code> at a point <code>P</code> if (a) there are no live variables whose origins contain <code>L</code> or (b) the path borrowed by <code>L</code> is assigned at <code>P</code>.</p>
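<p>As a rough sketch, the transfer function for a single program point might look like the following. The <code>PointFacts</code> struct and its fields are hypothetical inputs that a real implementation would derive from liveness and from the subset graph; here they are just filled in by hand. I apply the kills before the gen so that a loan issued at <code>P</code> is not immediately killed, which is a choice made for this sketch.</p>

```rust
use std::collections::BTreeSet;

type Loan = &'static str;

/// Hand-supplied facts about one program point `P` (hypothetical shape).
struct PointFacts {
    /// The loan issued by a borrow at `P`, if any.
    issued: Option<Loan>,
    /// Loans contained in the origins of variables that are live at `P`.
    live_loans: BTreeSet<Loan>,
    /// Loans whose borrowed path is assigned at `P`.
    overwritten: BTreeSet<Loan>,
}

/// Compute the active loans after `P` from the active loans before `P`.
fn transfer(active_in: &BTreeSet<Loan>, facts: &PointFacts) -> BTreeSet<Loan> {
    // Kill: drop loans that (a) no live variable can reference any more,
    // or (b) whose borrowed path is assigned at this point.
    let mut out: BTreeSet<Loan> = active_in
        .iter()
        .copied()
        .filter(|l| facts.live_loans.contains(l) && !facts.overwritten.contains(l))
        .collect();
    // Gen: a loan becomes active where it is issued.
    out.extend(facts.issued);
    out
}
```

<p>For example, feeding in <code>{L0, L1}</code> with <code>live_loans = {L0}</code> and nothing issued or overwritten leaves just <code>{L0}</code>.</p>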
<h3 id="applying-this-to-our-running-example">Applying this to our running example</h3>
<p>When we apply this to our running example, the unnecessary error on the <code>false</code> branch of the <code>if</code> goes away. Let&rsquo;s walk through it.</p>
<h4 id="entry-block">Entry block</h4>
<p>In <code>BB1</code>, we gen <code>L0</code> and <code>L1</code> at their two borrow sites, respectively. As a result, the active loans on exit from <code>BB1</code> will be <code>{L0, L1}</code>:</p>
<pre class="mermaid">flowchart TD
  Start["..."]
  BB1["<b><u>BB1:</u></b>
       p = &x; <b><i>// Gen: L0</i></b>
       y = y + 1;
       q = &y; <b><i>// Gen: L1</i></b>
       if something goto BB2 else BB3
  "]
  BB2["..."]
  BB3["..."]
  BB4["..."]
 
  Start --> BB1
  BB1 --> BB2
  BB1 --> BB3
  BB2 --> BB4
  BB3 --> BB4
 
  classDef default text-align:left,fill:#ffffff;
  classDef highlight text-align:left,fill:yellow;
  class BB3 highlight
  </pre>
<h4 id="the-false-branch-of-the-if">The <code>false</code> branch of the <code>if</code></h4>
<p>On the <code>false</code> branch of the <code>if</code> (<code>BB3</code>), the only live reference is <code>p</code>, which will be used later on in <code>BB4</code>. In particular, <code>q</code> is dead.</p>
<p>In the flow <strong>insensitive</strong> version, when the borrow checker looked at the type of <code>p</code>, it was <code>p: &amp;'0 i32</code>, and <code>'0</code> had the value <code>{L0, L1}</code>, so the borrow checker concluded that both loans were active.</p>
<p>But in the flow <strong>sensitive</strong> version we are looking at now, the type of <code>p</code> on entry to <code>BB3</code> is <code>p: &amp;'0_BB3_0 i32</code>. And, consulting the subset graph shown earlier in this post, the value of <code>'0_BB3_0</code> is just <code>{L0}</code>. <strong>So there is a <em>kill</em> for <code>L1</code> on entry to the block.</strong> This means that the only active loan is <code>L0</code>, which borrows <code>x</code>. This in turn means that <code>y = y + 1</code> is not an error.</p>
<pre class="mermaid">flowchart TD
  Start["
    ...
  "]
  BB1["
      <b><u>BB1:</u></b>
      p = &x; <b><i>// Gen: L0</i></b>
      ...
      q = &y; <b><i>// Gen: L1</i></b>
      ...
  "]
  BB2["
      <b><u>BB2:</u></b>
      ...
  "]
  BB3["
      <b><u>BB3:</u></b>
      <b><i>// Kill `L1`</i></b> (no live references)
      <b><i>// Active loans: {L0}</i></b>
      y = y + 1;
  "]
  BB4["
      <b><u>BB4:</u></b>
      ...
      read_value(p); // later use of `p`
  "]
 
  Start --> BB1
  BB1 --> BB2
  BB1 --> BB3
  BB2 --> BB4
  BB3 --> BB4
 
  classDef default text-align:left,fill:#ffffff;
  classDef highlight text-align:left,fill:yellow;
  class BB3 highlight
  </pre>
<h2 id="the-role-of-invariance-vec-push-ref">The role of invariance: vec-push-ref</h2>
<p>I didn&rsquo;t highlight it before, but invariance plays a really interesting role in this analysis. Let&rsquo;s see another example, a simplified version of <a href="https://github.com/rust-lang/polonius/blob/0a754a9e1916c0e7d9ba23668ea33249c7a7b59e/inputs/vec-push-ref/vec-push-ref.rs#L5"><code>vec-push-ref</code></a> from polonius:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;v</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;p</span> <span class="nc">mut</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;vp</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span>: <span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* P0 */</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* P1 */</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w"> </span><span class="c1">// Loan L0
</span></span></span><span class="line"><span class="cl"><span class="cm">/* P2 */</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- Expect NO error here.
</span></span></span><span class="line"><span class="cl"><span class="cm">/* P3 */</span><span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="o">&amp;</span><span class="n">x</span><span class="p">);</span><span class="w"> </span><span class="c1">// Loan L1
</span></span></span><span class="line"><span class="cl"><span class="cm">/* P4 */</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- 💥 Expect an error here!
</span></span></span><span class="line"><span class="cl"><span class="cm">/* P5 */</span><span class="w"> </span><span class="nb">drop</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>What makes this interesting? We create a reference <code>p</code> at point <code>P1</code> that points at <code>v</code>. We then insert a borrow of <code>x</code> into the reference <code>p</code>. <strong>After that point, the reference <code>p</code> is dead, but the loan <code>L1</code> is still active</strong> &ndash; this is because it is also stored in <code>v</code>. This connection between <code>p</code> and <code>v</code> is what is key about this example.</p>
<p>The way that this connection is reflected in the type system is through <em><a href="https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)">variance</a></em>. In particular, a type <code>&amp;mut T</code> is <strong>invariant</strong> with respect to <code>T</code>. This means that when you assign one reference to another, the type that they reference must be exactly the same.</p>
<p>In terms of the subset graph, invariance works out to creating <strong>bidirectional edges</strong> between origins. Take a look at the resulting subset graph to see what I mean. To keep things simple, I am going to exclude nodes for <code>p</code>: the interesting origins here are <code>'v</code> (the data in the vector <code>v</code>) and <code>'vp</code> (the data in the vector referenced by <code>p</code> &ndash; which is also <code>v</code>).</p>
<pre class="mermaid">flowchart TD
    subgraph "Loans"
      L1["L1 (&x)"]
    end
    
    subgraph "'v"
      V_P0["'v_P0"]
      V_P1["'v_P1"]
      V_P2["'v_P2"]
      V_P3["'v_P3"]
      V_P4["'v_P4"]
      V_P5["'v_P5"]
    end

    subgraph "'vp"
      VP_P1["'vp_P1"]
      VP_P2["'vp_P2"]
      VP_P3["'vp_P3"]
    end

    V_P0 --> V_P1 --> V_P2 --> V_P3 --> V_P4 --> V_P5
    
    V_P1 <---> VP_P1
    VP_P1 <---> VP_P2 <---> VP_P3
        
    L1 --> VP_P3
  </pre>
<p>The key parts here are the bidirectional arrows between <code>'v_P1</code> and <code>'vp_P1</code> and between <code>'vp_P1</code> and <code>'vp_P3</code>. How did those come about?</p>
<ul>
<li>The first edge resulted from <code>p = &amp;mut v</code>. The type of <code>v</code> (at <code>P1</code>) is <code>Vec&lt;&amp;'v_P1 u32&gt;</code>, and that type had to be equal to the referent of <code>p</code> (<code>Vec&lt;&amp;'vp_P1 u32&gt;</code>). Since the types must be equal, that means <code>'v_P1: 'vp_P1</code> and vice versa, hence a bidirectional arrow.</li>
<li>The second edge resulted from the flow from <code>P1</code> to <code>P3</code>. The variable <code>p</code> is live across that edge, so its type before (<code>&amp;'p_P1 mut Vec&lt;&amp;'vp_P1 u32&gt;</code>) must be a subtype of its type after (<code>&amp;'p_P3 mut Vec&lt;&amp;'vp_P3 u32&gt;</code>). Because <code>&amp;mut</code> references are invariant with respect to their referent types, this implies that <code>'vp_P1</code> and <code>'vp_P3</code> must be equal.</li>
</ul>
<p>Putting it all together, we see that <code>L1</code> can reach <code>'v_P4</code> and <code>'v_P5</code>, even though it only flowed into an earlier point in the graph. That&rsquo;s cool! We will get the error we expect.</p>
<p>On the other hand, we can also see that there is some imprecision introduced through invariance. The loan <code>L1</code> is introduced at point <code>P3</code>, and yet it appears to flow from <code>'vp_P3</code> backwards in time to <code>'vp_P2</code>, <code>'vp_P1</code>, over to <code>'v_P1</code>, and downward from there. If we were <em>only</em> looking at the subset graph, then, we would conclude that both <code>x += 1</code> statements in this program are illegal, but in fact only the second one causes a problem.</p>
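<p>To see both effects at once, here is a small Rust sketch that rebuilds the graph above, adding one edge for a covariant relation and two for an invariant one, and then asks which nodes <code>L1</code> can reach. The names <code>Variance</code>, <code>relate</code>, <code>vec_push_ref_graph</code> and <code>reachable_from</code> are invented for this illustration.</p>

```rust
use std::collections::BTreeSet;

#[derive(Clone, Copy)]
enum Variance {
    Covariant,
    Invariant,
}

type Edge = (&'static str, &'static str);

/// Relate origin `a` to origin `b`. Covariance gives `a : b` (one edge);
/// invariance forces equality, so we add edges in both directions.
fn relate(edges: &mut Vec<Edge>, a: &'static str, b: &'static str, variance: Variance) {
    edges.push((a, b));
    if let Variance::Invariant = variance {
        edges.push((b, a));
    }
}

/// The subset graph from the diagram above, transcribed by hand.
fn vec_push_ref_graph() -> Vec<Edge> {
    let mut edges = Vec::new();
    // `v: Vec<&'v u32>` simply flows forward from point to point.
    let v = ["'v_P0", "'v_P1", "'v_P2", "'v_P3", "'v_P4", "'v_P5"];
    for pair in v.windows(2) {
        relate(&mut edges, pair[0], pair[1], Variance::Covariant);
    }
    // `p = &mut v` and the flow of `p` sit behind `&mut`, hence invariance.
    relate(&mut edges, "'v_P1", "'vp_P1", Variance::Invariant);
    relate(&mut edges, "'vp_P1", "'vp_P2", Variance::Invariant);
    relate(&mut edges, "'vp_P2", "'vp_P3", Variance::Invariant);
    // `p.push(&x)` stores loan L1 into the vector referenced by `p`.
    edges.push(("L1", "'vp_P3"));
    edges
}

/// All nodes reachable from `start` by following edges forwards.
fn reachable_from(edges: &[Edge], start: &'static str) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![start];
    while let Some(node) = stack.pop() {
        if seen.insert(node) {
            stack.extend(edges.iter().filter(|(from, _)| *from == node).map(|(_, to)| *to));
        }
    }
    seen
}
```

<p>Starting from <code>L1</code>, the reachable set includes <code>'v_P4</code> and <code>'v_P5</code> (good) but also <code>'v_P2</code> (the imprecision), while <code>'v_P0</code> stays out of reach.</p>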
<h3 id="active-loans-to-the-rescue-again">Active loans to the rescue (again)</h3>
<p>The imprecision we see here is very similar to the imprecision we saw in the original polonius. Effectively, invariance is taking away some of our flow sensitivity. Interestingly, the active loans portion of the analysis makes up for this, in the same way that it did in the <a href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/#active-loans">previous post</a>. In vec-push-ref, <code>L1</code> will only be generated at <code>P3</code>, so even though it can reach <code>'v_P2</code> via the subset graph, it is not considered active at <code>P2</code>. But once it is generated, it is not killed, even when <code>p</code> goes dead, because it can flow into <code>'v_P4</code>. Therefore we get the one error we expect.</p>
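<p>Here is a simplified Rust sketch of that walk over vec-push-ref. Everything in it is hand-filled and hypothetical: the <code>Point</code> struct, the <code>live_after</code> lists (which I derived by reading the subset graph, imprecision included), and the decision to only model direct writes and to skip the kill-on-reassignment rule. It is meant to show the shape of the argument, not a real borrow checker.</p>

```rust
use std::collections::BTreeSet;

/// One statement of the straight-line program (hypothetical, hand-filled).
struct Point {
    name: &'static str,
    /// The place this statement writes to directly, if any.
    writes: Option<&'static str>,
    /// `(loan, borrowed place)` if this statement issues a loan.
    issues: Option<(&'static str, &'static str)>,
    /// Loans that origins of variables live after this point may contain,
    /// according to the subset graph.
    live_after: &'static [&'static str],
}

fn vec_push_ref_points() -> Vec<Point> {
    vec![
        Point { name: "P0", writes: Some("v"), issues: None, live_after: &[] },
        // Thanks to invariance, the subset graph already puts L1 in `'v` and `'vp` here.
        Point { name: "P1", writes: None, issues: Some(("L0", "v")), live_after: &["L0", "L1"] },
        Point { name: "P2", writes: Some("x"), issues: None, live_after: &["L0", "L1"] },
        // After the push, `p` is dead (so L0 dies), but `v` keeps L1 alive.
        Point { name: "P3", writes: None, issues: Some(("L1", "x")), live_after: &["L1"] },
        Point { name: "P4", writes: Some("x"), issues: None, live_after: &["L1"] },
        Point { name: "P5", writes: None, issues: None, live_after: &[] },
    ]
}

/// Returns the names of the points whose write conflicts with an active loan.
fn errors(points: &[Point]) -> Vec<&'static str> {
    let mut active: BTreeSet<(&'static str, &'static str)> = BTreeSet::new();
    let mut errs = Vec::new();
    for point in points {
        // A write is an error if some loan active on entry borrows that place.
        if let Some(place) = point.writes {
            if active.iter().any(|(_, borrowed)| *borrowed == place) {
                errs.push(point.name);
            }
        }
        // Gen the loan issued here, then kill loans no live variable can reference.
        if let Some(loan) = point.issues {
            active.insert(loan);
        }
        active.retain(|(loan, _)| point.live_after.contains(loan));
    }
    errs
}
```

<p>Running <code>errors(&amp;vec_push_ref_points())</code> reports only <code>P4</code>: at <code>P2</code> the loan <code>L1</code> is in the live origins but has not been generated yet, so it is not active.</p>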
<h2 id="conclusion">Conclusion</h2>
<p>I&rsquo;m going to stop this post here. I&rsquo;ve described a version of polonius where we give variables distinct types at each program point and then relate those types together to create an improved subset graph. This graph increases the precision of the active loans analysis such that we don&rsquo;t get as many false errors, but it is still imprecise in some ways.</p>
<p>I think this formulation is interesting for a few reasons. First, the most expensive part of it is going to be the subset graph, which has a LOT of nodes and edges. But that can be compressed significantly with some simple heuristics. Moreover, the core operation we perform on that graph is reachability, and that can be implemented quite efficiently as well (do a <a href="https://en.wikipedia.org/wiki/Strongly_connected_component">strongly connected components</a> computation to reduce the graph to a tree, and then you can assign pre- and post-orderings and just compare indices). So I believe it could scale in practice.</p>
<p>I have worked through a few more classic examples, and I may come back to them in future posts; so far this analysis seems to get the results I expect. However, I would also like to go back and compare it more deeply to the original polonius, as well as to some of the formulations that came out of academia. There is still something odd about leaning on the dataflow check. I hope to talk about some of that in follow-up posts (or perhaps on Zulip or elsewhere with some of you readers!).</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>If this particular example feels artificial, that&rsquo;s because it is. But similar patterns cause more common errors, most notably <a href="https://rust-lang.github.io/rfcs/2094-nll.html#problem-case-3-conditional-control-flow-across-functions">Problem Case #3</a>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/polonius-revisited" term="polonius-revisited" label="Polonius revisited"/></entry><entry><title type="html">Empathy in open source: be gentle with each other</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/09/27/empathy-in-open-source/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/09/27/empathy-in-open-source/</id><published>2023-09-27T00:00:00+00:00</published><updated>2023-09-27T11:50:51-04:00</updated><content type="html"><![CDATA[<p>Over the last few weeks I had been preparing a talk on “Inclusive Mentoring: Mentoring Across Differences” with one of my good friends at Amazon. Unfortunately, that talk got canceled because I came down with COVID when we were supposed to be presenting. But the themes we covered in the talk have been rattling in my brain ever since, and suddenly I’m seeing them everywhere. One of the big ones was about <em>empathy</em> — what it is, what it isn’t, and how you can practice it. Now that I’m thinking about it, I see empathy so often in open source.</p>
<h2 id="what-empathy-is">What empathy is</h2>
<p>In her book Atlas of the Heart<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, Brené Brown defines <strong>empathy</strong> as</p>
<blockquote>
<p>an emotional skill set that allows us to understand what someone is experiencing and to reflect back that understanding.</p>
</blockquote>
<p>Empathy is not about being <strong>nice</strong> or making the other person feel good or even feel better<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. Being empathetic means <strong>understanding what the other person feels</strong> and then <strong>showing them that you understand</strong>.</p>
<p>Understanding what the other person feels doesn’t mean you have to feel the same way. It also doesn’t mean you have to agree with them, or feel that they are “justified” in those feelings. In fact, as I’ll explain in a second, strong feelings and emotion are <em>by design</em> limited in their viewpoints — they are always showing us something, and showing us something real, but they are never showing us the full picture.</p>
<p>Usually we feel multiple, seemingly contradictory things, which can leave everything feeling like a big muddle. The goal, from what I can see, is to be able to pull those multiple feelings apart, understand them, and then &ndash; from a balanced place &ndash; decide how we are going to react to them. Hopefully in real time. Pretty damn hard, in my experience, but something we can get better at.</p>
<h2 id="people-are-not-any-one-thing">People are not any one thing</h2>
<p>Some time back, Aaron Turon introduced me to <a href="https://en.wikipedia.org/wiki/Internal_Family_Systems_Model">Internal Family Systems</a> through the book <a href="https://www.amazon.com/Self-Therapy-Step-Step-Cutting-Edge-Psychotherapy/dp/0984392777">Self Therapy</a><sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. It’s really had a big influence on how I think about things. The super short version of IFS is “<a href="https://en.wikipedia.org/wiki/Inside_Out_(2015_film)">Inside Out</a> is real”. We are each composites of a number of independent <em>parts</em> which capture pieces of our personality. When we are feeling balanced and whole, we are switching between these parts all the time in reaction to what is going on around us.</p>
<p>But <em>sometimes</em> things go awry. Sometimes, one part will get very alarmed about what it perceives to be happening, and it will take complete control of you. This is called <strong>blending</strong>. While you are blended, the part is doing its best to help you in the ways that it knows: that might mean making you super anxious, so that you identify risks, or it might mean making you yell at people, so that they will go away and you don’t have to risk them letting you down. No matter which part you are blended with in the moment, though, you lose access to your whole self and your full range of capabilities. Even though the part will help you solve the immediate problem, it often does so in ways that create other problems down the line.</p>
<p>This concept of parts has really helped me to understand myself, but it has also helped me to understand what previously seemed like contradictory behavior in other people. The reason that people sometimes act in extreme ways, ways that seem so different from the person I know at other times, is because they’re <em>blended</em> — <strong>they’re not the person I know at that time</strong>, they’re just <strong>one part</strong> of that person. And probably a part that has helped them through some tough times in the past.</p>
<h2 id="empathy-as-holding-space">Empathy as “holding space”</h2>
<p>I’ve often heard the term ‘emotional labor’ and, to be honest, I had a hard time connecting to it. But in <a href="https://www.penguinrandomhouse.com/books/608716/love-and-rage-by-lama-rod-owens/">Lama Rod Owens’s “Love and Rage”</a>, he talks about emotional labor in terms of <em>“the work we do to help people process their emotions”</em> and, in particular, gives this list of examples:</p>
<blockquote>
<p>This includes actively listening to others, asking how people are feeling, checking in with them, letting them vent in front of you, and not reacting to someone when they are being rude or disrespectful.</p>
</blockquote>
<p>Now this list struck a chord with me. To me, the hardest part of empathy is <em>holding space</em> — letting someone have a reaction or a feeling without turning away. When people are reacting in an extreme way — whether it’s venting or being rude — it makes us uncomfortable, and often we’ll try to make them stop. This can take many forms. It could mean changing the topic, dismissing it (“get over it”, “I’m sure they didn’t mean it like that”), or trying to fix it (“what you need to do is…”, “let’s go kick their ass!”) For me, when people do that, it makes me feel unseen and kind of upset. Even if the other person is getting righteously angry on my behalf, I feel like suddenly the situation isn’t about <em>me</em> and how <em>I</em> want to think about things.</p>
<h2 id="what-does-all-this-have-to-do-with-github">What does all this have to do with GitHub?</h2>
<p>At this point you might be wondering “what do obscure therapeutic processes and Buddhist philosophy have to do with GitHub issue threads?” Take another look at Lama Rod Owens’s list of examples of emotional labor, especially the last one:</p>
<blockquote>
<p>not reacting to someone when they are being rude or disrespectful</p>
</blockquote>
<p>To be frank, being an open-source maintainer means taking a lot of shit<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. In his insightful, and widely discussed, talk <a href="https://www.youtube.com/watch?v=o_4EX4dPppA">“The Hard Parts of Open Source”</a>, Evan Czaplicki identified many of the “failure modes” of open source comment threads. One very memorable pattern is the “Why don’t you just…” comment, where somebody chimes in with an obvious alternative, as if you hadn’t thought of it. There is also my personal favorite, what I’ll call the “double agent” comment, where someone seems to feel that your goal is actually to ruin the project you’ve put so much effort into, and so comes in hot and angry.</p>
<p>My goal is always to respond to comments as if the commenter had been constructive and polite, or was my best friend. I don’t always achieve my goal, especially in forums where I have to respond quickly<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>. But I honestly do try. One technique is to find the key points in their comment and rephrase them, to be sure you understand, and then give your take. When I do that, I usually learn things — even when I initially thought somebody was just a blowhard, there is often a strong point underlying their argument, and it may lead me to change course if I listen to it. If nothing else, it’s always good to know the counterarguments in depth.</p>
<h2 id="empathy-as-a-maintainer">Empathy as a maintainer</h2>
<p>And this brings us to the role of <em>empathy</em> as an open-source maintainer. As I said, these days, I see it popping up everywhere. To start, the idea of responding to someone’s comment, even one that feels rude, by identifying the key points they are trying to make feels to me like empathy, even if those points are often highly technical<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>. Fundamentally, empathy is all about <em>understanding the other person</em> and <em>letting them know you understand</em>, and that is what I am trying to do here.</p>
<p>But empathy comes into play in a more meta way as well. Trying to think how somebody feels — and <em>why</em> they might be feeling that way — can really help me to step back from feeling angry or injured by the tone of a comment and instead to refocus on what they are trying to communicate to me. Aaron Turon wrote a truly insightful and honest series of posts about his perspective on this called <a href="http://aturon.github.io/tech/2018/05/25/listening-part-1/">Listening and Trust</a>. In <a href="http://aturon.github.io/tech/2018/06/18/listening-part-3/">part 3</a> of that series, he identified some of the key contributors to comment threads that go off the rails, what he called “momentum, urgency, and fatigue”. It’s worth reading that post, or reading it again if you already have. It’s a masterpiece of looking past the immediate reactions to understand better what’s going on, both within others and yourself.</p>
<h2 id="empathy-when-we-surprise-people">Empathy when we surprise people</h2>
<p>When Apple is working on a new product, they keep it absolutely top secret until they are ready &ndash; and then they tell the world, hoping for a big splash. This works for them. In <em>open source</em>, though, it&rsquo;s an anti-pattern. The last thing you want to do is to surprise people &ndash; that&rsquo;s a great way to trigger those parts we were talking about.</p>
<p>The difference, I think, is that open source projects are community projects &ndash; everybody feels some degree of ownership. That&rsquo;s a big part of what makes open source so great! But, at the same time, when somebody starts messing with <em>your stuff</em>, that&rsquo;s sure to get you upset. Paul Ford wrote an article identifying this feeling, which he called <a href="https://www.ftrain.com/wwic">“Why wasn’t I consulted?”</a>.</p>
<p>I find the phrase &ldquo;Why wasn&rsquo;t I consulted?&rdquo; a pretty useful reminder for how it feels, but to be honest I&rsquo;ve never liked it. The problem is that to me it feels condescending. But I totally get the way that people feel. It doesn&rsquo;t always mean I think they&rsquo;re right, or even justified in that feeling. But I get it, and I respect it. Heck, I feel it too!<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<p>My personal creed these days is to be as open and transparent as I can with what I am doing and why. It&rsquo;s part of why I love having this blog, since it lets me post up early ideas while I am still thinking about them. This also means I can start to get input and feedback. I don&rsquo;t always listen to that feedback. A lot of times, people hate the things I am talking about, and they&rsquo;re not shy about saying so &ndash; I try to take that as a signal, but just one signal of many. If people are upset, I&rsquo;m probably doing something wrong, but it may not be the idea, it may be the way I am talking about it, or some particular aspect of it.</p>
<h2 id="empathy-when-we-design-our-project-processes">Empathy when we design our project processes</h2>
<p>As I prepared this blog post, I re-read Aaron&rsquo;s <a href="http://aturon.github.io/tech/2018/05/25/listening-part-1/">Listening and Trust</a>, and I was struck again by how many insights he had there. One of them was that by applying empathy, and looking at our processes through the lens of how it <strong>feels</strong> to be a participant &ndash; what concerns get triggered &ndash; we can make changes so that everyone feels more included and less worn down. The key part here is that we have to look not only at how things feel for ourselves, but also how they feel for the participants &ndash; and for those who are not yet participating! There&rsquo;s a huge swath of people who do not join in on Rust discussions, and I think we&rsquo;re really missing out. This kind of design isn&rsquo;t easy, but it&rsquo;s crucial.</p>
<h2 id="empathy-as-a-contributor">Empathy as a contributor</h2>
<p>I’ve focused a lot on the role of empathy as an open-source maintainer. But empathy absolutely comes into play as a contributor. There&rsquo;s a lot said on how people behave differently when commenting on the internet versus in person, and how the tone of a text comment can so easily be misread.</p>
<p>The fact is, when you contribute to an open-source project, the maintainers are going to come up short. They&rsquo;re going to overlook things. They may not respond promptly to your comment or PR &ndash; they&rsquo;re likely going to hide their head in the sand because they&rsquo;re overwhelmed.<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup> Or they may snap at you.</p>
<p>So what do you do when people let you down? I think the best is to speak for your feelings, but to do so in an empathetic way. If you are feeling hurt, don&rsquo;t leave an angry comment. This doesn&rsquo;t mean you have to silence your feelings &ndash; but just own them as your feelings. &ldquo;Hey, I get that you are busy. Still, when I open a PR and nobody answers, it feels like this contribution is not wanted. If that&rsquo;s true, just tell me, I can go elsewhere.&rdquo;<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup></p>
<p>I bet some of you, when you read that last comment, were like &ldquo;oh, heck no&rdquo;. It&rsquo;s scary to talk about how you feel. It takes a lot of courage. But it&rsquo;s effective &ndash; and it can help the maintainer get unblended from whatever part they are in and think about things from your perspective. Maybe they will answer, &ldquo;No, I really want this change, but I am just super busy right now, can you give me 3 months?&rdquo; Or maybe they will say, &ldquo;Actually, you&rsquo;re right, I am not sure this is the right direction. I&rsquo;m sorry that I didn&rsquo;t say so before you put so much work into it.&rdquo; Or <strong>maybe</strong> they won&rsquo;t answer at all, because they&rsquo;re hiding from the github issue thread &ndash; but when they come back and read it much later, they&rsquo;ll reflect on how that made you feel, and try to be more prompt the next time. <strong>Either way, you know that you spoke up for yourself, but did so in a way that they can hear.</strong></p>
<h2 id="empathy-for-ourselves-and-our-own-parts">Empathy for ourselves and our own parts</h2>
<p>This brings me to my final topic. No matter what role we play in an open-source project, or in life, the most important person to have empathy for is <strong>yourself</strong>. Ironically, this is often the hardest. We usually have very high expectations for ourselves, and we don’t cut ourselves much slack. As a maintainer, this might manifest as feeling you have to respond to every comment or task, and feeling bad when you don’t keep up. As a contributor, it might be feeling crappy when people point out bugs in your PR. No matter who we are, it might be kicking ourselves and feeling shame when we overreact in a comment.</p>
<p>In my view, shame is basically never good. Of course I make mistakes, and I regret them. But when I feel <em>shame</em> about them, I am actually focusing inward, focusing on my own mistakes instead of focusing on how I can make it up to the other person or resolve my predicament. It doesn’t actually do anyone any good.</p>
<p>I think there are different ways to experience shame. I know how I experience it. It feels like one of my parts is kicking the crap out of itself. And that really hurts. It hurts so bad that it tends to cause other parts to rise up to try and make it stop. That might be by getting angry at others — “it’s <em>their</em> fault we screwed up!” — or, more common for me, it might be by feeling depressed, withdrawing, and perhaps focusing on some technical project that can make me feel good about myself.</p>
<p>In their classic and highly recommended blog post, <a href="https://blog.burntsushi.net/foss/">My FOSS Story</a>, Andrew Gallant talked about how they deal with an overflowing inbox full of issues, feature requests, and comments:</p>
<blockquote>
<p>The solution that I’ve adopted for this phenomenon is one that I’ve used extremely effectively in my personal life: establish boundaries. Courteously but firmly setting boundaries is one of those magical life hacks that pays dividends once you figure out how to do it. If you don’t know how to do it, then I’m not sure exactly how to learn how to do it unfortunately. But setting boundaries lets you focus on what’s important to you and not what’s important to others.</p>
</blockquote>
<p>It can be really easy to overextend yourself in an open-source project. This could mean, as a maintainer, feeling you have to respond to every comment, fix every bug. Overextending yourself in turn is a great way to become blended with a part, and start acting out some of those older, defensive strategies you have for dealing with stress.</p>
<p>Also, I&rsquo;ve got bad news. You are going to screw up in some way. It might be overextending yourself<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup>. It might be responding poorly. Or pushing for an idea that turns out to be very deeply wrong. When you do that, you have a choice. You can feel shame, or you can extend compassion and empathy to yourself. <strong>It&rsquo;s ok.</strong> Mistakes happen. They are how we learn.</p>
<p>Once you&rsquo;ve gotten past the shame, and realized that making mistakes doesn&rsquo;t make you bad, you can start to think about repair. OK, so you messed up. What can you do about it? Maybe nothing is needed. Or maybe you need to go and undo some of what you did. Or maybe you have to go and tell some people that what they are doing is not ok. Either way, compassion and empathy for yourself is how you will get there.</p>
<h2 id="on-the-limits-of-my-own-experience">On the limits of my own experience</h2>
<p>Before I go, I want to take a moment to acknowledge the limits of my own experience. I am a cis, white male, and I think in this post it shows. When I encounter antipathy, it tends to be targeted at individual things I have done or ideas I am espousing. At most, it might come about because of the role I am playing. I don’t encounter conscious or unconscious bias on the basis of my race, gender, sexual orientation, or any other such thing. This gives me a lot of luxury. For example, for the most part, I can take a rude comment and I can usually find an underlying technical point to focus on in my response. This is not true for all maintainers. In writing this post, I thought a lot about how the dynamics of open source seem almost perfectly designed<sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup> to be exclusive to people who are not from groups deemed “high status” by society.</p>
<p>Rust has a pretty uneven track record here. There are projects that do better. Improving our processes to take better account of how they feel for participants is definitely a necessary step, along with other things. One thing I am convinced of: the more people that get involved in Rust &ndash; <strong>and especially the more distinct backgrounds and experiences those people have</strong> &ndash; the better it becomes. Rust is always trying to achieve 6 (previously) impossible things before breakfast, and we need all the ideas we can get.<sup id="fnref:12"><a href="#fn:12" class="footnote-ref" role="doc-noteref">12</a></sup></p>
<h2 id="be-gentle-with-each-other">Be gentle with each other</h2>
<p>If I could have just one wish, it would be this bastardized quote from the great Bill and Ted:</p>
<p><img src="https://smallcultfollowing.com/babysteps//assets/2023-09-27-bill-and-ted.jpg" alt="Be gentle with each other"></p>
<p>We’ve talked a lot about empathy and how it comes into play, but really, in my mind, it all boils down to being <em>gentle</em> when somebody slips up. Note that being gentle doesn&rsquo;t mean you can&rsquo;t also be real and authentic about how you felt. We talked earlier about <a href="https://en.wikipedia.org/wiki/I-message">I-messages</a> &ndash; by speaking plainly about how somebody made you feel, you can deliver a message that is both gentle and yet incredibly powerful. To me, the key is not to make assumptions about what&rsquo;s going on for other people. You can never know their motivations. You can make guesses, but they&rsquo;re always based on incomplete information.</p>
<p>Does this mean I think we should all go running around saying &ldquo;when you do X, I felt like you were trying to ruin the project?&rdquo; Well, not really, although I think that would be an improvement. Even better though would be to stop and think, <em>wait, why would they be trying to ruin the project?</em> Instead of assuming what other people are doing, tell them how they are making you feel. Maybe say, &ldquo;when you do X, I feel like you are saying my use case doesn&rsquo;t matter&rdquo;. Or, better yet, say &ldquo;when you do X, I will no longer be able to do Y, which I find really valuable&rdquo;. I predict this is much more likely to lead to a constructive discussion.</p>
<p>It&rsquo;s important to remember that the choice of words can have a strong impact, too. For me, words like <em>ruin</em> or phrases like <em>dumpster fire</em>, <em>shitshow</em>, etc, can be quite triggering all on their own. I&rsquo;m not always consistent on this. I&rsquo;ve noticed that I sometimes use strong, colorful language because I think it&rsquo;s funny. But I&rsquo;ve also noticed that when other people do it, I can get pretty upset (&ldquo;I know that code is not the best, but it&rsquo;s worked for the last 3 years dang it.&rdquo;).</p>
<p>I think you can boil all of this down to <strong>be precise and accurate when you communicate</strong>. It&rsquo;s not accurate to say &ldquo;you are trying to ruin the project&rdquo;. You can&rsquo;t know that. It is accurate to talk about what you feel and why you feel it. It&rsquo;s also not accurate to say something is a dumpster fire, but it is accurate to call out shortcomings and concerns.</p>
<p>Anyway, I&rsquo;m done giving advice. I&rsquo;m no expert here, just one more person trying to learn and do the best I can. What I can say with confidence is that the things I&rsquo;m talking about here have really helped me personally in approaching difficult situations in my life, and I hope that they&rsquo;ll help some of you too!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I bought this book when it first came out, read a bit of it, and then thought of it more as a reference — a great book for getting clear, distinguished definitions that help to elucidate the subtleties of human emotion. But when I revisited it to prepare for this talk, I was surprised to find it was much more “front-to-back” readable than I thought, and carried a lot of hidden wisdom.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Though I think people feeling good and better is always a <em>consequence</em> of having encountered someone else empathetic.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>By none other than <a href="https://en.wikipedia.org/wiki/Jay_Earley">Jay Earley</a>, inventor of the <a href="https://en.wikipedia.org/wiki/Earley_parser">Earley parser</a>! This guy is my hero.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>And I say this as a cis white man, which means I don’t even have to deal with shit resulting from people’s conscious or unconscious bias.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>This is one reason I don’t personally like fast moving threads and discussions, and I often limit the venues where I will participate. I need a bit of time to sit with things and process them.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>It’s worth highlighting that the key points they are trying to make are <em>not always</em> technical. Re-reading Aaron Turon’s <a href="http://aturon.github.io/tech/2018/05/25/listening-part-1/">Listening and Trust</a> posts for this series, I was reminded of glaebhoerl’s <a href="https://www.reddit.com/r/rust/comments/2qmeeq/rfc_rename_intuint_to_intxuintx/cn8ugag/">pivotal comment</a> that articulated very well their frustration at the Rust maintainers’ sense of entitlement and superiority, and the reasons for it. As glaebhoerl identified so clearly, it wasn’t so much the technical decision that was the problem — though I think on balance it was the wrong call, it was a debatable point — as the manner of engagement.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Like when Disney canceled Owl House without even <strong>asking me</strong>. WHAT GIVES DISNEY.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>For example, I&rsquo;ve been ignoring messages in the Salsa Zulip for a bit, and feeling bad about how I just don&rsquo;t have the time to focus on that project right now. I&rsquo;m sorry y&rsquo;all and I do still expect to come back to Salsa 2022 (which, alas, will clearly not ship in 2022 &ndash; ah well, I knew the risks when I put a year into the name).&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>This structure, &ldquo;when you do X, I feel Y&rdquo;, is called an <a href="https://en.wikipedia.org/wiki/I-message">I-message</a>. It&rsquo;s surprisingly hard to do it right. It&rsquo;s easy to make something that sounds like an I-message, but isn&rsquo;t. For example, &ldquo;When you closed this PR without commenting, it showed me I am not welcome here&rdquo; is very different from &ldquo;When you closed this PR without commenting, it made me feel like I am not welcome here&rdquo;. The first one is not an I-message. It&rsquo;s telling someone else how they feel. The second one is telling someone else how they made <em>you</em> feel. There&rsquo;s a very good chance those two statements would land quite differently.&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>Unless, perhaps, you are Andrew Gallant, who from what I can see is one supremely well balanced individual. :)&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p>This of course is what people mean when they talk about systemic racism, or at least how I understand it: it’s not that open source or most other things were designed intentionally to reinforce bias, but the structures of our society are set up so that if you don’t <em>actively work to counteract bias</em>, you wind up playing into it.&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:12">
<p>I always think of Jessica Lord&rsquo;s inspirational blog post <a href="http://jlord.us/blog/osos-talk.html">Privilege, Community, and Open source</a>, which sadly appears to be offline, but you can <a href="https://web.archive.org/web/20220201181735/https://jlord.us/blog/osos-talk.html">read it on the web-archive</a>.&#160;<a href="#fnref:12" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/pinned/yes" term="yes" label="yes"/></entry><entry><title type="html">Polonius revisited, part 1</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/</id><published>2023-09-22T00:00:00+00:00</published><updated>2023-09-22T16:32:40-04:00</updated><content type="html"><![CDATA[<p><a href="https://github.com/lqd/">lqd</a> has been doing awesome work driving progress on <a href="https://github.com/rust-lang/polonius/">polonius</a>. He&rsquo;s authoring an <a href="https://github.com/rust-lang/blog.rust-lang.org/pull/1147">update for Inside Rust</a>, but the TL;DR is that, with his latest PR, we&rsquo;ve reimplemented the traditional Rust borrow checker in a more polonius-like style. We are working to iron out the last few performance hiccups and thinking about replacing the existing borrow checker with this new re-implementation, which is effectively a no-op from a user&rsquo;s perspective (including from a performance perspective). This blog post walks through that work, describing how the new analysis works at a high-level. I plan to write some follow-up posts diving into how we can extend this analysis to be more precise (while hopefully remaining efficient).</p>
<h2 id="what-is-polonius">What is Polonius?</h2>
<p>Polonius is one of those long-running projects that are finally starting to move again. From an end user&rsquo;s perspective, the key goal is that we want to accept functions like so-called <a href="https://rust-lang.github.io/rfcs/2094-nll.html#problem-case-3-conditional-control-flow-across-functions">Problem Case #3</a>, which was originally a goal of NLL but eventually cut from the deliverable. From my perspective, though, I&rsquo;m most excited about Polonius as a stepping stone towards an analysis that can support internal references and self borrows.</p>
<p>Polonius began its life as an <a href="http://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/">alternative formulation of the borrow checker rules</a> defined in Datalog. The key idea is to switch the way we do the analysis. Whereas NLL thinks of <code>'r</code> as a <strong>lifetime</strong> consisting of a set of program points, in polonius, we call <code>'r</code> an <strong>origin</strong> containing a set of <strong>loans</strong>. In other words, rather than tracking the parts of the program where a reference will be used, we track the places that the reference may have come from. For deeper coverage of Polonius, I recommend <a href="https://www.youtube.com/watch?v=_agDeiWek8w">my talk at Rust Belt Rust from (egads) 2019</a> (<a href="https://nikomatsakis.github.io/rust-belt-rust-2019/">slides here</a>).</p>
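<p>One way to picture the switch is to compare the data each analysis would attach to <code>'r</code>. The sketch below is only an illustration: <code>Point</code>, <code>Loan</code>, <code>Lifetime</code>, and <code>Origin</code> are hypothetical stand-ins, not the representations used by rustc or polonius, and the values in <code>main</code> are arbitrary.</p>

```rust
use std::collections::BTreeSet;

/// Hypothetical program point: a basic block plus a statement index within it.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Point {
    block: usize,
    statement: usize,
}

/// Hypothetical loan id: one per borrow expression in the program.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Loan(usize);

/// NLL-style view: `'r` is the set of points where the reference may be used.
struct Lifetime {
    points: BTreeSet<Point>,
}

/// Polonius-style view: `'r` is the set of loans the reference may have come from.
struct Origin {
    loans: BTreeSet<Loan>,
}

impl Lifetime {
    fn contains(&self, point: Point) -> bool {
        self.points.contains(&point)
    }
}

impl Origin {
    fn may_come_from(&self, loan: Loan) -> bool {
        self.loans.contains(&loan)
    }
}

fn main() {
    // Arbitrary illustrative values, not the result of any real analysis.
    let lifetime = Lifetime {
        points: BTreeSet::from([
            Point { block: 0, statement: 1 },
            Point { block: 0, statement: 2 },
        ]),
    };
    let origin = Origin {
        loans: BTreeSet::from([Loan(0)]),
    };

    println!("{}", lifetime.contains(Point { block: 0, statement: 2 }));
    println!("{}", origin.may_come_from(Loan(1)));
}
```

<p>The two structs hold different kinds of facts: the first can only answer questions about <em>where</em> a reference is used, while the second can answer questions about <em>which borrow</em> produced it.</p>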
<h2 id="running-example">Running example</h2>
<p>In order to explain the analyses, I&rsquo;m going to use this running example. One thing you&rsquo;ll note is that the lifetimes/origins in the example are written as numbers, like <code>'0</code> and <code>'1</code>. This is because, when we start the borrow check, we haven&rsquo;t computed lifetimes/origins yet &ndash; that is the job of the borrow check! So, we first go and create synthetic <em>inference variables</em> (just like an algebraic variable) to use as placeholders throughout the computation. Once we&rsquo;re all done, we&rsquo;ll have actual values we could plug in for them &ndash; in the case of polonius, those values are sets of loans (each loan is a <code>&amp;</code> expression, more or less, that appears somewhere in the program).</p>
<p>Here is our example. It contains two loans, L0 and L1, of <code>x</code> and <code>y</code> respectively. There are also four assignments:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">44</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">x</span><span class="p">;</span><span class="w"> </span><span class="c1">// Loan L0, borrowing `x`
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">                  </span><span class="c1">// (A) Mutate `y` -- is this ok?
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">y</span><span class="p">;</span><span class="w"> </span><span class="c1">// Loan L1, borrowing `y`
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">something</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">q</span><span class="p">;</span><span class="w">               </span><span class="c1">// `p` now points at `y`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">              </span><span class="c1">// (B) Mutate `x` -- is this ok?
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">              </span><span class="c1">// (C) Mutate `y` -- is this ok?
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">                  </span><span class="c1">// (D) Mutate `y` -- is this ok?
</span></span></span><span class="line"><span class="cl"><span class="n">read_value</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">           </span><span class="c1">// use `p` again here
</span></span></span></code></pre></div><p><a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=8bfd49b522670b37a4e5b0d00bcc6209">Today in Rust</a>, we get two errors (C and D). If you were to run this example with <a href="https://github.com/RalfJung/minirust">MiniRust</a>, though, you would find that only D can actually cause Undefined Behavior. At point C, we mutate <code>y</code>, but the only variable that references <code>y</code> is <code>q</code>, and it will never be used again. The borrow checker today reports an error because it&rsquo;s overly conservative. Polonius, on the other hand, gets that case correct.</p>
<table>
  <thead>
      <tr>
          <th>Location</th>
          <th>Existing borrow checker</th>
          <th>Polonius</th>
          <th>MiniRust</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>A</td>
          <td>&#x2714;&#xfe0f;</td>
          <td>&#x2714;&#xfe0f;</td>
          <td>OK</td>
      </tr>
      <tr>
          <td>B</td>
          <td>&#x2714;&#xfe0f;</td>
          <td>&#x2714;&#xfe0f;</td>
          <td>OK</td>
      </tr>
      <tr>
          <td>C</td>
          <td>&#x274c;</td>
          <td>&#x2714;&#xfe0f;</td>
          <td>OK</td>
      </tr>
      <tr>
          <td>D</td>
          <td>&#x274c;</td>
          <td>&#x274c;</td>
          <td>Can cause UB, if <code>true</code> branch is taken</td>
      </tr>
  </tbody>
</table>
<h2 id="reformulating-the-existing-borrow-check-à-la-polonius">Reformulating the existing borrow check à la polonius</h2>
<p>This blog post is going to describe the existing borrow checker, but reformulated in a polonius-like style. This will make it easier to see how polonius is different in the next post. The idea of doing this reformulation came about when implementing the borrow checker in <a href="https://github.com/rust-lang/a-mir-formality">a-mir-formality</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. At first, we weren&rsquo;t sure if it was equivalent, but <a href="https://github.com/lqd/">lqd</a> verified it experimentally by testing it against the rustc test suite, where it matches the behavior 100% (<a href="https://github.com/lqd/">lqd</a> is also going to test against crater).</p>
<p>The borrow check analysis is a combination of three things, which we will cover in turn:</p>
<pre class="mermaid">flowchart TD
  ConstructMIR --> LiveVariable
  ConstructMIR --> OutlivesGraph
  LiveVariable --> LiveLoanDataflow
  OutlivesGraph --> LiveLoanDataflow
  ConstructMIR["Construct the MIR"]
  LiveVariable["Compute the live variables"]
  OutlivesGraph["Compute the outlives graph"]
  LiveLoanDataflow["Compute the active loans at a given point"]
  </pre>
<h2 id="construct-the-mir">Construct the MIR</h2>
<p>The borrow checker these days operates on <a href="https://rustc-dev-guide.rust-lang.org/mir/index.html">MIR</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. MIR is basically a very simplified version of Rust where each statement is broken down into rudimentary statements. Our program is already so simple that the MIR basically looks the same as the original program, except for the fact that it&rsquo;s structured into a <a href="https://en.wikipedia.org/wiki/Control-flow_graph">control-flow graph</a>. The MIR would look roughly like this (simplified):</p>
<pre class="mermaid">flowchart TD
  Intro --> BB1
  Intro["let mut x: u32<br>let mut y: u32<br>let mut p: &'0 u32<br>let mut q: &'1 u32"]
  BB1["p = &x;<br>y = y + 1;<br>q = &y;<br>if something goto BB2 else BB3"]
  BB1 --> BB2
  BB1 --> BB3
  BB2["p = q;<br>x = x + 1;<br>"]
  BB3["y = y + 1;"]
  BB2 --> BB4;
  BB3 --> BB4;
  BB4["y = y + 1;<br>read_value(p);<br>"]

  classDef default text-align:left,fill-opacity:0;
  </pre>
<p>Note that MIR begins with the types for all the variables; control-flow constructs like <code>if</code> get transformed into graph nodes called <em>basic blocks</em>, where each basic block contains only simple, straightline statements.</p>
<h2 id="compute-the-live-origins">Compute the live origins</h2>
<p>The first step is to compute the set of <em>live origins</em> at each program point. This works precisely as <a href="https://rust-lang.github.io/rfcs/2094-nll.html#liveness">described in the NLL RFC</a>. It is very similar to the classic liveness computation that is taught in a typical compiler course, but with one key difference. We are not computing live <em>variables</em> but rather live <em>origins</em> &ndash; the idea is roughly that the <em>live origins</em> are equal to the origins that appear in the types of the live <em>variables</em>:</p>
<pre tabindex="0"><code>LiveOrigins(P) = { O | O appears in the type of some variable V live at P }
</code></pre><p>The actual computation is slightly more subtle: when variables go out of scope, we take into account the rules from <a href="https://rust-lang.github.io/rfcs/1327-dropck-param-eyepatch.html">RFC #1327</a> to figure out precisely which of their origins may be accessed by the <code>Drop</code> impl. But I&rsquo;m going to skip over that in this post.</p>
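<p>To make the formula concrete, here is a minimal sketch that runs a classic backward liveness pass over the running example&rsquo;s control-flow graph and then maps live variables to the origins in their types. The <code>Stmt</code> and <code>Block</code> types and all function names are hypothetical simplifications, not rustc&rsquo;s data structures. A borrow such as <code>&amp;x</code> is treated as a plain read, liveness is reported only on entry to each basic block, and the drop-related subtleties mentioned above are ignored.</p>

```rust
use std::collections::BTreeSet;

// Hypothetical, heavily simplified statement: the variable it writes (if any)
// and the variables it reads.
struct Stmt {
    def: Option<&'static str>,
    uses: Vec<&'static str>,
}

struct Block {
    stmts: Vec<Stmt>,
    succs: Vec<usize>,
}

fn stmt(def: Option<&'static str>, uses: &[&'static str]) -> Stmt {
    Stmt { def, uses: uses.to_vec() }
}

// The running example: index 0 = BB1, 1 = BB2, 2 = BB3, 3 = BB4.
fn example() -> Vec<Block> {
    vec![
        Block {
            stmts: vec![
                stmt(Some("p"), &["x"]), // p = &x (assumption: a borrow counts as a read)
                stmt(Some("y"), &["y"]), // y = y + 1
                stmt(Some("q"), &["y"]), // q = &y
            ],
            succs: vec![1, 2],
        },
        Block {
            stmts: vec![
                stmt(Some("p"), &["q"]), // p = q
                stmt(Some("x"), &["x"]), // x = x + 1
            ],
            succs: vec![3],
        },
        Block {
            stmts: vec![stmt(Some("y"), &["y"])], // y = y + 1
            succs: vec![3],
        },
        Block {
            stmts: vec![
                stmt(Some("y"), &["y"]), // y = y + 1
                stmt(None, &["p"]),      // read_value(p)
            ],
            succs: vec![],
        },
    ]
}

// Classic backward liveness: the variables live on entry to each block.
fn live_on_entry(blocks: &[Block]) -> Vec<BTreeSet<&'static str>> {
    let mut live_in = vec![BTreeSet::new(); blocks.len()];
    let mut changed = true;
    while changed {
        changed = false;
        for (i, block) in blocks.iter().enumerate().rev() {
            // Live on exit = union of what is live on entry to each successor.
            let mut live: BTreeSet<&'static str> = BTreeSet::new();
            for &s in &block.succs {
                live.extend(live_in[s].iter().copied());
            }
            // Walk statements backwards: kill the definition, then add the uses.
            for st in block.stmts.iter().rev() {
                if let Some(d) = st.def {
                    live.remove(d);
                }
                live.extend(st.uses.iter().copied());
            }
            if live != live_in[i] {
                live_in[i] = live;
                changed = true;
            }
        }
    }
    live_in
}

// Origins appearing in each variable's type: `p: &'0 u32`, `q: &'1 u32`.
fn origins_of(var: &str) -> &'static [u32] {
    match var {
        "p" => &[0],
        "q" => &[1],
        _ => &[],
    }
}

// LiveOrigins(P), where P ranges over the entry point of each block.
fn live_origins(blocks: &[Block]) -> Vec<BTreeSet<u32>> {
    live_on_entry(blocks)
        .iter()
        .map(|vars| vars.iter().flat_map(|v| origins_of(v).iter().copied()).collect())
        .collect()
}

fn main() {
    for (i, origins) in live_origins(&example()).iter().enumerate() {
        println!("entry of BB{}: live origins {:?}", i + 1, origins);
    }
}
```

<p>Under these assumptions, the sketch reports <code>'1</code> as the only live origin on entry to BB2, and <code>'0</code> as the only live origin on entry to BB3 and BB4. Taking the union of BB2 and BB3 gives both origins, which is the set that is live just before the branch.</p>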
<p>Going back to our example, I&rsquo;ve added comments indicating which origins would be live at various points of interest:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">44</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">y</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Here both `p` and `q` may be used later,
</span></span></span><span class="line"><span class="cl"><span class="c1">// and so the origins in their types (`&#39;0` and `&#39;1`)
</span></span></span><span class="line"><span class="cl"><span class="c1">// are live.
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">something</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Here, only the variable `q` is live.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `p` is dead because its current value is about
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// to be overwritten. As a result, the only live
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// origin is `&#39;1`, since it appears in `q`&#39;s type.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">q</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Here, only the variable `p` is live
</span></span></span><span class="line"><span class="cl"><span class="c1">// (`q` is never used again),
</span></span></span><span class="line"><span class="cl"><span class="c1">// and so only the origin `&#39;0` is live.
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">read_value</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><h2 id="compute-the-subset-graph">Compute the subset graph</h2>
<p>The next step in borrow checking is to run a type check across the MIR. MIR is effectively a very simplified form of Rust where statements are heavily desugared and there is a lot less type inference. There is, however, a lot of <em>lifetime</em> inference &ndash; basically when NLL starts <strong>every</strong> lifetime is an inference variable.</p>
<p>For example, consider the <code>p = q</code> assignment in our running example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">y</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">something</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">q</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- this assignment
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span></code></pre></div><p>To type check this, we take the type of <code>q</code> (<code>&amp;'1 u32</code>) and require that it is a subtype of the type of <code>p</code> (<code>&amp;'0 u32</code>):</p>
<pre tabindex="0"><code>&amp;&#39;1 u32 &lt;: &amp;&#39;0 u32
</code></pre><p>As described in <a href="https://rust-lang.github.io/rfcs/2094-nll.html?highlight=nll#subtyping">the NLL RFC</a>, this subtyping relation holds if <code>'1: '0</code>. In NLL, we called this an <em>outlives relation</em>. But in polonius, because <code>'0</code> and <code>'1</code> are origins representing <em>sets of loans</em>, we call it a <strong>subset relation</strong>. In other words, <code>'1: '0</code> could be written <code>'1 ⊆ '0</code>, and it means that whatever loans <code>'1</code> may be referencing, <code>'0</code> may reference too. Whatever final values we wind up with for <code>'0</code> and <code>'1</code> will have to reflect this constraint.</p>
<p>We can view these subset relations as a graph, where <code>'1: '0</code> means there is an edge <code>'1 --⊆--&gt; '0</code>. In the borrow checker today, this graph is <strong>flow insensitive</strong>, meaning that there is one graph for the entire function. As a result, we are going to get a graph like this:</p>
<pre class="mermaid">flowchart LR
  L0 --"⊆"--> Tick0
  L1 --"⊆"--> Tick1
  Tick1 --"⊆"--> Tick0
  
  L0["{L0}"]
  L1["{L1}"]
  Tick0["'0"]
  Tick1["'1"]

  classDef default text-align:left,fill:#ffffff;
  </pre>
<p>You can see that <code>'0</code>, the origin that appears in <code>p</code>, can be reached from both loan <code>L0</code> and loan <code>L1</code>. In short, it could store a reference to <em>either</em> <code>x</code> or <code>y</code>. In contrast, <code>'1</code> (<code>q</code>) can only be reached from <code>L1</code>, and hence can only store a reference to <code>y</code>.</p>
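<p>To make the reachability reading concrete, here is a minimal sketch that computes which loans an origin may contain by walking the subset graph backwards. The <code>Node</code> enum and the <code>loans_in_origin</code> and <code>example_edges</code> functions are hypothetical names invented for this illustration; they are not how the compiler represents any of this.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::collections::{BTreeMap, BTreeSet};

/// A node in the subset graph: either a loan or an origin.
/// (Hypothetical encoding, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Node {
    Loan(u32),   // L0, L1, ...
    Origin(u32), // &#39;0, &#39;1, ...
}

/// Returns every loan from which `origin` is reachable along subset edges.
fn loans_in_origin(edges: &amp;[(Node, Node)], origin: Node) -&gt; BTreeSet&lt;u32&gt; {
    // Reverse adjacency: for each node, the nodes with an edge into it.
    let mut preds: BTreeMap&lt;Node, Vec&lt;Node&gt;&gt; = BTreeMap::new();
    for &amp;(from, to) in edges {
        preds.entry(to).or_default().push(from);
    }

    // Depth-first walk over transitive predecessors, collecting loans.
    let mut seen = BTreeSet::new();
    let mut loans = BTreeSet::new();
    let mut stack = vec![origin];
    while let Some(node) = stack.pop() {
        if !seen.insert(node) {
            continue; // already visited
        }
        if let Node::Loan(l) = node {
            loans.insert(l);
        }
        if let Some(ps) = preds.get(&amp;node) {
            stack.extend(ps.iter().copied());
        }
    }
    loans
}

/// The subset graph from the running example.
fn example_edges() -&gt; Vec&lt;(Node, Node)&gt; {
    vec![
        (Node::Loan(0), Node::Origin(0)),   // {L0} ⊆ &#39;0, from `p = &amp;x`
        (Node::Loan(1), Node::Origin(1)),   // {L1} ⊆ &#39;1, from `q = &amp;y`
        (Node::Origin(1), Node::Origin(0)), // &#39;1 ⊆ &#39;0, from `p = q`
    ]
}
</code></pre></div><p>With these edges, asking for <code>'0</code> yields both loans, while asking for <code>'1</code> yields only <code>L1</code>.</p>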
<h2 id="active-loans">Active loans</h2>
<p>There is one last piece to complete the borrow checker, which is computing the <strong>active loans</strong>. Active loans determine the errors that get reported. The idea is that, if there is an active loan of a place <code>a.b.c</code>, then accessing <code>a.b.c</code> may be an error, depending on the kind of loan/access.</p>
<p>Active loans build on the liveness analysis as well as the subset graph. The basic idea is that a loan is active at a point P if there is a path from the borrow that created the loan to P where, for each point along the path&hellip;</p>
<ul>
<li>there is some live variable that may reference the loan
<ul>
<li>i.e., there is a live origin <code>O</code> at <code>P</code> where <code>L ∈ O</code>. <code>L ∈ O</code> means that there is a path in the subset graph from the loan <code>L</code> to the origin <code>O</code>.</li>
</ul>
</li>
<li>the place expression that was borrowed (here, <code>x</code>) is not reassigned
<ul>
<li>this isn&rsquo;t relevant to the current example, but the idea is that you can borrow the referent of a pointer, e.g., <code>&amp;mut *tmp</code>. If you then later change <code>tmp</code> to point somewhere else, then the old loan of <code>*tmp</code> is no longer relevant, because it&rsquo;s pointing to different data than the current value of <code>*tmp</code>.</li>
</ul>
</li>
</ul>
<h3 id="implementing-using-dataflow">Implementing using dataflow</h3>
<p>In the compiler, we implement the above as a <a href="https://en.wikipedia.org/wiki/Data-flow_analysis"><strong>dataflow analysis</strong></a>. The value at any given point is the set of active loans. We <em>gen</em> a loan (add it to the value) when it is issued, and we <em>kill</em> a loan at a point P if either (1) the loan is not a member of the origins of any live variables; or (2) the path borrowed by the loan is overwritten.</p>
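<p>The gen/kill rules above can be sketched as a transfer function over sets of loans. This is only a rough model: <code>PointFacts</code>, <code>step</code>, and <code>join</code> are hypothetical names, the per-point facts are assumed to have been computed already (from liveness and the subset graph), and the sketch assumes kills are applied before gens at each point.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::collections::BTreeSet;

/// Hypothetical per-point facts, named for this sketch only.
struct PointFacts {
    /// Loans issued by a borrow at this point.
    issued: BTreeSet&lt;u32&gt;,
    /// Loans contained in the origin of some variable that is live here.
    in_live_origins: BTreeSet&lt;u32&gt;,
    /// Loans whose borrowed path is overwritten at this point.
    overwritten: BTreeSet&lt;u32&gt;,
}

/// One step of the active-loans dataflow: apply the kills, then the gens.
fn step(active_in: &amp;BTreeSet&lt;u32&gt;, facts: &amp;PointFacts) -&gt; BTreeSet&lt;u32&gt; {
    let mut out: BTreeSet&lt;u32&gt; = active_in
        .iter()
        .copied()
        // Kill (1): no live variable has an origin containing the loan.
        .filter(|l| facts.in_live_origins.contains(l))
        // Kill (2): the path borrowed by the loan is overwritten.
        .filter(|l| !facts.overwritten.contains(l))
        .collect();
    // Gen: loans issued at this point become active.
    out.extend(facts.issued.iter().copied());
    out
}

/// At a control-flow merge, a loan is active if it is active along
/// any incoming path, so the join is set union.
fn join(a: &amp;BTreeSet&lt;u32&gt;, b: &amp;BTreeSet&lt;u32&gt;) -&gt; BTreeSet&lt;u32&gt; {
    a.union(b).copied().collect()
}
</code></pre></div><p>Union is the natural join here because the definition asks whether there is <em>some</em> path from the borrow to the point along which the loan stays active.</p>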
<h4 id="active-loans-on-entry-to-the-function">Active loans on entry to the function</h4>
<p>Let&rsquo;s walk through our running example. To start, look at the first basic block:</p>
<pre class="mermaid">flowchart TD
  Start["..."]
  BB1["<b><i>// Active loans: {}</i></b>
       p = &x; <b><i>// Gen: L0</i></b> -- loan issued
       <b><i>// Active loans: {L0}</i></b>
       y = y + 1;
       q = &y; <b><i>// Gen L1</i></b> -- loan issued
       <b><i>// Active loans {L0, L1}</i></b>
       if something goto BB2 else BB3
  "]
  BB2["..."]
  BB3["..."]
  BB4["..."]

  Start --> BB1
  BB1 --> BB2
  BB1 --> BB3
  BB2 --> BB4
  BB3 --> BB4

  classDef default text-align:left,fill:#ffffff;
  classDef highlight text-align:left,fill:yellow;
  class BB1 highlight
  </pre>
<p>This block is the start of the function, so the set of active loans starts out as empty. But then we encounter two borrow expressions, <code>&amp;x</code> and <code>&amp;y</code>, and each of them is the <strong>gen</strong> site for a loan (<code>L0</code> and <code>L1</code> respectively). By the end of the block, the active loan set is <code>{L0, L1}</code>.</p>
<h4 id="active-loans-on-the-true-branch">Active loans on the &ldquo;true&rdquo; branch</h4>
<p>The next interesting point is the &ldquo;true&rdquo; branch of the if:</p>
<pre class="mermaid">flowchart TD
  Start["
    ...
    let mut q: &'1 u32;
    ...
  "]
  BB1["..."]
  BB2["
      <b><i>// Kill L0 -- not part of any live origin</i></b>
      <b><i>// Active loans {L1}</i></b>
      p = q;
      x = x + 1;
  "]
  BB3["..."]
  BB4["..."]
 
  Start --> BB1
  BB1 --> BB2
  BB1 --> BB3
  BB2 --> BB4
  BB3 --> BB4
 
  classDef default text-align:left,fill:#ffffff;
  classDef highlight text-align:left,fill:yellow;
  class BB2 highlight
  </pre>
<p>The interesting thing here is that, on entering the block, there is a <strong>kill</strong> of <code>L0</code>. This is because the only live reference on entry to the block is <code>q</code>, as <code>p</code> is about to be overwritten. As the type of <code>q</code> is <code>&amp;'1 u32</code>, this means that the live origins on entry to the block are <code>{'1}</code>. Looking at the subset graph we saw earlier&hellip;</p>
<pre class="mermaid">flowchart LR
  L0 --"⊆"--> Tick0
  L1 --"⊆"--> Tick1
  Tick1 --"⊆"--> Tick0
  
  L0["{L0}"]
  L1["{L1}"]
  Tick0["'0"]
  Tick1["'1"]

  class L1 trace
  class Tick1 trace

  classDef default text-align:left,fill:#ffffff;
  classDef trace text-align:left,fill:yellow;
  </pre>
<p>&hellip;we can trace the transitive predecessors of <code>'1</code> to see that it contains only <code>{L1}</code> (I&rsquo;ve highlighted those predecessors in yellow in the graph). This means that there is no live variable whose origin contains <code>L0</code>, so we add a kill for <code>L0</code>.</p>
<h4 id="no-error-on-true-branch">No error on <code>true</code> branch</h4>
<p>Because the only active loan is L1, and L1 borrowed <code>y</code>, the <code>x = x + 1</code> statement is accepted. This is a really interesting result! It illustrates how the idea of <em>active loans</em> restores some flow sensitivity to the borrow check.</p>
<p>Why is it so interesting? Well, consider this. At this point, the variable <code>p</code> is live. The variable <code>p</code> contains the origin <code>'0</code>, and if we look at the subset graph, <code>'0</code> contains both L0 and L1. So, based purely on the subset graph, we would expect modifying <code>x</code> to be an error, since it is borrowed by L0. And yet it&rsquo;s not!</p>
<p>This is because the <em>active loan</em> analysis noticed that, although in theory <code>p</code> may reference <code>L0</code>, it definitely doesn&rsquo;t at this point.</p>
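<p>The accept/reject decision for <code>x = x + 1</code> can be sketched as a check of one access against the set of active loans. This is a deliberately simplified model: places are whole variables rather than paths like <code>a.b.c</code>, and <code>LoanKind</code>, <code>Access</code>, <code>ActiveLoan</code>, and <code>access_is_error</code> are hypothetical names, not compiler APIs.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LoanKind {
    Shared, // created by `&amp;place`
    Mut,    // created by `&amp;mut place`
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Access {
    Read,
    Write,
}

/// An active loan, reduced to its kind and the variable it borrowed.
struct ActiveLoan {
    kind: LoanKind,
    borrowed: &amp;&#39;static str,
}

/// An access conflicts with an active loan of the same place unless
/// the loan is shared and the access is only a read.
fn access_is_error(active: &amp;[ActiveLoan], place: &amp;str, access: Access) -&gt; bool {
    active.iter().any(|loan| {
        loan.borrowed == place
            &amp;&amp; !(loan.kind == LoanKind::Shared &amp;&amp; access == Access::Read)
    })
}
</code></pre></div><p>On the &ldquo;true&rdquo; branch the only active loan is a shared loan of <code>y</code>, so a write to <code>x</code> passes this check, while a write to <code>y</code> would not.</p>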
<h4 id="active-loans-on-the-false-branch">Active loans on the <code>false</code> branch</h4>
<p>In contrast, if we look at the &ldquo;false&rdquo; branch of the if:</p>
<pre class="mermaid">flowchart TD
  Start["
    ...
    let mut p: &'0 u32;
    ...
  "]
  BB1["..."]
  BB2["..."]
  BB3["
      <b><i>// Active loans {L0, L1}</i></b>
      y = y + 1;
  "]
  BB4["..."]
 
  Start --> BB1
  BB1 --> BB2
  BB1 --> BB3
  BB2 --> BB4
  BB3 --> BB4
 
  classDef default text-align:left,fill:#ffffff;
  classDef highlight text-align:left,fill:yellow;
  class BB3 highlight
  </pre>
<h4 id="false-error-on-the-false-branch">False error on the <code>false</code> branch</h4>
<p>This path is also interesting: there is only one live variable, <code>p</code>. If you trace the code by hand, you can see that <code>p</code> could only refer to L0 (<code>x</code>) here. And yet the analysis concludes that we have two active loans: L0 and L1. This is because it is looking at the subset graph to determine what <code>p</code> may reference, and that graph is <em>flow insensitive</em>. So, since <code>p</code> may reference L1 at <em>some</em> point in the program, and we haven&rsquo;t yet seen references to L1 go completely dead, we assume that <code>p</code> may reference L1 here. This leads to a false error being reported when the user does <code>y = y + 1</code>.</p>
<h4 id="active-loans-on-the-final-block">Active loans on the final block</h4>
<p>Now let&rsquo;s look at the final block:</p>
<pre class="mermaid">flowchart TD
  Start["
    ...
    let mut p: &'0 u32;
    ...
  "]
  BB1["..."]
  BB2["..."]
  BB3["..."]
  BB4["
        <b><i>// Active loans {L0, L1}</i></b>
        y = y + 1;
        read_value(p);
  "]
 
  Start --> BB1
  BB1 --> BB2
  BB1 --> BB3
  BB2 --> BB4
  BB3 --> BB4
 
  classDef default text-align:left,fill:#ffffff;
  classDef highlight text-align:left,fill:yellow;
  class BB4 highlight
  </pre>
<p>At this point, there is one live variable (<code>p</code>) and hence one live origin (<code>'0</code>); the subset graph tells us that <code>p</code> may reference both <code>L0</code> and <code>L1</code>, so the set of active loans is <code>{L0, L1}</code>. This is correct: depending on which path we took, <code>p</code> may refer to either <code>L0</code> or <code>L1</code>, and hence we flag a (correct) error when the user attempts to modify <code>y</code>.</p>
<h2 id="kills-for-reassignment">Kills for reassignment</h2>
<p>Our running example showed one reason that loans get killed: when there are no more live references to them. This most commonly happens when you create a short-lived reference and then stop using it. But there is another way to get a kill, which happens from reassignment. Consider this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">List</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&gt;&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_all</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">List</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">data</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">n</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">next</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">n</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">break</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I&rsquo;m not going to walk through how this is borrow checked in detail here, but let me just point out what makes it interesting. In this loop, the code first borrows from <code>p</code> and then assigns that result to <code>p</code>. This means that, if you just look at the <em>subset graph</em>, on the next iteration around the loop, there would be an active loan of <code>p</code>. However, <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=076383dd805aa00844ac679b6fc8c2cb">this code compiles</a> &ndash; how does that work? The answer is that when we do <code>p = n</code>, we are mutating <code>p</code>, which means that, when we borrow from <code>p</code> on the next iteration, we are actually borrowing from a <em>different node</em> than we borrowed from in the first iteration. So everything is fine. The reason the borrow checker is able to conclude this is that it kills the loan of <code>p.next</code> when it sees that <code>p</code> is assigned to. <a href="https://rust-lang.github.io/rfcs/2094-nll.html#borrow-checker-phase-1-computing-loans-in-scope">This is discussed in the NLL RFC in more detail.</a></p>
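<p>As a rough illustration of a kill on reassignment, the sketch below drops every active loan whose borrowed place is reached through the variable being assigned. It assumes a simple prefix rule over a made-up <code>Place</code> encoding; <code>Place</code>, <code>Loan</code>, and <code>kill_on_assignment</code> are hypothetical names, and the rule the compiler actually applies may be more conservative.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">/// A place such as `p`, `*p`, or `(*p).next`, encoded as a base variable
/// followed by projections. (Hypothetical encoding for this sketch.)
#[derive(Clone, Debug, PartialEq, Eq)]
struct Place {
    base: &amp;&#39;static str,
    projections: Vec&lt;&amp;&#39;static str&gt;,
}

impl Place {
    fn new(base: &amp;&#39;static str, projections: &amp;[&amp;&#39;static str]) -&gt; Self {
        Place { base, projections: projections.to_vec() }
    }

    /// True if `other` is `self` or is reached by projecting from `self`.
    fn is_prefix_of(&amp;self, other: &amp;Place) -&gt; bool {
        self.base == other.base &amp;&amp; other.projections.starts_with(&amp;self.projections)
    }
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct Loan {
    name: &amp;&#39;static str,
    borrowed: Place,
}

/// Removes every active loan whose borrowed place goes through `assigned`.
fn kill_on_assignment(active: &amp;mut Vec&lt;Loan&gt;, assigned: &amp;Place) {
    active.retain(|loan| !assigned.is_prefix_of(&amp;loan.borrowed));
}
</code></pre></div><p>In <code>print_all</code>, the loan created by <code>&amp;mut p.next</code> borrows a place reached through <code>p</code>, so under this rule the assignment <code>p = n</code> removes it from the active set before the next iteration.</p>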
<h2 id="conclusion">Conclusion</h2>
<p>That brings us to the end of part 1! In this post, we covered how you can describe the existing borrow check in a more polonius-like style. We also uncovered an interesting quirk in how the borrow checker is formulated. It uses a <em>location insensitive</em> alias analysis (the subset graph) but complements that with a dataflow propagation to track active loans. Together, this makes it more expressive. This wasn&rsquo;t, however, the original plan with NLL. Originally, the subset graph was meant to be flow sensitive. Extending the subset graph to be flow sensitive is basically the heart of polonius. I&rsquo;ve got some thoughts on how we might do that and I&rsquo;ll be getting to that in later posts. I do want to say in passing though that doing all of this framing is also making me wonder &ndash; is it really necessary to combine a type check <em>and</em> the dataflow check? Can we frame the borrow checker (probably the more precise variants we&rsquo;ll be getting to in future posts) in a more unified way? Not sure yet!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>You won&rsquo;t find this code in the current version of a-mir-formality; it&rsquo;s since been rewritten a few times and the current version hasn&rsquo;t caught up yet.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The origin of the MIR is actually an interesting story. As documented in <a href="https://rust-lang.github.io/rfcs/1211-mir.html">RFC #1211</a>,&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/polonius-revisited" term="polonius-revisited" label="Polonius revisited"/></entry><entry><title type="html">New Layout, and now using Hugo!</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/09/19/new-layout/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/09/19/new-layout/</id><published>2023-09-19T00:00:00+00:00</published><updated>2023-09-19T05:30:44-04:00</updated><content type="html"><![CDATA[<p>Some time ago I wrote about how I wanted to improve how my blog works. I recently got a spate of emails about this &ndash; thanks to all of you! And a particularly big thank you to Luna Razzaghipour, who went ahead and ported the blog over to use Hugo, cleaning up the layout a bit and preserving URLs. It&rsquo;s much appreciated! If you notice something amiss (like a link that doesn&rsquo;t work anymore), I&rsquo;d be very grateful if you opened an issue on the <a href="https://github.com/nikomatsakis/babysteps">babysteps github repo</a>! Thanks!</p>
<p>Hugo seems fast so far, although I will say that figuring out how to use Hugo modules (so that I could preserve the atom feed&hellip;) was rather confusing! But it&rsquo;s all working now (I think!). I&rsquo;m still interested in playing around more with the layout, but overall I think it looks good, and I&rsquo;m happy to have code coloring on the snippets. Hopefully it renders better on mobile too.</p>]]></content></entry><entry><title type="html">Stability without stressing the !@#! out</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/09/18/stability-without-stressing-the-out/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/09/18/stability-without-stressing-the-out/</id><published>2023-09-18T00:00:00+00:00</published><updated>2023-09-18T11:04:00-04:00</updated><content type="html"><![CDATA[<p>One of Rust&rsquo;s core principles is <a href="https://doc.rust-lang.org/book/appendix-07-nightly-rust.html#stability-without-stagnation">&ldquo;stability without stagnation&rdquo;</a>. This is embodied by our use of a <a href="https://doc.rust-lang.org/book/appendix-07-nightly-rust.html#choo-choo-release-channels-and-riding-the-trains">&ldquo;release train&rdquo;</a> model, in which we issue a new release every 6 weeks. Release trains make releasing a new release a &ldquo;non-event&rdquo;. Feature-based releases, in contrast, are super stressful! Since they occur infrequently, people try to cram everything into that release, which inevitably makes the release late. In contrast, with a release train, it&rsquo;s not so important to make any particular release &ndash; if you miss one deadline, you can always catch the next one six weeks later. <em>That&rsquo;s the theory, anyway:</em> but I&rsquo;ve observed that, in practice, stabilizing a feature in Rust can still be a pretty stressful process. And the more important the feature, the more stress. 
This blog post talks over my theories as to why this is the case, and how we can tweak our processes (and our habits) to address it.</p>
<h2 id="tldr">TL;DR</h2>
<p>I like to write, and sometimes my posts get long. Sorry! Let me summarize for you:</p>
<ul>
<li>Stabilization decisions in Rust are stressful because they are conflating two distinct things: &ldquo;does the feature do what it is supposed to do&rdquo; (semver-stability) and &ldquo;is the feature ready for general use for all its intended use cases&rdquo; (recommended-for-use).</li>
<li>Open source works incrementally: to complete the polish we want, we need users to encounter the feature; incremental milestones help us do that.</li>
<li>Nightly is effective for getting some kinds of feedback, but not all; in particular, production users and library authors often won&rsquo;t touch it. This gives us less data to work with when making high stakes decisions, and it&rsquo;s a problem.</li>
<li>We should modify our process to distinguish four phases:
<ul>
<li><strong>Accepted RFC</strong> &ndash; The team agrees idea is worth implementing, but it may yet be changed or removed. Use at your own risk. (Nightly today)</li>
<li><strong>Preview</strong> &ndash; Team agrees feature is ready for use, but wishes more feedback before committing. We reserve the right to tweak the details, but will not remove functionality without some migration path or workaround. (No equivalent today)</li>
<li><strong>Stable</strong> &ndash; Team agrees feature is done. Semantics will no longer change. Implementation may lack polish and may not yet meet all its intended use cases (but should meet some). (Stable today)</li>
<li><strong>Recommended</strong> &ndash; everyone should use this, it rocks. &#x1f3b8;  (No equivalent today, though some would say stable)</li>
</ul>
</li>
<li>I have an initial proposal for how we could implement these phases for Rust, but I&rsquo;m not sure on the details. The point is more to identify this as a problem and start a discussion on potential solutions, rather than to drive a particular proposal.</li>
</ul>
<h2 id="context">Context</h2>
<p>This post is inspired by years of experience trying to stabilize features. I&rsquo;ve been meaning to write it for a while, but I was influenced most recently by the discussion on the PR to <a href="https://github.com/rust-lang/rust/pull/115822">stabilize async fn in trait and return-position impl trait</a>. I&rsquo;m not intending this blog post to be an argument either way on that particular discussion, although I will be explaining my POV, which certainly has bearing on the outcome.</p>
<p>I will zoom out though and say that I think the Rust project needs to think about the whole &ldquo;feature design lifecycle&rdquo;. This has been a topic for me for years &ndash; just search for &ldquo;adventures in consensus&rdquo; on this blog. I think in the past I&rsquo;ve been a bit too ambitious in my proposals<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, so I&rsquo;m thinking now about how we can move more incrementally. This blog post is one such example.</p>
<h2 id="summary-of-rusts-process-today">Summary of Rust&rsquo;s process today</h2>
<p>Let me briefly summarize the &ldquo;feature lifecycle&rdquo; for Rust today. I&rsquo;ll focus on language features since that&rsquo;s what I know best: this material is also published on the <a href="https://lang-team.rust-lang.org/how_to/propose.html">&ldquo;How do I propose a change to the language&rdquo;</a> page for the lang-team, which I suspect most people don&rsquo;t know exists<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>.</p>
<p>The path is roughly like this:</p>
<ul>
<li><strong>Author an RFC</strong> that outlines the problem to be solved and the key aspects of your solution. The RFC doesn&rsquo;t have to have everything figured out, especially when it comes to the implementation &ndash; but it should describe most everything that a user of the language would have to know. The RFC can include <strong>&ldquo;unresolved questions&rdquo;</strong> that lay out corner cases or things where we need more experience to figure out the right answer.
<ul>
<li>Generally speaking, to avoid undue maintenance burden, we don&rsquo;t allow code to land until there is an accepted RFC. There is an exception though for experienced Rust contributors, who can create an experimental feature gate to do some initial hacking. That&rsquo;s sometimes useful to prove out designs.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
</ul>
</li>
<li><strong>Complete the implementation</strong> on master. This should force you to work out answers to all the <strong>unresolved questions</strong> that came up in the RFC. Often, having an implementation to work with also leads to other changes in the design. Presuming these are relatively minor, these changes are discussed and approved by the lang team on issues on the rust-lang repository.</li>
<li><strong>Author a stabilization report</strong>, describing precisely what is being stabilized along with how each unresolved question was resolved.</li>
</ul>
<h2 id="observation-stabilization-means-different-things-to-different-people">Observation: Stabilization means different things to different people.</h2>
<p>In a technical sense, stabilization means exactly one thing: the feature is now available on the stable release, and hence <strong>we can no longer make breaking changes to it</strong><sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>.</p>
<p>But, of course, stabilization also means that the feature is going to be encountered by users. Rust has always prided itself on holding a high bar for polish and quality, as reflected in how easy cargo is to use, our quality error messages, etc. There is always a concern when stabilizing a long-awaited feature that users are going to get excited, try it out, encounter rough edges, and conclude from this that Rust is impossible to use.</p>
<h2 id="observation-open-source-works-incrementally">Observation: Open source works incrementally</h2>
<p>Something I&rsquo;ve come to appreciate over time is that open source is most effective if you work <strong>incrementally</strong>. If you want people to contribute or to provide meaningful feedback, you have to give them something to play with. Once you do that, the pace of progress and polish increases dramatically. It&rsquo;s not magic, it&rsquo;s just people &ldquo;scratching their own itch&rdquo; &ndash; once people have a chance to use the feature, if there is a confusing diagnostic or other similar issue, there&rsquo;s a good chance that somebody will take a shot at addressing it.</p>
<p>In fact, speaking of diagnostics, it&rsquo;s pretty hard to write a good diagnostic <em>until</em> you&rsquo;ve thrown the feature at users. Often it&rsquo;s not obvious up front what is going to be confusing. If you&rsquo;ve ever watched <a href="https://github.com/estebank">Esteban</a> at work, you&rsquo;ll know that he scans all kinds of sources (github issues, twitter or whatever it&rsquo;s called now, etc) to see the kinds of confusions that people are having and to look for ideas on how to explain them better.</p>
<h2 id="observation-incremental-progress-boosts-morale">Observation: Incremental progress boosts morale</h2>
<p>The other big impact of working incrementally is for morale. If you&rsquo;ve ever tried to push a big feature over the line, you&rsquo;ll know that achieving milestones along the way is <strong>crucial</strong>. There&rsquo;s a huge difference between trying to get everything perfect before you can ship and saying: &ldquo;ok, this part is done, let&rsquo;s get it in people&rsquo;s hands, and then go focus on the next one&rdquo;. This is both because it&rsquo;s good to have the satisfaction of a job well done, and because stabilization is the only point at which we can <strong>truly</strong> end discussion. Up until stabilization is done, it&rsquo;s always possible to stop and revisit old decisions.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></p>
<h2 id="observation-working-incrementally-has-a-cost">Observation: Working incrementally has a cost</h2>
<p>Obviously, I am a big fan of working incrementally, but I won&rsquo;t deny that it has a cost. For every person who encounters a bad diagnostic and gets inspired to open a PR, there are a lot more who will get confused. Some portion of them will walk away, concluding &ldquo;Rust is too confusing&rdquo;. That&rsquo;s a problem.</p>
<h2 id="observation-a-polished-feature-has-a-lot-of-moving-parts">Observation: A polished feature has a lot of moving parts</h2>
<p>A polished feature in Rust today has a lot of moving parts&hellip;</p>
<ul>
<li>a thoughtful design</li>
<li>a stable, bug free implementation</li>
<li>documentation in the Rust reference</li>
<li>quality error messages</li>
<li>tooling support, such as rustfmt, rustdoc, IDE, etc</li>
</ul>
<p>&hellip;and we&rsquo;d like to add more. For example, we are working on various Rust formalizations (<a href="https://github.com/RalfJung/minirust">MiniRust</a>, <a href="https://github.com/rust-lang/a-mir-formality">a-mir-formality</a>) and talking about upgrading the Rust reference into a normative specification.</p>
<h2 id="observation-distinct-skillsets-are-required-to-polish-a-feature">Observation: Distinct skillsets are required to polish a feature</h2>
<p>One interesting detail is that, often, completing a polished feature requires the work of different people with different skillsets, which in turn means the involvement of many distinct Rust teams &ndash; in fact, when it comes to development tooling, this can mean the involvement of distinct projects that aren&rsquo;t even part of the Rust org!</p>
<p>Just looking at language features, the <em>design</em>, for example, belongs to the lang-team, and often completes relatively early through the RFC process. The implementation is (typically) the compiler team, but often also more specialized teams and groups, like the types team or the diagnostics working group; RFCs can sometimes languish for a long time before being implemented. Documentation meanwhile is driven by the <a href="https://www.rust-lang.org/governance/teams/lang#lang-docs%20team">lang-docs</a> team (for language features, anyway). Once that is done, the rustfmt, rustdoc, and IDE vendors also have work to do incorporating the new feature.</p>
<p>One of the challenges to open-source development is coordinating all of these different aspects. Open source development tends to be <em>opportunistic</em> &ndash; you don&rsquo;t have dedicated resources available, so you have to do a balancing act where you adapt the work that needs to get done to the people that are available to do it. In my experience, it&rsquo;s neither top down nor bottom up, but a strange mixture of the two.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></p>
<p>Because of the opportunistic nature of open-source development, some parts of a feature move more quickly than others &ndash; often, the basic design gets hammered out early, but implementation can take a long time. Sadly, the reference is often the hardest thing to catch up, in part because the rather heroic <a href="https://github.com/ehuss">Eric Huss</a> does not implement the <code>Clone</code> trait. 💜</p>
<h2 id="observation-polished-features-dont-stand-alone">Observation: Polished features don&rsquo;t stand alone</h2>
<p>And yet, to be <strong>truly polished</strong>, features need more than docs and error-messages: they need other features! It often happens that users using feature X will find that, to complete their task, they also need feature Y. This inevitably presents a challenge to our stabilization system, which judges the stability of each feature independently.</p>
<p>Async functions in traits are a great example: the core feature is working great on stable, but we haven&rsquo;t reached consensus on a solution to the <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">send bound problem</a>. For some users, like embedded users, this doesn&rsquo;t matter at all. For others, like Tower, this is a pretty big problem. So, do we hold back async functions in traits until both features are ready? Or do we work incrementally, releasing what is ready <em>now</em> and then turning to focus on what&rsquo;s left?</p>
<p><img src="https://media.giphy.com/media/l4JyKQhSRBExNYzkc/giphy.gif" alt="&ldquo;We seem to have been designed for each other&rdquo; &ndash; Mr Collins."></p>
<h2 id="observation-nightly-is-just-the-beginning">Observation: Nightly is just the beginning</h2>
<p>I can hear readers saying now, &ldquo;but wait, isn&rsquo;t this what Nightly is for?&rdquo; And yes, in principle, the nightly release is our vehicle for enabling experimentation with in-progress features. Sometimes it works great! It can be a great way to get ahead of confusing error messages, for example, or to flush out bugs. But all too often, Nightly is a big barrier for people, particularly production Rust users or those building widely used libraries. And those are precisely the users whose feedback would be most valuable.</p>
<p>What&rsquo;s interesting is that many production users would be willing to tolerate a certain amount of instability. Many users tell me they wouldn&rsquo;t mind rebasing over small changes in the feature design<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>, but what they can&rsquo;t tolerate is building a codebase around a feature and then having it removed entirely, or having support for major use cases dropped without some kind of workaround.</p>
<p>Libraries are another interesting story. Library authors tend to be more advanced than your typical Rust user. They can tolerate a lack of polish in exchange for having access to a feature that lets them build a nicer experience for their users. Generic associated types are a clear example of this. One of the big arguments in favor of stabilizing them was that they often show up in the <em>implementation</em> of libraries but not in the <em>outward interfaces</em>. As one personal example, we&rsquo;ve been using them extensively in <a href="https://duchess-rs.github.io/duchess/">Duchess</a>, an experimental library for Java-Rust interop, and yet you won&rsquo;t find any mention of them in the docs. Do we sometimes hit confusing errors or other problems? Yes. Is the syntax annoyingly verbose? Yes, absolutely. Am I glad they are stabilized? <strong>Hell yes.</strong></p>
<h2 id="observation-having-users-help-us-figure-out-what-else-is-needed">Observation: having users help us figure out what else is needed</h2>
<p>Remember how I said that it was hard to design quality diagnostics until you had seen the ways that users got confused? Well, the same goes for designing related features. Once production users or library authors start playing with something, they find all kinds of clever things they can do with it &ndash; or, often, things they could <em>almost</em> do, except for this one other missing piece. In this way, holding things unstable on Nightly &ndash; which means far fewer users can touch it &ndash; holds back the whole pace of Rust development significantly.</p>
<h2 id="prior-art">Prior art</h2>
<h3 id="embers-feature-lifecycle">Ember&rsquo;s feature lifecycle</h3>
<p>The Ember and Rust projects have long had a lot of fruitful back-and-forth when it comes to governance and process, thanks in part to the fact that Yehuda Katz was deeply involved in both of them. In 2022, they adopted a <a href="https://blog.emberjs.com/improved-rfc-process/">revised RFC process</a> in which each feature goes through a number of stages:<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></p>
<ol start="0">
<li>Proposed &ndash; An open pull request on the emberjs/rfcs repo.</li>
<li>Exploring &ndash; An RFC deemed worth pursuing but in need of refinement.</li>
<li>Accepted &ndash; A fully specified RFC.</li>
<li>Ready for release &ndash; The implementation of the RFC is complete, including learning materials.</li>
<li>Released &ndash; The work is published.</li>
<li>Recommended &ndash; The feature/resource is recommended for general use.</li>
</ol>
<p>This is pretty cool! One other interesting aspect for Ember is how they approach <a href="https://emberjs.com/editions/">editions</a>. Remember I talked about how features don&rsquo;t stand alone? In Ember, a significant cluster of related features is called an &ldquo;edition&rdquo;. New editions are declared when all the pieces are in place to enable a new model for programming. This is pretty distinct from Rust&rsquo;s time-based editions.</p>
<p>I&rsquo;m not totally sure how to map Ember&rsquo;s edition to Rust, but I think that the concept of an &ldquo;umbrella initiative&rdquo; is pretty close. For example, the <a href="https://rust-lang.github.io/async-fundamentals-initiative/roadmap.html">async fundamentals initiative roadmap</a> identifies a cluster of related work that together constitute &ldquo;async-sync language parity&rdquo; &ndash; i.e., you can truly use async operations everywhere you would like to.</p>
<p>One interesting aspect of Ember&rsquo;s editions is that they often begin by stabilizing &ldquo;primitives&rdquo; &ndash; e.g., fundamental APIs that aren&rsquo;t really meant for end-users, but rather for plugin authors or people in the ecosystem, who can use them to experiment with the right end-user abstractions. I&rsquo;ve found in Rust that we sometimes do this, though sometimes we find it better to begin with the end-user abstraction, and expose the primitives later.</p>
<h3 id="the-tc39-process-for-ecmascript">The TC39 process for ECMAScript</h3>
<p>The TC39 committee has a <a href="https://github.com/tc39/how-we-work/blob/main/champion.md">nice staged process</a>. It&rsquo;s not exactly comparable to Rust, but there are a few things worth observing. First, I love the designation of a <em>champion</em> for a feature, and I think Rust would benefit from being more official about that in some ways. Second, I also love the <a href="https://github.com/tc39/how-we-work/blob/main/explainer.md">explainer</a> concept of authoring user documentation as part of the process. Third, before they stabilize, they always make the feature available to end-users, but under gates.</p>
<h3 id="javas-preview-features">Java&rsquo;s preview features</h3>
<p>Ever since JEP-12, Java has included <strong>preview features</strong> in their release process. A preview feature is one that is &ldquo;fully specified, fully implemented, and yet impermanent&rdquo; &ndash; it&rsquo;s released for feedback, but it may be removed or changed based on the result of the evaluation. The motivation is to get more feedback on the design before committing to it:</p>
<blockquote>
<p>To build confidence in the correctness and completeness of a new feature &ndash; whether in the Java language, the JVM, or the Java SE API &ndash; it is desirable for the feature to enjoy a period of broad exposure after its specification and implementation are stable but before it achieves final and permanent status in the Java SE Platform.</p>
</blockquote>
<p>When using preview features, users <em>opt-in</em> both at compilation time <em>and</em> at runtime. In other words, if you compile a Java file that uses preview features to a JAR, and distribute the JAR, people using the JAR must also opt-in.</p>
<h2 id="proposal">Proposal</h2>
<p>Instead of rehashing the same debate every time we go to stabilize a feature, I think we should look at our feature release process so that we have more <em>gradations</em> of stability:</p>
<ul>
<li><strong>accepted RFC</strong> &ndash; With an accepted RFC, the team has agreed that we want the feature in principle. However, the details often change during development, and the feature may even be removed. Use at your own risk.</li>
<li><strong>preview</strong> &ndash; We are committed to keeping this functionality in some form, but we reserve the right to make changes. We won&rsquo;t remove functionality from preview state without some kind of workaround. You can use this feature so long as you are willing to update your code when moving to a new version of the compiler. <strong>Preview features must be viral</strong>, meaning that if I build a crate using preview features, consumers must opt-in to the resulting instability <em>somehow</em>.</li>
<li><strong>semver stable</strong> &ndash; We have committed to the technical design of this feature and people can build on it without fear of breakage between compiler revisions. The experience may lack polish and some intended use cases may not yet be possible.</li>
<li><strong>recommended for use</strong> &ndash; This feature has all the documentation, error messages, and associated features that are needed for most Rust users to be successful. USE IT!</li>
</ul>
<p><strong>Comparison with today&rsquo;s release trains.</strong> In our system today, the first two phases are both covered by &ldquo;nightly&rdquo; and the latter two are both covered by &ldquo;stable&rdquo;, but of course we don&rsquo;t draw any formal distinctions. Async functions in traits, for example, are clearly past the <strong>accepted RFC</strong> phase and are now in <strong>preview</strong>: the team is committed to shipping them in some form, and we don&rsquo;t expect any major changes. But how would you know this, if you aren&rsquo;t closely following Rust development? Generic associated types, meanwhile, are clearly <strong>semver stable</strong> rather than <strong>recommended for use</strong> &ndash; we know of many major gaps in the experience, mostly blocked on the <a href="https://github.com/rust-lang/trait-system-refactor-initiative/">trait system refactor initiative</a>, but how would you know <em>that</em>, unless you were actively attending Rust types team meetings?</p>
<h2 id="unresolved-questions">Unresolved questions</h2>
<p>I am confident that these four phases are important, but there are a number of details of which I am <em>not</em> sure. Let me pose some of the questions I anticipate here.</p>
<h3 id="how-committed-should-we-be-to-preview-features">How committed should we be to preview features?</h3>
<p>In my proposal above, I said that the project would not remove functionality without a workaround. This is somewhat stronger than JEP-12, which indicates that preview features &ldquo;will either be granted final and permanent status (with or without refinements) or be removed&rdquo;. I said something somewhat stronger because I was thinking of production users. I know many such users would happily make use of preview features, and they are willing to make updates, but they don&rsquo;t want to get stuck having based their codebase on something that completely goes away. I feel pretty confident that by the time we get to preview state, we should be able to say &ldquo;yes, we want <em>something</em> like this&rdquo;. I think it&rsquo;s fine however if the feature gets removed in favor, say, of a procedural macro or some other solution, so long as the people using that preview feature have somewhere to go. (Naturally, my preference would be to provide as smooth a path as possible between compiler revisions; ideally, we&rsquo;d issue automatable suggestions using <a href="https://doc.rust-lang.org/cargo/commands/cargo-fix.html">cargo fix</a>, similar to what we do for editions.)</p>
<h3 id="how-should-the-features-be-reflected-in-our-release-trains">How should the features be reflected in our release trains?</h3>
<p>I don&rsquo;t entirely know! I think there are a lot of different ways this could work. I do know a few things:</p>
<ul>
<li><strong>Instability should be viral, whether experimental or preview:</strong> today, if I depend on a crate that uses nightly features, I must use nightly myself; this falls out from the fact that Rust doesn&rsquo;t support binary distribution, but is very much intentional. The reason is that a crate cannot truly &ldquo;hide&rdquo; instability from its users. They can always upgrade to a new version of Rust and, if that causes the crate to stop compiling, they will perceive this as a failure of Rust&rsquo;s promise, even if it is a result of the crate having used an unstable feature. We need the same kind of viral effect for preview features.</li>
<li><strong>Preview and stabilized features need to be internally consistent, but not complete or fully polished:</strong> Preview features need to meet a certain quality bar &ndash; e.g., support in rustfmt, adequate documentation &ndash; but it&rsquo;s fine for them to be a subset of what we hope to do in the fullness of time. It&rsquo;s also ok for them to have less-than-ideal error messages. Those things come with time.</li>
<li><strong>Documentation is key:</strong> A big challenge for Rust today is that we don&rsquo;t have a canonical way for people to find out the status of the things they care about. I think we should invest some effort in setting up a consistent format with bot/tooling support to make it easy to maintain. Users will understand the idea that a feature is unpolished <em>if</em> you can direct them to a page where they can understand the context and learn about the workarounds they need in the short term.</li>
</ul>
<p>With that in mind, here is a possible proposal for how we might do this:</p>
<ul>
<li>Initially, <strong>features are nightly only</strong>, as today, and require an individual feature-gate.
<ul>
<li>Until there is an accepted RFC, we should have a mandatory warning that the team has not yet decided if the feature is worth including; we also can continue to warn for features whose implementation is very incomplete.</li>
</ul>
</li>
<li><strong>Preview features</strong> are usable on <strong>stable</strong>, but with opt-in:
<ul>
<li>Every project that uses any preview features, or which depends on crates that use preview features, must include <code>preview-features = true</code> in their <code>Cargo.toml</code>.</li>
<li>Every crate that directly uses preview features must additionally include the appropriate feature gates.</li>
<li>Reaching preview status should require some base level of support
<ul>
<li>core tooling, e.g. rustfmt, rustdoc, must work</li>
<li>an explainer must be available, but Rust reference material is not required</li>
<li>a nice landing page (or Github issue with known format) that indicates how to provide feedback; this page should also cover polish or supporting features that are known to be missing (similar to the <a href="https://rust-lang.github.io/async-fundamentals-initiative/roadmap.html">async fn fundamentals roadmap</a>)</li>
<li>the feature must be &ldquo;complete enough&rdquo; to meet some of its intended use cases; it doesn&rsquo;t have to meet <em>all</em> of its intended use cases.</li>
</ul>
</li>
<li>This is an FCP decision, because it commits the Rust project to supporting the use cases targeted by the preview feature (if not the details of how the feature works).</li>
</ul>
</li>
<li><strong>Semver stable features</strong> are usable on stable, but we make efforts to redirect users to a landing page that outlines what kind of support is still missing and how to provide feedback.
<ul>
<li>Reaching semver stable requires an update to the Rust reference, in addition to the requirements for preview.</li>
<li>The feature must be &ldquo;complete enough&rdquo; to meet some of its intended use cases; it doesn&rsquo;t have to meet <em>all</em> of its intended use cases.</li>
<li>This is an FCP decision, because it commits the Rust project to supporting the feature in its current form going forward.</li>
</ul>
</li>
<li><strong>Recommended for use</strong> features would be just as today.
<ul>
<li>The feature must meet all of the major use cases, which may mean that other features are present.</li>
</ul>
</li>
</ul>
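<p>To make the &ldquo;viral&rdquo; opt-in rule a bit more concrete, here is a rough sketch of the check it implies over a dependency graph. Everything in it is an assumption made for illustration: <code>CrateInfo</code> and <code>needs_preview_opt_in</code> are hypothetical names, and nothing like this exists in Cargo today.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::collections::{HashMap, HashSet};

/// Hypothetical model of one crate in a dependency graph.
struct CrateInfo {
    /// True if this crate itself enables a preview feature gate.
    uses_preview: bool,
    /// Names of the crates it depends on directly.
    deps: Vec&lt;&amp;'static str&gt;,
}

/// Returns true if `root`, or anything it transitively depends on, uses
/// preview features. Under the proposal, such a project would need
/// `preview-features = true` in its `Cargo.toml`.
fn needs_preview_opt_in(graph: &amp;HashMap&lt;&amp;'static str, CrateInfo&gt;, root: &amp;'static str) -&gt; bool {
    let mut seen: HashSet&lt;&amp;'static str&gt; = HashSet::new();
    let mut stack = vec![root];
    while let Some(name) = stack.pop() {
        if !seen.insert(name) {
            continue; // already visited; also guards against cycles
        }
        if let Some(info) = graph.get(name) {
            if info.uses_preview {
                return true;
            }
            for &amp;dep in &amp;info.deps {
                stack.push(dep);
            }
        }
    }
    false
}
</code></pre></div><p>In this sketch, a crate that never touches a preview feature still needs the opt-in as soon as one of its transitive dependencies does, which is the same property nightly features have today.</p>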
<h2 id="other-frequently-asked-questions-and-alternatives">Other frequently asked questions and alternatives</h2>
<p>Here are answers to a few other questions I anticipate.</p>
<h3 id="who-will-maintain-these-landing-pages">Who will maintain these &ldquo;landing pages&rdquo;?</h3>
<p>This is a good question! It&rsquo;s easy for these to get out of date. I think part of designing this &lsquo;preview&rsquo; process should also be investing in a standard template for the landing pages and some guidelines. My sense is that people would be happy to update landing pages as part of the stabilization process if it meant they can make progress on shipping the feature they&rsquo;ve worked so hard to build! But I think we can do a lot to make it easier. Having a standard format would also mean that users can find the information they&rsquo;re looking for more easily. We can then also build bots and things to help. I&rsquo;ve seen that investing in bots can make a real difference.</p>
<h3 id="how-will-we-ensure-polish-gets-done">How will we ensure polish gets done?</h3>
<p>One concern that is commonly raised is that stabilization is the only gate we have to force polish work to get done. I agree that we should maintain a certain quality bar as features move towards being fully recommended. But I think that saying &ldquo;we cannot ship something for widespread use until it is polished&rdquo; misses the point that open-source is incremental. In other words, part of the <em>way that features get polished</em> is by releasing them for widespread use.</p>
<p>Definitely though the Rust project can do a better job of tracking and ensuring that we do the follow-up items. There are plenty of examples of follow-up that never gets done. But I don&rsquo;t think blocking stabilization is an effective tool for that. If anything, it&rsquo;s demoralizing<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup>. We really need to strengthen our project management skills &ndash; pushing people to create better landing pages and to help identify the gaps more crisply feels like it can help, though more is needed.</p>
<h3 id="why-would-we-stabilize-a-feature-if-we-know-users-will-hit-gaps">Why would we stabilize a feature if we know users will hit gaps?</h3>
<p>Most features in Rust serve a lot of purposes. Even if we know about major gaps, there are often important blocks of users who are not affected by them. For async functions in traits, the <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">send bound problem</a> can be a total blocker &ndash; but it&rsquo;s also a non-issue for a lot of users. I would like to see us focus more on how we can alert users to the gaps they are hitting rather than denying them access to features until everything is done.</p>
<h3 id="i-thought-you-said-you-wanted-to-move-incrementally-this-feels-like-a-big-step">I thought you said you wanted to move incrementally? This feels like a big step.</h3>
<p>Earlier, I said that I wanted to look for incremental ways to tweak Rust&rsquo;s process, since in the past I&rsquo;ve gotten too ambitious. In truth, I think this blog post is really laying out <em>two</em> proposals, so let me separate them out:</p>
<ul>
<li>Part 1: Semver-stable vs recommended-for-use
<ul>
<li>The most immediate need is to clarify what stabilization means and what exactly is the &ldquo;bar&rdquo;; in my opinion, that is semver stability, and I think there is plenty of precedent for that.</li>
<li>But I think the risk of user confusion is very real, and we can take some simple steps to help mitigate it, such as creating good landing pages and having the compiler direct users to them when it thinks they may be encountering a gap.
<ul>
<li>Example: today if you try to use an async fn in a trait, you get directed to the <code>async-trait</code> crate. We can detect &ldquo;send bound&rdquo;-related failures and direct users to a github issue that explains how they can resolve it and also gives them a way to register interest or provide feedback.</li>
</ul>
</li>
<li>I don&rsquo;t think anything is really blocking us from moving forward here immediately, though an RFC might be nice at some point to clarify terminology and help align the way we talk about this.</li>
</ul>
</li>
<li>Part 2: Preview features
<ul>
<li>Preview features are really a distinct concept, but I do think they&rsquo;re important. For example, we could have declared async functions in traits as a &lsquo;preview feature&rsquo; over a year ago. This would have given us a lot more data and made it accessible to a much broader pool of people. I think this would have given us a clearer picture on how important the &lsquo;send bound&rsquo; problem is, for example, and would inform other prioritization efforts.</li>
<li>Moving forward here will require an RFC and also implementation work.</li>
</ul>
</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>With <a href="http://www.literaturepage.com/read/prideandprejudice-34.html">apologies to Jane Austen</a>:</p>
<blockquote>
<p>&ldquo;All Rust features are so accomplished. They all have stable semantics and even make helpful suggestions when you go astray. I am sure I never encountered a Rust feature without being informed that it was very accomplished.&rdquo;</p>
<p>&ldquo;Your list of the common extent of accomplishments,&rdquo; said Darcy, &ldquo;has too much truth. The word is applied to many a feature who deserves it no otherwise than by being stabilized. But I am very far from agreeing with you in your estimation of Rust features in general. I cannot boast of knowing more than half-a-dozen, in the whole range of my acquaintance, that are really accomplished.&rdquo;</p>
<p>&ldquo;Then,&rdquo; observed Elizabeth, &ldquo;you must comprehend a great deal in your idea of an accomplished feature.&rdquo;</p>
<p>&ldquo;Oh! certainly,&rdquo; cried his faithful assistant, &ldquo;no feature can be really esteemed accomplished without strong support in the IDE, wondrous documentation, and perhaps a chapter in the Rust book.&rdquo;</p>
<p>&ldquo;All this it must possess,&rdquo; added Darcy, &ldquo;and to all this it must yet add something more substantial: a host of related features that address common problems our users may encounter.&rdquo;</p>
<p>&ldquo;I am no longer surprised at your knowing ONLY six accomplished features. I rather wonder now at your knowing ANY.&rdquo;</p>
</blockquote>
<p>To translate: I think our &lsquo;all or nothing&rsquo; stability system is introducing unnecessary friction into Rust development. Let&rsquo;s change it!</p>
<hr>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>A critique which many people pointed out to me at the time.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The whole &ldquo;How do I&hellip;&rdquo; section on the page has some interesting things, if you&rsquo;re looking to interact with the lang team!&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>The decision to limit in-tree experimentation to experienced contributors was based on our experience with the earlier initiative system, where we were more open-ended. We found that the majority of those projects never went anywhere. Most of the people who signed up to drive experiments didn&rsquo;t really have the time or knowledge to move them independently, and there wasn&rsquo;t enough mentoring bandwidth to help them make progress. So we decided to limit in-tree experimentation to maintainers who&rsquo;ve already demonstrated staying power.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p><a href="https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md">RFC 1122</a> lays out the lang team&rsquo;s definition of &ldquo;breaking change&rdquo;, which is not <em>quite</em> the same as &ldquo;your code will always continue to compile&rdquo;. For example, we sometimes change the rules of inference; we also introduce or modify the behavior of lints (which can cause code that has <code>#[deny]</code> to stop compiling). Finally, we reserve the right to fix soundness bugs. And, in rare cases, we will override the policy altogether if a feature&rsquo;s design is badly broken, but the bar for that is quite high.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>One of the things I am proud of about the Rust project is that we <em>are</em> willing to stop and revisit old decisions &ndash; I think we&rsquo;ve dodged a number of bullets that way. At the same time, it&rsquo;s exhausting. I think there&rsquo;s more to say about finding ways to enable conversation that are not as draining on the participants, and especially on the designers and maintainers, but that&rsquo;s a topic for another post.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>That said, my experience is that Amazon works in a surprisingly similar way &ndash; there are top-down decisions, but there are an awful lot of bottom-up ones. I imagine this varies company to company, but I think ultimately every good manager tries to ensure that their people are working on things that are well-suited to their skills.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Many of which could be automated via <a href="https://doc.rust-lang.org/cargo/commands/cargo-fix.html"><code>cargo fix</code></a>!&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>Speaking of Ember-Rust cross-pollination, Peter Wagenet, co-author of the <a href="https://blog.emberjs.com/improved-rfc-process/">Ember release blog post</a>, also hacks on the Rust compiler from time to time.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>There&rsquo;s nothing worse than investing months and months of work into getting something ready for stabilization, endlessly triaging issues, only to open a stabilization PR &ndash; the culmination of all that effort &ndash; and have the first few comments tell you that your work is not good enough. Oftentimes the people opening those PRs are volunteers, as well, which makes it all the worse.&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Higher-ranked projections (send bound problem, part 4)</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/06/12/higher-ranked-projections-send-bound-problem-part-4/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/06/12/higher-ranked-projections-send-bound-problem-part-4/</id><published>2023-06-12T00:00:00+00:00</published><updated>2023-06-12T09:06:00-04:00</updated><content type="html"><![CDATA[<p>I recently <a href="https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-async/topic/associated.20return.20types.20draft.20RFC">posted a draft of an RFC</a> about <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/">Return Type Notation</a> to the async working group Zulip stream. In response, Josh Triplett reached out to me to raise some concerns. Talking to him gave rise to a 3rd idea for how to resolve the send bound problem. I still prefer RTN, but I think this idea is interesting and worth elaborating. I call it <em>higher-ranked projections</em>.</p>
<h2 id="idea-part-1-define-tfoo-when-t-has-higher-ranked-bounds">Idea part 1: Define <code>T::Foo</code> when <code>T</code> has higher-ranked bounds</h2>
<p>Consider a trait like this…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Transform</span><span class="o">&lt;</span><span class="n">In</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">apply</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">input</span>: <span class="nc">In</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Today, given a trait bound like <code>T: Transform&lt;Vec&lt;u32&gt;&gt;</code>, when you write <code>T::Output</code>, the compiler expands that to a fully qualified associated type <code>&lt;T as Transform&lt;Vec&lt;u32&gt;&gt;&gt;::Output</code>. This took a bit of work — the self type (<code>T</code>) of the trait is specified by the user, but the compiler looked at the bounds to select <code>Vec&lt;u32&gt;</code> as the value for <code>In</code>.</p>
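<p>To make that expansion concrete, here is a small sketch of the non-higher-ranked case. The <code>Sum</code> type and the two <code>run</code> functions are made up for illustration, and the parameter is named <code>input</code> because <code>in</code> is a reserved word. Both functions should mean the same thing to the compiler:</p>

```rust
// The trait from above, with the parameter named `input`.
trait Transform<In> {
    type Output;

    fn apply(&self, input: In) -> Self::Output;
}

// Hypothetical implementor: sums a vector of numbers.
struct Sum;

impl Transform<Vec<u32>> for Sum {
    type Output = u32;

    fn apply(&self, input: Vec<u32>) -> u32 {
        input.iter().sum()
    }
}

// `T::Output` is shorthand here: the compiler picks `In = Vec<u32>`
// by looking at the where-clause.
fn run<T>(t: &T, input: Vec<u32>) -> T::Output
where
    T: Transform<Vec<u32>>,
{
    t.apply(input)
}

// The same function with the projection written out in full.
fn run_explicit<T>(t: &T, input: Vec<u32>) -> <T as Transform<Vec<u32>>>::Output
where
    T: Transform<Vec<u32>>,
{
    t.apply(input)
}
```

<p>With these definitions, <code>run(&amp;Sum, vec![1, 2, 3])</code> and <code>run_explicit(&amp;Sum, vec![1, 2, 3])</code> both return <code>6</code>.</p>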
<p>But suppose you have a higher-ranked trait bound like <code>T: for&lt;'a&gt; Transform&lt;&amp;'a [u32]&gt;</code>. Then what should the compiler do for <code>T::Output</code>? The compiler would have to expand it to something like <code>&lt;T as Transform&lt;&amp;'b [u32]&gt;&gt;::Output</code>, where we pick a specific lifetime <code>'b</code>. Instead of doing that, the compiler currently gives an error.</p>
<p>But we don’t always <em>need</em> to expand <code>T::Output</code> to a specific type. If <code>T::Output</code> appears in a <em>where-clause</em>, we could expand it to a range of types. For example, consider this function, which today will not compile:</p>
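<p>For reference, what does compile today is the fully qualified projection under its own <code>for&lt;'a&gt;</code> binder. A minimal sketch, assuming a made-up implementor <code>Len</code> and a made-up <code>describe</code> function that formats whatever <code>apply</code> returns:</p>

```rust
use std::fmt::Debug;

trait Transform<In> {
    type Output;

    fn apply(&self, input: In) -> Self::Output;
}

// Hypothetical implementor that accepts a slice of any lifetime.
struct Len;

impl<'a> Transform<&'a [u32]> for Len {
    type Output = usize;

    fn apply(&self, input: &'a [u32]) -> usize {
        input.len()
    }
}

fn describe<T>(t: &T, data: &[u32]) -> String
where
    T: for<'a> Transform<&'a [u32]>,
    // The shorthand `T::Output: Send + Debug` is rejected here, so the
    // projection is spelled out under an explicit binder instead.
    for<'a> <T as Transform<&'a [u32]>>::Output: Send + Debug,
{
    format!("{:?}", t.apply(data))
}
```

<p>Here <code>describe(&amp;Len, &amp;[1, 2, 3])</code> produces the string <code>"3"</code>.</p>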
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Transform</span><span class="o">&lt;&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="kt">str</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>::<span class="n">Output</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="c1">// ERROR: `T::Output` is not allowed
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w"> </span><span class="cm">/* … */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We could interpret <code>T::Output: Send</code> as a higher-ranked bound, for example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Transform</span><span class="o">&lt;&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="kt">str</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">Transform</span><span class="o">&lt;&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="kt">str</span><span class="o">&gt;&gt;</span>::<span class="n">Output</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="c1">// Desugared?
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w"> </span><span class="cm">/* … */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="idea-part-2-fix-the-bugs-on-associated-type-chains">Idea part 2: Fix the bugs on associated type chains</h2>
<p>Right now, if you have an iterator that yields other iterators, the compiler won’t let you write things like <code>T::Item::Item</code>…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Iterator</span><span class="o">&gt;</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>::<span class="n">Item</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>::<span class="n">Item</span>::<span class="n">Item</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="c1">// &lt;— ERROR
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w"> </span><span class="cm">/* … */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…instead you have to write something horrible like <code>&lt;&lt;T as Iterator&gt;::Item as Iterator&gt;::Item</code>. There’s no particularly good reason for this. We should make it work better. One thing that would be useful is if we examined the bounds declared in the trait, so that e.g. if we have a trait like…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Factory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nb">Iterator</span>: <span class="nb">Iterator</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…and a <code>F: Factory</code>, then <code>F::Iterator::Item</code> should work.</p>
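<p>Until something like this lands, the chain has to be qualified by hand. Here is a rough sketch of what that looks like in current Rust. The <code>make</code> method, the <code>Countdown</code> type, and both helper functions are invented for the example:</p>

```rust
trait Factory {
    type Iterator: Iterator;

    // Hypothetical method, added so the example has something to call.
    fn make(&self) -> Self::Iterator;
}

// Hypothetical implementor: counts down from `n - 1` to `0`.
struct Countdown(u32);

impl Factory for Countdown {
    type Iterator = std::iter::Rev<std::ops::Range<u32>>;

    fn make(&self) -> Self::Iterator {
        (0..self.0).rev()
    }
}

// `F::Iterator::Item` is not accepted here, so the second step of the
// chain is written with an explicit `as Iterator`.
fn first<F: Factory>(f: &F) -> Option<<F::Iterator as Iterator>::Item> {
    f.make().next()
}

// The nested-iterator case from above, qualified the same way.
fn count_inner<T>(outer: T) -> usize
where
    T: Iterator,
    T::Item: Iterator,
    <T::Item as Iterator>::Item: Send,
{
    outer.map(|inner| inner.count()).sum()
}
```

<p>With these definitions, <code>first(&amp;Countdown(3))</code> returns <code>Some(2)</code>.</p>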
<h2 id="idea-part-3-associated-type-for-every-method-in-a-trait">Idea part 3: Associated type for every method in a trait</h2>
<p>As the final step, for every method in a trait, we could add an associated type that binds to the “zero-sized function type” associated with that method. So in the <code>Iterator</code> trait…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…there’d be two associated types, <code>Item</code> and <code>next</code>. Given <code>T: Iterator</code>, <code>T::next</code> would map to a function type that implements <code>for&lt;'a&gt; Fn(&amp;'a mut T) -&gt; Option&lt;T::Item&gt;</code>.</p>
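<p>These zero-sized function types already exist today, they just cannot be named. A small sketch of that, where <code>call_next</code> and <code>demo</code> are hypothetical helpers: the function item for <code>next</code> is inferred rather than written out, it occupies no space, and it satisfies the higher-ranked <code>Fn</code> bound.</p>

```rust
use std::mem::size_of_val;

// Accepts anything that implements the higher-ranked `Fn` signature
// and calls it once.
fn call_next<T, F>(f: F, iter: &mut T) -> Option<T::Item>
where
    T: Iterator,
    F: for<'a> Fn(&'a mut T) -> Option<T::Item>,
{
    f(iter)
}

// Returns the size of the `next` function item and the result of calling it.
fn demo() -> (usize, Option<u32>) {
    let mut iter = vec![10u32, 20, 30].into_iter();
    // As a value, `next` has its own unique type. That type can only be
    // inferred; there is no way to write it down.
    let next_fn = <std::vec::IntoIter<u32> as Iterator>::next;
    let size = size_of_val(&next_fn);
    (size, call_next(next_fn, &mut iter))
}
```

<p>Calling <code>demo()</code> should give <code>(0, Some(10))</code>.</p>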
<h2 id="putting-it-all-together">Putting it all together</h2>
<p>If we put this all together, we can start to put bounds on the return types of async functions. Consider our usual trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HealthCheck</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">check</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and then a function like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_health_check</span><span class="o">&lt;</span><span class="no">HC</span><span class="o">&gt;</span><span class="p">(</span><span class="n">hc</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="no">HC</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>::<span class="n">check</span>::<span class="n">Output</span>: <span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* … */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>what does <code>HC::check::Output: Send</code> mean? Note that the <code>Output</code> here is the return type of the <em>function</em> trait, so it refers to the future that you get when you call the async function.</p>
<p>By combining ideas part 1, 2, and 3, <code>HC::check::Output</code> can then be expanded to the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_health_check</span><span class="o">&lt;</span><span class="no">HC</span><span class="o">&gt;</span><span class="p">(</span><span class="n">hc</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="no">HC</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `HC::check::Output: Send` becomes…
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">&lt;</span><span class="no">HC</span>::<span class="n">check</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nb">Fn</span><span class="o">&lt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="no">HC</span><span class="p">,)</span><span class="o">&gt;&gt;</span>::<span class="n">Output</span>: <span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* … */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>which, if you really like complex where-clauses, you could further expand to something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">&lt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&lt;</span><span class="no">HC</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">HealthCheck</span><span class="o">&gt;</span>::<span class="n">check</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">as</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Fn</span><span class="o">&lt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="no">HC</span><span class="p">,)</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">&gt;</span>::<span class="n">Output</span>: <span class="nb">Send</span>
</span></span></code></pre></div><h2 id="comparing-this-approach-and-rtn">Comparing this approach and RTN</h2>
<p>In many ways, this idea is very similar to RTN. Compare this example…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_health_check</span><span class="o">&lt;</span><span class="no">HC</span><span class="o">&gt;</span><span class="p">(</span><span class="n">hc</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="no">HC</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>::<span class="n">check</span>::<span class="n">Output</span>: <span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* … */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…to the RTN-based approach…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_health_check</span><span class="o">&lt;</span><span class="no">HC</span><span class="o">&gt;</span><span class="p">(</span><span class="n">hc</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="no">HC</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>::<span class="n">check</span><span class="p">()</span>: <span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* … */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In fact, <code>()</code> could be a shorthand for <code>::Output</code>.</p>
<h2 id="associated-type-bounds">Associated type bounds</h2>
<p>Another part of RTN, and in fact the only part that we’ve implemented so far, is the ability to put bounds on function returns “inline”:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_health_check</span><span class="o">&lt;</span><span class="no">HC</span><span class="o">&gt;</span><span class="p">(</span><span class="n">hc</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="no">HC</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="o">&lt;</span><span class="n">check</span><span class="p">()</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//             ———
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* … */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We could in principle do the same thing with <code>::Output</code> notation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_health_check</span><span class="o">&lt;</span><span class="no">HC</span><span class="o">&gt;</span><span class="p">(</span><span class="n">hc</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="no">HC</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="o">&lt;</span><span class="n">check</span>::<span class="n">Output</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//             ———
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* … */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="pro-simpler-building-blocks">Pro: simpler building blocks</h2>
<p>What I really like about this idea is that it doesn’t introduce new concepts or notation, but rather refines and extends ones that exist. We already have <code>T::Output</code> — all this is doing is making it work in contexts where it didn’t work before, and in a fairly logical way. We already have zero-sized function types representing every method, but now we would have a way to name them.</p>
<h2 id="con-rust-has-two-namespaces-and-this-is-at-odds-with-that">Con: Rust has two namespaces, and this is at odds with that</h2>
<p>I said that we can add an associated type for every method in the trait — but what do we do if there is an associated type and a method with the same name? Something like this…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">process</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…that would be weird, but it can certainly happen (in fact, I’ve written proc macros that generate code like this because I was too lazy to transform the name of the associated type).</p>
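<p>To see the two namespaces at work, here is a sketch that should compile on current Rust. <code>Counter</code> and <code>demo</code> are made up, and the <code>allow</code> attributes only silence the naming lint:</p>

```rust
#[allow(non_camel_case_types)]
trait Foo {
    type process;
    fn process(&mut self);
}

// Hypothetical implementor.
struct Counter(u32);

#[allow(non_camel_case_types)]
impl Foo for Counter {
    type process = u32;

    fn process(&mut self) {
        self.0 += 1;
    }
}

fn demo() -> (u32, u32) {
    let mut counter = Counter(0);
    // In type position the path names the associated type...
    let seed: <Counter as Foo>::process = 41;
    // ...and in expression position the same path names the method.
    <Counter as Foo>::process(&mut counter);
    (seed, counter.0)
}
```

<p>Calling <code>demo()</code> returns <code>(41, 1)</code>: the same path resolved once as a type and once as a function.</p>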
<p>We have some options here. We could say that we only add an associated type for a method if there isn’t already an explicit associated type with that name. We could make this shadowing illegal in Rust 2024 (but not in earlier Rust editions). Or we could add associated types only for async functions and RPITIT functions, which are not currently possible, and then forbid shadowing in those cases.</p>
<p>Still, fundamentally, this approach of making a method into an associated type is at odds with Rust’s two primary namespaces (types, values), whereas the RTN approach works <em>with</em> those two namespaces.</p>
<h2 id="con-omg-so-verbose-and-so-many-colons">Con: omg so verbose; and so. many. colons.</h2>
<p>The obvious downside of the <code>::Output</code> notation is that it is significantly more verbose to read and write when compared to RTN, and it puts <code>::</code> and <code>:</code> in close proximity (admittedly an existing problem with Rust syntax). Consider:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">where</span><span class="w"> </span><span class="no">HC</span>::<span class="n">check</span><span class="p">()</span>: <span class="nb">Send</span>
</span></span><span class="line"><span class="cl"><span class="c1">// vs
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w"> </span><span class="no">HC</span>::<span class="n">check</span>::<span class="n">Output</span>: <span class="nb">Send</span>
</span></span></code></pre></div><p>RTN also works really well in associated type bound position, but <code>::Output</code> works less well:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">where</span><span class="w"> </span><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="o">&lt;</span><span class="n">check</span><span class="p">()</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// vs
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w"> </span><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="o">&lt;</span><span class="n">check</span>::<span class="n">Output</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><h3 id="but">but…</h3>
<p>…although it must be said that, in practice, <code>check(): Send</code> isn’t the only thing you have to write. That bound only says that the future returned by <code>check()</code> is <code>Send</code>, but in practice you also need <code>HC</code> itself to be <code>Send + 'static</code>. So you would have to write something like…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="no">HC</span>: <span class="nc">HealthCheck</span><span class="o">&lt;</span><span class="n">check</span><span class="p">()</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;static</span><span class="w">
</span></span></span></code></pre></div><p>…and, of course, many traits in practice have a lot more than one method. Consider something like this trait…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Resource</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">get</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">put</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…then you would need to write…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">R</span>: <span class="nc">Resource</span><span class="o">&lt;</span><span class="n">get</span><span class="p">()</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="n">put</span><span class="p">()</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;static</span><span class="w">
</span></span></span></code></pre></div><p>…and that quickly gets tedious. We encountered this in the case studies that we did, which is why the Google folks created a crate that lets you define a trait alias like <code>SendResource</code>, so that <code>R: SendResource</code> says all the above.</p>
<h2 id="con-confusion-between-output">Con: confusion between <code>Output</code></h2>
<p>One interesting point that Yosh raised in our lang team design meeting is that people already have the potential to be confused about whether the <code>Send</code> bound applies to the <em>future returned by the async function</em> or the <em>value you get from awaiting the future</em>; the fact that both <code>FnOnce</code> and <code>Future</code> have an <code>Output</code> associated type could well play into that confusion.</p>
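<p>The two <code>Output</code>s can be seen side by side in current Rust with an ordinary generic function. This is only a sketch: <code>answer</code> is a made-up stand-in for an async method (written as a plain function that returns a future), and <code>awaited_size</code> is a hypothetical helper that never polls anything.</p>

```rust
use std::future::Future;
use std::mem::size_of;

// Stand-in for an async method: a plain function that returns a future.
fn answer() -> std::future::Ready<u32> {
    std::future::ready(42)
}

// Only checks bounds, then reports the size of the awaited value's type.
fn awaited_size<F, Fut>(_f: F) -> usize
where
    F: FnOnce() -> Fut, // `FnOnce::Output` is `Fut`, the future itself
    Fut: Future + Send, // a bound on the future
    Fut::Output: Send,  // a bound on the value produced by awaiting it
{
    size_of::<Fut::Output>()
}
```

<p>Here <code>awaited_size(answer)</code> returns <code>4</code>, the size of the <code>u32</code> that the future eventually produces, while the <code>Send</code> bound on <code>Fut</code> is about the future and says nothing about that <code>u32</code>.</p>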
<p>One thing we discussed is how one would place bounds on the <em>value returned from a future</em> (versus the future itself). Under the higher-ranked projections proposal described in this blog post, the answer is fairly clear. You just write <code>...::Output::Output</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">where</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>::<span class="n">method</span>::<span class="n">Output</span>::<span class="n">Output</span>: <span class="nb">Send</span>
</span></span><span class="line"><span class="cl">    <span class="c1">//         ------  ------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           |       |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           |     Describes value produced by future
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         Describes the future itself.
</span></span></span></code></pre></div><p>For RTN, there are multiple options. One is to use <code>::Output</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">where</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>::<span class="n">method</span><span class="p">()</span>::<span class="n">Output</span>: <span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       --  ------
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       |    |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       |  Describes value produced by future
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       Describes the future itself.
</span></span></span></code></pre></div><p>Another is to &ldquo;double down&rdquo; on the &ldquo;pseudo-expression&rdquo; syntax:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">where</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>::<span class="n">method</span><span class="p">().</span><span class="k">await</span>: <span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       -- -----
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       |    |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       |  Describes value produced by future
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       Describes the future itself.
</span></span></span></code></pre></div><p>We don&rsquo;t have to settle this today, but it&rsquo;s interesting to think about.</p>
<h2 id="pro-building-blocks-first">Pro: Building blocks first?</h2>
<p>I’m torn on this point. Lately I’ve been into the idea of “stabilize the building blocks”. For a mature language like Rust, it is important to work piece by piece. Moreover, thanks to custom derive and procedural macros, people can build really powerful abstractions if they have the building blocks to work with. And it’s sometimes a lot easier to get consensus around the building blocks than the nice syntax on top<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. All of this argues to me for the <code>::Output</code> approach, which feels to me like more of a general purpose building block.</p>
<h3 id="but-1">but…</h3>
<p>On the other hand, the <code>()</code> syntax is itself a building block. But it’s a building block that’s actually nice enough to use in simple cases. We’ve often been reluctant to add new bits of syntax to Rust, and I think that’s generally good, but sometimes I look with envy at other languages that are willing to take bold steps to build designs that are <em>aggressively awesome</em>. I’d like us as a language community to <a href="http://smallcultfollowing.com/babysteps/blog/2022/02/09/dare-to-ask-for-more-rust2024/">dare to ask for more</a>. It’s hard to argue that the <code>::Output</code> syntax is aggressively awesome. The <code>()</code> syntax may not be <em>aggressively</em> awesome (that&rsquo;s probably <a href="http://smallcultfollowing.com/babysteps/blog/2023/03/03/trait-transformers-send-bounds-part-3/">trait transformers</a>), but it&rsquo;s at least mildly awesome.</p>
<h2 id="implementation-notes">Implementation notes</h2>
<p>Right now, the only form of RTN that we have <em>implemented</em> is the “associated type bound” notation, e.g., <code>HealthCheck&lt;check(): Send&gt;</code>. If we add RTN, I think we should also support use in where clauses (e.g., <code>HC::check(): Send</code>) and as a type for local variables (e.g., <code>let x: HC::check() = hc.check(…)</code>), pursuant to the <a href="http://smallcultfollowing.com/babysteps/blog/2022/09/22/rust-2024-the-year-of-everywhere/">“year of everywhere”</a> philosophy, where we try to make Rust notations as uniformly applicable as possible<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. That said, implementing it in those other places is significantly more complicated in the compiler.</p>
<p>The <code>::Output</code> notation, in contrast, doesn’t read especially well as an associated type bound (<code>HealthCheck&lt;check::Output: Send&gt;</code> is kind of O_O to me). I think it works better as a standalone where clause like <code>HC::check::Output: Send</code>. It’s not clear how quickly we can implement that. It should be possible, imo, but it requires more investigation.</p>
<h2 id="conclusion">Conclusion</h2>
<p>There isn’t one yet. My sense is that both the <code>::Output</code> and the RTN approach would work. The <code>::Output</code> approach feels a bit more “primitive”. It can be used with any higher-ranked trait bound, which means it covers slightly more options, although I don&rsquo;t have a compelling example of where you would want it right now. In contrast, RTN feels easier to explain and more accessible to newcomers, and it respects Rust’s “two namespaces” approach. Neither feels like a one-way door: we can start with RTN and then add <code>::Output</code> (in which case, <code>()</code> is a kind of sugar for <code>::Output</code>), and we can start with <code>::Output</code> and then add <code>()</code> as a sugar for it later.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Although not always! I think that <code>-&gt; impl Trait</code> is a good example of where stabilizing the syntax first, and working through the semantics and core primitives over time, has paid off.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Hat tip to TC for bringing up this slogan in the lang team meeting.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Giving, lending, and async closures</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/05/09/giving-lending-and-async-closures/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/05/09/giving-lending-and-async-closures/</id><published>2023-05-09T00:00:00+00:00</published><updated>2023-05-09T11:13:00-04:00</updated><content type="html"><![CDATA[<p>In <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/29/thoughts-on-async-closures/">a previous post on async closures</a>, I <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/29/thoughts-on-async-closures/#conclusion">concluded</a> that the best way to support async closures was with an <code>async</code> trait combinator. I&rsquo;ve had a few conversations since the post and I want to share some additional thoughts. In particular, this post dives into what it would take to make async functions matchable with a type like <code>impl FnMut() -&gt; impl Future&lt;Output = bool&gt;</code>. This takes us down some interesting roads, in particular the distinction between giving and lending traits; it turns out that the closure traits specifically are a bit of a special case in terms of what we can do backwards compatibly, due to their special syntax. Read on!</p>
<h2 id="goal">Goal</h2>
<p>Let me cut to the chase. This article lays out a way that we <em>could</em> support a notation like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take_closure</span><span class="p">(</span><span class="n">x</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">bool</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>It requires some changes to the <code>FnMut</code> trait which, somewhat surprisingly, are backwards compatible I believe. It also requires us to change how we interpret <code>-&gt; impl Trait</code> when in a trait bound (and likely in the value of an associated type); this could be done (over an Edition if necessary) but it introduces some further questions without clear answers.</p>
<p>This blog post itself isn&rsquo;t a real proposal, but it&rsquo;s a useful ingredient to use when discussing the right shape for async closures.</p>
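<p>For comparison, here is a minimal sketch of how a function like this is typically spelled on stable Rust today: the future type gets its own generic parameter. This is not the notation proposed in this post. The names <code>Fut</code>, <code>NoopWake</code>, <code>run_ready</code>, and <code>demo</code> are made up for illustration, and <code>take_closure</code> is given a body and a return value so that there is something to observe.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Stable-Rust spelling: the closure's future type is a second generic
// parameter `Fut` (an illustrative name), so it is one fixed type that
// cannot mention the lifetime of any particular call to the closure.
async fn take_closure<F, Fut>(mut f: F) -> u32
where
    F: FnMut() -> Fut,
    Fut: Future<Output = bool>,
{
    let mut hits = 0;
    for _ in 0..3 {
        if f().await {
            hits += 1;
        }
    }
    hits
}

// Hypothetical helper: a waker that does nothing, so a future that is
// immediately ready can be polled without an async runtime.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls the future once and returns its output if it was already ready.
fn run_ready<T>(fut: impl Future<Output = T>) -> Option<T> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let fut = pin!(fut);
    match fut.poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}

fn demo() -> Option<u32> {
    let mut calls = 0;
    run_ready(take_closure(move || {
        calls += 1;
        // Copy the answer out first: the returned future owns `even`
        // and does not borrow the closure's `calls` counter.
        let even = calls % 2 == 0;
        async move { even }
    }))
}
```

<p>Note that the closure has to compute <code>even</code> before building the future; with this spelling the future cannot hold a reference into the closure&rsquo;s own state.</p>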
<h2 id="giving-traits">Giving traits</h2>
<p>The split between <code>Fn</code> and <code>async Fn</code> turns out to be one instance of a general pattern, which I call &ldquo;giving&rdquo; vs &ldquo;lending&rdquo; traits. In a <em>giving</em> trait, when you invoke its methods, you get back a value that is independent from <code>self</code>.</p>
<p>Let&rsquo;s see an example. The current <code>Iterator</code> trait is a <em>giving</em> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      ^ the lifetime of this reference
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        does not appear in the return type;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        hence &#34;giving&#34;
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In <code>Iterator</code>, each time you invoke <code>next</code>, you get ownership of a <code>Self::Item</code> value (or <code>None</code>). This value is not borrowed from the iterator.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> As a consumer, a giving trait is convenient, because it permits you to invoke <code>next</code> multiple times and keep using the return value afterwards. For example, this function compiles and works for any iterator (<a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=baf23fe5cc1c5182acb3c9760b85ed33">playground</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take_two_v1</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Iterator</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="p">(</span><span class="n">T</span>::<span class="n">Item</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>::<span class="n">Item</span><span class="p">)</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">None</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">j</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">None</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// *Key point:* `i` is still live here, even though we called `next`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// again to get `j`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Some</span><span class="p">((</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="lending-traits">Lending traits</h2>
<p>Whereas a <em>giving</em> trait gives you ownership of the return value, a <em>lending</em> trait is one that returns a value borrowed from <code>self</code>. This pattern is less common, but it certainly appears from time to time. Consider the <a href="https://doc.rust-lang.org/std/convert/trait.AsMut.html"><code>AsMut</code></a> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">AsMut</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">as_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        -             -
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Returns a reference borrowed from `self`.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><code>AsMut</code> takes an <code>&amp;mut self</code> and (thanks to Rust&rsquo;s <a href="https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#lifetime-elision">elision rules</a>) returns an <code>&amp;mut T</code> borrowed from it. As a caller, this means that so long as you use the return value, the <code>self</code> is considered borrowed. Unlike with <code>Iterator</code>, therefore, you can&rsquo;t invoke <code>as_mut</code> twice and keep using both return values (<a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=dc6b05db0d60fea3a6bab9ae622b6350">playground</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">as_mut_two</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">AsMut</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">as_mut</span><span class="p">();</span><span class="w"> </span><span class="c1">// Borrows `t` mutably
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">j</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">as_mut</span><span class="p">();</span><span class="w"> </span><span class="c1">// Error: second mutable borrow
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="c1">// while the first is still live
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">i</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w">            </span><span class="c1">// Use result from first borrow
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="lending-iterators">Lending iterators</h2>
<p>Of course, <code>AsMut</code> is kind of a &ldquo;trivial&rdquo; lending trait. A more interesting one is lending <em>iterators</em><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. A lending iterator is an iterator that returns references into the iterator itself. Typically this is because the iterator has some kind of internal buffer that it uses. Until recently, there was no lending iterator trait because it wasn&rsquo;t even possible to express it in Rust. But with generic associated types (GATs), that changed. It&rsquo;s now possible to express the trait, although there are <a href="https://github.com/rust-lang/rust/issues/92985">borrow checker limitations</a> that block it from being practical<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">LendingIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="o">&lt;</span><span class="na">&#39;this</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">Self</span>: <span class="na">&#39;this</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      ^                        ^^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Unlike `Iterator`, returns a value
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// potentially borrowed from `self`.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As the name suggests, when you use a lending iterator, it is <em>lending</em> values to you; you have to &ldquo;give them back&rdquo; (stop using them) before you can invoke <code>next</code> again. This gives more freedom to the iterator: it has the ability to use an internal mutable buffer, for example. But it takes some flexibility from you as the consumer. For example, the <code>take_two</code> function we saw earlier will not compile with <code>LendingIterator</code> (<a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=3b6c02ae459ccb917a994f8025a45496">playground</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take_two_v2</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">LendingIterator</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="p">(</span><span class="n">T</span>::<span class="n">Item</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>::<span class="n">Item</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">)</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">None</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">j</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">None</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// *Key point:* `i` is still live here, even though we called `next`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// again to get `j`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Some</span><span class="p">((</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="an-aside-inherent-or-accidental-complexity">An aside: Inherent or accidental complexity?</h2>
<p>It seems kind of annoying that <code>Iterator</code> and <code>LendingIterator</code> are two distinct traits. In a GC&rsquo;d language, they wouldn&rsquo;t be. This is a good example of what makes using Rust more complex. On the other hand, it&rsquo;s worth asking, is this <em>inherent</em> or <em>accidental</em> complexity? The answer, I think, is &ldquo;it depends&rdquo;.</p>
<p>For example, I could certainly write an <code>Iterator</code> in Java that makes use of an internal buffer:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="kd">class</span> <span class="nc">Compute</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">implements</span><span class="w"> </span><span class="n">Iterator</span><span class="o">&lt;</span><span class="n">ByteBuffer</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">ByteBuffer</span><span class="w"> </span><span class="n">shared</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ByteBuffer</span><span class="p">.</span><span class="na">allocate</span><span class="p">(</span><span class="n">256</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">ByteBuffer</span><span class="w"> </span><span class="nf">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">mutateSharedBuffer</span><span class="p">())</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">return</span><span class="w"> </span><span class="n">shared</span><span class="p">.</span><span class="na">asReadOnlyBuffer</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="kc">null</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">/// Mutates `shared` and returns true if there is a new value.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">private</span><span class="w"> </span><span class="kt">boolean</span><span class="w"> </span><span class="nf">mutateSharedBuffer</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// ...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Despite the fact that Java has no way to express the concept, this is most definitely a <em>lending iterator</em>. If I try to write a function that invokes <code>next</code> twice, the first value will simply not exist anymore:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="n">Compute</span><span class="w"> </span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">new</span><span class="w"> </span><span class="n">Compute</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">ByteBuffer</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">.</span><span class="na">next</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">ByteBuffer</span><span class="w"> </span><span class="n">b</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">.</span><span class="na">next</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kt">byte</span><span class="w"> </span><span class="n">a0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">a</span><span class="p">.</span><span class="na">get</span><span class="p">();</span><span class="w"> </span><span class="c1">// a has been overwritten with b..</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kt">byte</span><span class="w"> </span><span class="n">b0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">b</span><span class="p">.</span><span class="na">get</span><span class="p">();</span><span class="w"> </span><span class="c1">// ..so `a0 == b0` is always true.</span><span class="w">
</span></span></span></code></pre></div><p>In a case like this, Rust&rsquo;s distinctions are expressing <strong>inherent complexity</strong><sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. If you want to have a shared buffer that you reuse between calls, Java makes it easy to make mistakes. Rust&rsquo;s ownership rules force you to copy out data that you want to keep using, preventing bugs like the one above. Eventually people learn to adopt functional patterns or to clone data instead of sharing access to mutable state. But that requires time and experience, and the compiler and language aren&rsquo;t helping you do so (unless you use, say, Haskell or OCaml or some purely functional language). These kinds of patterns are a good example of why Rust code winds up having that &ldquo;if it compiles, it works&rdquo; feeling, and how the same machinery that guarantees memory safety also prevents logical bugs.</p>
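<p>To make the contrast concrete, here is a minimal sketch of what a Rust analogue of the Java <code>Compute</code> class might look like. The <code>LendingIterator</code> trait is the one shown earlier; the <code>round</code> field, the fill pattern, and the <code>first_two</code> helper are assumptions invented for illustration.</p>

```rust
trait LendingIterator {
    type Item<'this>
    where
        Self: 'this;

    fn next(&mut self) -> Option<Self::Item<'_>>;
}

/// Hypothetical analogue of the Java `Compute` class: a single buffer
/// that is overwritten in place on every call to `next`.
struct Compute {
    shared: Vec<u8>,
    round: u8,
}

impl LendingIterator for Compute {
    // Each item is a view into `shared`, so it borrows from `self`.
    type Item<'this> = &'this [u8]
    where
        Self: 'this;

    fn next(&mut self) -> Option<Self::Item<'_>> {
        if self.round == 3 {
            return None;
        }
        self.round += 1;
        // Stand-in for `mutateSharedBuffer`: overwrite the old contents.
        self.shared.fill(self.round);
        Some(self.shared.as_slice())
    }
}

fn first_two() -> Option<(Vec<u8>, Vec<u8>)> {
    let mut c = Compute { shared: vec![0; 4], round: 0 };
    // `to_vec` copies the bytes out, which ends the borrow of `c`
    // before the next call to `next`.
    let a = c.next()?.to_vec();
    let b = c.next()?.to_vec();
    Some((a, b))
}
```

<p>Dropping the two <code>to_vec</code> calls and keeping both slices alive is rejected by the borrow checker, which is the compile-time counterpart of the Java bug above.</p>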
<h2 id="iterator-as-a-special-case-of-lendingiterator"><code>Iterator</code> as a special case of <code>LendingIterator</code></h2>
<p>OK, so we saw that the <code>Iterator</code> and <code>LendingIterator</code> traits, while clearly related, express an important tradeoff. The <code>Iterator</code> trait declares up front that each <code>Item</code> is independent from the iterator, but the <code>LendingIterator</code> declares that the <code>Item&lt;'_&gt;</code> values returned may be borrowed from the iterator. This affects what fully generic code (like our <code>take_two</code> function) can do.</p>
<p>But note a careful hedge: I said that the <code>LendingIterator</code> trait declares that <code>Item&lt;'_&gt;</code> values <strong>may</strong> be borrowed from the iterator. They don&rsquo;t <strong>have</strong> to be. In fact, every <code>Iterator</code> can be viewed as a <code>LendingIterator</code> (as you can see in this <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=733200eccf9d8d221a589a7db6f3bc85">playground</a>), much like every <code>Fn</code> (which takes an <code>&amp;self</code>) can be viewed as an <code>FnMut</code> (which takes an <code>&amp;mut self</code>). Essentially an <code>Iterator</code> is &ldquo;just&rdquo; a <code>LendingIterator</code> that doesn&rsquo;t happen to make use of the <code>'a</code> argument when defining its <code>Item&lt;'a&gt;</code>.</p>
<p>It&rsquo;s also possible to write a version of <code>take_two</code> that uses <code>LendingIterator</code> but compiles (<a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=359a2c2dab5e9e8c277af73c046bd1f3">playground</a>)<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take_two_v3</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">U</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="p">(</span><span class="n">U</span><span class="p">,</span><span class="w"> </span><span class="n">U</span><span class="p">)</span><span class="o">&gt;</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">LendingIterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">U</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ^^^^^^                             ^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// No matter which `&#39;a` is used, result is always `U`,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// which cannot reference `&#39;a` (after all, `&#39;a` is not
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// in scope when `U` is declared).
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">None</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">j</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="nb">None</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Some</span><span class="p">((</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The key here is the where-clause. It says that <code>T::Item&lt;'a&gt;</code> is always equal to <code>U</code>, no matter what <code>'a</code> is. <strong>In other words, the item that is produced by this iterator is <em>never</em> borrowed from <code>self</code></strong> &ndash; if it were, then its type would include <code>'a</code> somewhere, as that is the lifetime of the reference to the iterator. As a result, <code>take_two_v3</code> compiles successfully. Of course, it also can&rsquo;t be used with <code>LendingIterator</code> values that actually make use of the flexibility the trait is offering them.</p>
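<p>To make the &ldquo;every <code>Iterator</code> is a <code>LendingIterator</code>&rdquo; point concrete, here is a minimal, self-contained sketch. It is not the code behind the playground links above: the <code>LendingIterator</code> trait is written out by hand, and the <code>Giving</code> wrapper is a hypothetical adapter I am making up purely for illustration.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// A hand-written lending iterator trait; an assumption for this sketch,
// not something that exists in the standard library.
trait LendingIterator {
    type Item&lt;&#39;this&gt;
    where
        Self: &#39;this;

    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&lt;&#39;_&gt;&gt;;
}

// Hypothetical adapter: views any ordinary `Iterator` as a `LendingIterator`.
struct Giving&lt;I&gt;(I);

impl&lt;I: Iterator&gt; LendingIterator for Giving&lt;I&gt; {
    // `Item&lt;&#39;this&gt;` ignores `&#39;this`, so nothing is borrowed from `self`.
    type Item&lt;&#39;this&gt; = I::Item
    where
        Self: &#39;this;

    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&lt;&#39;_&gt;&gt; {
        self.0.next()
    }
}

fn take_two_v3&lt;T, U&gt;(t: &amp;mut T) -&gt; Option&lt;(U, U)&gt;
where
    T: for&lt;&#39;a&gt; LendingIterator&lt;Item&lt;&#39;a&gt; = U&gt;,
{
    let Some(i) = t.next() else { return None };
    let Some(j) = t.next() else { return None };
    Some((i, j))
}

fn main() {
    let mut it = Giving(vec![1, 2, 3].into_iter());
    // Both items are held at once, which is fine because neither borrows from `it`.
    assert_eq!(take_two_v3(&amp;mut it), Some((1, 2)));
}
</code></pre><p>The wrapper never mentions <code>'this</code> on the right-hand side of its <code>Item</code> definition, which is exactly what lets it satisfy the <code>for&lt;'a&gt;</code> bound with a single <code>U</code>.</p>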
<h2 id="can-we-unify-iterator-and-lendingiterator">Can we &ldquo;unify&rdquo; <code>Iterator</code> and <code>LendingIterator</code>?</h2>
<p>The fact that every iterator is just a special case of lending iterator raises the question: can they be unified? Jack Huey, in the runup to GATs, spent a while exploring this question, and concluded that it doesn&rsquo;t work. To see why, imagine that we changed <code>Iterator</code> so that it had <code>type Item&lt;'a&gt;</code>, instead of just <code>type Item</code>. It&rsquo;s easy enough to imagine that existing code that says <code>T: Iterator&lt;Item = u32&gt;</code> could be reinterpreted as <code>for&lt;'a&gt; T: Iterator&lt;Item&lt;'a&gt; = u32&gt;</code>, and then it ought to continue compiling. But the scheme doesn&rsquo;t quite work precisely because of examples like <code>take_two_v1</code>:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">fn take_two_v1&lt;T: Iterator&gt;(t: &amp;mut T) -&gt; Option&lt;(T::Item, T::Item)&gt; {...}
</code></pre><p>This signature just says that it takes an <code>Iterator</code>; it doesn&rsquo;t put any additional constraints on it. If we&rsquo;ve modified <code>Iterator</code> to be a lending iterator, then you can&rsquo;t take two items independently. So we would have to have <strong>some</strong> way to say &ldquo;any giving iterator&rdquo; vs &ldquo;any lending iterator&rdquo; &ndash; and if we&rsquo;re going to say those two things, why not make it two distinct traits?</p>
<h2 id="fnmut-is-a-giving-trait"><code>FnMut</code> is a giving trait</h2>
<p>I started off this post talking about async closures, but so far I&rsquo;ve just talked about iterators. What&rsquo;s the connection? Well, for starters, the distinction between sync and async closures is precisely the difference between <em>giving</em> and <em>lending</em> closures.</p>
<p><strong>Sync</strong> closures (at least as defined now) are <strong>giving</strong> traits. Consider a (simplified) view of the <code>FnMut</code> trait as an example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">FnMut</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">args</span>: <span class="nc">A</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      ^                      ^^^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// The `self` reference is independent from the
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// return type.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><code>FnMut</code> returns a <code>Self::Output</code>, just like the giving <code>Iterator</code> returns <code>Self::Item</code>.</p>
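<p>As a quick illustration of what &ldquo;giving&rdquo; buys you for closures, here is a small sketch using today&rsquo;s stable <code>FnMut</code> sugar. The helper <code>call_two</code> is a hypothetical name of my own, playing the same role that <code>take_two</code> played for iterators:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Hypothetical helper, analogous to `take_two` for iterators.
fn call_two&lt;F, R&gt;(mut f: F) -&gt; (R, R)
where
    F: FnMut() -&gt; R,
{
    let a = f(); // the `&amp;mut` borrow of `f` ends as soon as the call returns...
    let b = f(); // ...so a second call is allowed while `a` is still alive.
    (a, b)
}

fn main() {
    let mut counter = 0;
    let pair = call_two(|| {
        counter += 1;
        counter
    });
    assert_eq!(pair, (1, 2));
}
</code></pre><p>Because <code>R</code> is declared outside of any borrow of <code>f</code>, the two results cannot reference the closure&rsquo;s <code>self</code>, and generic code is free to hold both at once.</p>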
<h2 id="fnmut-has-special-syntax"><code>FnMut</code> has special syntax</h2>
<p>You may not be accustomed to seeing the <code>FnMut</code> trait as a regular trait. In fact, on stable Rust, we require you to use special syntax with <code>FnMut</code>. For example, you write <code>impl FnMut(u32) -&gt; bool</code> as a shorthand for <code>FnMut&lt;(u32,), Output = bool&gt;</code>. This is not just for convenience; it&rsquo;s also because we have planned for some time to make changes to the <code>FnMut</code> trait (e.g., to make it variadic, rather than having it take a tuple of argument types), and the special syntax is meant to leave room for that. <strong>Pay attention here:</strong> this special syntax turns out to have an important role.</p>
<h2 id="async-closures-are-a-lending-pattern">Async closures are a lending pattern</h2>
<p><strong>Async</strong> closures are closures that return a future. But that future has to capture <code>self</code>. So that makes them a kind of <strong>lending</strong> trait. Imagine we had a <code>LendingFnMut</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">LendingFnMut</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="o">&lt;</span><span class="na">&#39;this</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">Self</span>: <span class="na">&#39;this</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">args</span>: <span class="nc">A</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      ^                                  ^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Lends data from `self` as part of return value.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we could (not saying we <em>should</em>) express an async closure as a kind of <em>bound</em> on <code>Output</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Imagine we want something like this...
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">x</span>: <span class="nc">async</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">bool</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// ...that is kind of this:
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="n">f</span>: <span class="nc">F</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F</span>: <span class="nc">LendingFnMut</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">F</span>::<span class="n">Output</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">bool</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What is going on here? We are saying first that <code>f</code> is a <em>lending</em> closure that takes no arguments (<code>F: LendingFnMut&lt;()&gt;</code>). Note that we are <strong>not</strong> using the special <code>FnMut</code> sugar here, so this constraint says nothing about the value of <code>Output</code>. Then, in the next where-clause, we are specifying that <code>Output</code> implements <code>Future&lt;Output = bool&gt;</code>. Importantly, we never say what <code>F::Output</code> <em>is</em>. Just that it will implement <code>Future</code>. This means that it <strong>could</strong> include references to <code>self</code> (but it doesn&rsquo;t have to).</p>
<p><strong>Note what just happened</strong>. This is effectively a &ldquo;third option&rdquo; for how to desugar some kind of async closures. In my previous post, I talked about using HKT and about transforming the <code>FnMut</code> trait into an async variant (<code>async FnMut</code>). But here we see that we could also have a <em>lending</em> variant of the trait and then bound the <code>Output</code> of that to implement <code>Future</code>.</p>
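<p>To see the lending behavior in isolation, without any futures or executors involved, here is a small sketch of the imagined <code>LendingFnMut</code> trait with one hand-written implementor. Both the trait and the <code>Appender</code> type are hypothetical; nothing like this exists in the standard library, and a real async closure would lend a future rather than a string slice.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// The imagined trait from above, repeated so the sketch is self-contained.
trait LendingFnMut&lt;A&gt; {
    type Output&lt;&#39;this&gt;
    where
        Self: &#39;this;

    fn call(&amp;mut self, args: A) -&gt; Self::Output&lt;&#39;_&gt;;
}

// Hypothetical &#34;closure&#34; that owns a buffer and lends out a view of it.
struct Appender {
    buf: String,
}

impl LendingFnMut&lt;char&gt; for Appender {
    type Output&lt;&#39;this&gt; = &amp;&#39;this str
    where
        Self: &#39;this;

    fn call(&amp;mut self, c: char) -&gt; Self::Output&lt;&#39;_&gt; {
        self.buf.push(c);
        self.buf.as_str()
    }
}

fn main() {
    let mut f = Appender { buf: String::new() };

    // Copy the lent value out before calling again, much as an async
    // caller would have to await (or drop) the future first.
    let first = f.call(&#39;a&#39;).to_owned();
    let second = f.call(&#39;b&#39;);
    assert_eq!(first, &#34;a&#34;);
    assert_eq!(second, &#34;ab&#34;);

    // Holding two results at once is rejected by the borrow checker:
    // let x = f.call(&#39;c&#39;);
    // let y = f.call(&#39;d&#39;); // error: cannot borrow `f` as mutable more than once
    // println!(&#34;{x}{y}&#34;);
}
</code></pre><p>The commented-out lines are the closure analogue of the <code>take_two</code> problem: with a lending trait, the previous result has to be finished with before the next call.</p>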
<h2 id="closure-syntax-gives-us-more-room-to-maneuver">Closure syntax gives us more room to maneuver</h2>
<p>So, to recap things we have seen:</p>
<ul>
<li>Giving vs lending traits is a fundamental pattern:
<ul>
<li>A giving trait has a return value that <strong>never</strong> borrows from <code>self</code></li>
<li>A lending trait has a return value that <strong>may</strong> borrow from <code>self</code></li>
</ul>
</li>
<li>Giving traits are <em>subtraits</em> of lending traits; i.e., you can view a giving trait as a lending trait that happens not to lend.</li>
<li>We can&rsquo;t convert <code>Iterator</code> to a <em>lending</em> trait &ldquo;in place&rdquo;, because functions that are generic over <code>T: Iterator</code> rely on it being the <em>giving</em> pattern.</li>
<li>Async closures are expressible using a <em>lending</em> variant of <code>FnMut</code>, but not the current trait, which is the <em>giving</em> version.</li>
</ul>
<p>Given the last two points, it might seem logical that we also can&rsquo;t convert <code>FnMut</code> &ldquo;in place&rdquo; to the lending version, and that therefore we have to add some kind of separate trait. In fact, though, this is not true, and the reason is the forced closure syntax. In particular, it&rsquo;s not possible to write a function today that is generic over <code>F: FnMut&lt;A&gt;</code> but doesn&rsquo;t specify a value for the <code>Output</code> associated type. When you write <code>F: FnMut(u32)</code>, you are actually specifying <code>F: FnMut&lt;(u32,), Output = ()&gt;</code>. It <em>is</em> possible to write generic code that talks about <code>F::Output</code>, but that will always be normalizable to something else, because adding the <code>FnMut</code> bound always includes a value for <code>Output</code>.</p>
<p>In principle, then, we could redefine the <code>Output</code> associated type to take a lifetime parameter and change the desugaring for <code>F: FnMut() -&gt; R</code> to be <code>for&lt;'a&gt; F: FnMut&lt;(), Output&lt;'a&gt; = R&gt;</code>. We would also have to make <code>F::Output</code> be legal even without specifying a value for its lifetime parameter; there are a few ways we could do that.</p>
<h2 id="how-to-interpret-impl-trait-in-the-value-of-an-associated-type">How to interpret impl Trait in the value of an associated type</h2>
<p>Let&rsquo;s imagine that we changed the <code>Fn*</code> traits to be lending traits, then. That&rsquo;s still not enough to support our original goal:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take_closure</span><span class="p">(</span><span class="n">x</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">bool</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                 ^^^^
</span></span></span><span class="line"><span class="cl"><span class="c1">// Impl trait is not supported here.
</span></span></span></code></pre></div><p>The problem is that we also have to decide how to desugar <code>impl Trait</code> in this position. The interpretation that we want is not entirely obvious. We could choose to desugar <code>-&gt; impl Future</code> as a bound on the <code>Output</code> type, i.e., to this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take_closure</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span>: <span class="nc">F</span><span class="p">)</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F</span>: <span class="nb">FnMut</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">&lt;</span><span class="n">F</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nb">FnMut</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;&gt;</span>::<span class="n">Output</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">bool</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we did this, then the <code>Output</code> value is permitted to capture <code>'a</code>, and hence we are taking advantage of <code>FnMut</code> being a lending closure. This means that, when we call the closure, we have to await the resulting future before we can call it again, just like we wanted.</p>
<h3 id="complications">Complications</h3>
<p>Interpreting <code>impl Trait</code> this way is a bit tricky. For one thing, it seems inconsistent with how we interpret <code>impl Trait</code> in a parameter like <code>impl Iterator&lt;Item = impl Debug&gt;</code>. Today, that desugars to two fresh parameters <code>&lt;F, G&gt;</code> where <code>F: Iterator&lt;Item = G&gt;, G: Debug</code>. We could probably change that without breaking real world code, since if the associated type is not a GAT I don&rsquo;t think it matters, but we also permit things like <code>impl Iterator&lt;Item = (impl Debug, impl Debug)&gt;</code> that cannot be expressed as bounds. <a href="https://github.com/rust-lang/rust/issues/52662">RFC #2289</a> proposed a new syntax for these sorts of bounds, such that one would write <code>F: Iterator&lt;Item: Debug&gt;</code> to express the same thing. By analogy, one could imagine writing <code>F: FnMut(): Future&lt;Output = bool&gt;</code>, but that&rsquo;s not consistent with the <code>-&gt; impl Future</code> that we see elsewhere. It feels like there&rsquo;s a bit of a tangle of string to sort out here if we try to go down this road, and I worry about winding up with something that is very confusing for end-users (too many subtle variations).</p>
<h2 id="conclusion">Conclusion</h2>
<p>To recap all the points made in this post:</p>
<ul>
<li>Giving vs lending traits is a fundamental pattern:
<ul>
<li>A giving trait has a return value that <strong>never</strong> borrows from <code>self</code></li>
<li>A lending trait has a return value that <strong>may</strong> borrow from <code>self</code></li>
</ul>
</li>
<li>Giving traits are <em>subtraits</em> of lending traits; i.e., you can view a giving trait as a lending trait that happens not to lend.</li>
<li>We can&rsquo;t convert <code>Iterator</code> to a <em>lending</em> trait &ldquo;in place&rdquo;, because functions that are generic over <code>T: Iterator</code> rely on it being the <em>giving</em> pattern.</li>
<li>Async closures are expressible using a <em>lending</em> variant of <code>FnMut</code>, but not the current trait, which is the <em>giving</em> version.</li>
<li>It is possible to modify the <code>Fn*</code> traits to be &ldquo;lending&rdquo; by changing how we desugar <code>F: Fn</code>, but we have to make it possible to write <code>F::Output</code> even when <code>Output</code> has a lifetime parameter (perhaps only if that parameter is statically known not to be used).</li>
<li>We&rsquo;d also have to interpret <code>FnMut() -&gt; impl Future</code> as being a bound on a possibly lent return type, which would be somewhat inconsistent with how <code>Foo&lt;Bar = impl Trait&gt;</code> is interpreted now (which is as a fresh type).</li>
</ul>
<h2 id="hat-tip">Hat tip</h2>
<p>Tip of the hat to Tyler Mandry &ndash; this post is basically a summary of a conversation we had.</p>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>There is a subtle point here. If you are iterating over, say, a <code>&amp;[T]</code> value, then the <code>Item</code> you get back is an <code>&amp;T</code> and hence borrowed. It may seem strange for me to say that you get ownership of the <code>&amp;T</code>. The key point here is that the <code>&amp;T</code> is borrowed <em>from the collection you are iterating over</em> and not <em>from the iterator itself</em>. In other words, from the point of view of the <em>Iterator</em>, it is copying out a <code>&amp;T</code> reference and handing ownership of the reference to you. Owning the reference does not give you ownership of the data it refers to.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Sometimes called &ldquo;streaming&rdquo; iterators.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Not to mention that GATs remain in an &ldquo;MVP&rdquo; state that is rather unergonomic to use; we&rsquo;re working on it!&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Of course, Rust&rsquo;s notations for expressing these distinctions involve some &ldquo;accidental complexity&rdquo; of their own, and you might argue that the cure is worse than the disease. Fair enough.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>This example, by the way, demonstrates the unergonomic state of GAT support. I don&rsquo;t love writing <code>for&lt;'a&gt;</code> all the time.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Fix my blog, please</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/04/03/fix-my-blog-please/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/04/03/fix-my-blog-please/</id><published>2023-04-03T00:00:00+00:00</published><updated>2023-04-03T08:38:00-04:00</updated><content type="html"><![CDATA[<p>It&rsquo;s well known that my blog has some issues. The category links don&rsquo;t work. It renders oddly on mobile. And maybe Safari, too? The Rust snippets are not colored. The RSS feed is apparently not advertised properly in the metadata. It&rsquo;s published via a makefile instead of some hot-rod CI/CD script, and it uses jekyll instead of whatever the new hotness is.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> Being a programmer, you&rsquo;d think I could fix this, but I am intimidated by HTML, CSS, and Github Actions. Hence this call for help: <strong>I&rsquo;d like to hire someone to &ldquo;tune up&rdquo; the blog, a combination of fixing the underlying setup and also the visual layout.</strong> This post will be a rough set of things I have in mind, but I&rsquo;m open to suggestions. If you think you&rsquo;d be up for the job, read on.</p>
<h2 id="desiderata">Desiderata<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></h2>
<p>In short, I am looking for a rad visual designer who also can do the technical side of fixing up my jekyll and CI/CD setup.</p>
<p>Specific work items I have in mind:</p>
<ul>
<li>Syntax highlighting</li>
<li>Make it look great on mobile and Safari</li>
<li>Fix the category links</li>
<li>Add RSS feed into metadata and link it, whatever is normal</li>
<li>CI/CD setup so that when I push or land a PR, it deploys automatically</li>
<li>&ldquo;Tune up&rdquo; the layout, but keep the cute picture!<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
</ul>
<p>Bonus points if you can make the setup easier to duplicate. Installing and upgrading Ruby is a horrible pain and I always forget whether I like rbenv or rubyenv or whatever better. Porting over to Hugo or Zola would likely be awesome, so long as links and content can be preserved. I do use some funky jekyll plugins, though I kind of forgot why. Alternatively maybe something with docker?</p>
<h2 id="current-blog-implementation">Current blog implementation</h2>
<p>The blog is a jekyll blog with a custom theme. Sources are here:</p>
<ul>
<li><a href="https://github.com/nikomatsakis/babysteps">https://github.com/nikomatsakis/babysteps</a></li>
<li><a href="https://github.com/nikomatsakis/nikomatsakis-babysteps-theme">https://github.com/nikomatsakis/nikomatsakis-babysteps-theme</a></li>
</ul>
<p>Deployment is done via rsync <a href="https://github.com/nikomatsakis/babysteps/blob/8820df7df4ac5b888ea8adec95c5449750709d7b/babysteps/Makefile#L18">at present</a>.</p>
<h2 id="interested">Interested?</h2>
<p>Send me an <a href="mailto:niko@alum.mit.edu?subject=babysteps+to+beauty">email</a> with your name, some examples of past work, any recommendations etc, and the rate you charge. Thanks!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>On the other hand, it has that super cute picture of my daughter (from around a decade ago, but still&hellip;). And the content, I like to think, is decent.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I have a soft spot for wacky plurals, and &ldquo;desiderata&rdquo; might be my fave.  I heard it first from a Dave Herman presentation to TC39 and it&rsquo;s been rattling in my brain ever since, wanting to be used.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Ooooh, I always want nice looking tables like those wizards who style github have. How come my tables are always so ugly?&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Thoughts on async closures</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/03/29/thoughts-on-async-closures/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/03/29/thoughts-on-async-closures/</id><published>2023-03-29T00:00:00+00:00</published><updated>2023-03-29T11:41:00-04:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve been thinking about async closures and how they could work once we have static async fn in trait. Somewhat surprisingly to me, I found that async closures are a strong example for where <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/03/trait-transformers-send-bounds-part-3/">async transformers</a> could be an important tool. Let&rsquo;s dive in! We&rsquo;re going to start with the problem, then show why modeling async closures as &ldquo;closures that return futures&rdquo; would require some deep lifetime magic, and finally circle back to how async transformers can make all this &ldquo;just work&rdquo; in a surprisingly natural way.</p>
<h2 id="sync-closures">Sync closures</h2>
<p>Closures are omnipresent in combinator style APIs in Rust. For the purposes of this post, let&rsquo;s dive into a really simple closure function, <code>call_twice_sync</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">call_twice_sync</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="kt">str</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">(</span><span class="s">&#34;Hello&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">(</span><span class="s">&#34;Rustaceans&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As the name suggests, <code>call_twice_sync</code> invokes its argument twice. You might call it from synchronous code like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">call_twice_sync</span><span class="p">(</span><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="n">buf</span><span class="p">.</span><span class="n">push_str</span><span class="p">(</span><span class="n">s</span><span class="p">));</span><span class="w">
</span></span></span></code></pre></div><p>As you might expect, after this code executes, <code>buf</code> will have the value <code>&quot;HelloRustaceans&quot;</code>. <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=1abb09471dad55daa545761fdd80d71e">(Playground link, if you&rsquo;re curious to try it out.)</a></p>
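<p>For reference, here is the same example as one self-contained snippet. The wrapper name <code>collect_greeting</code> is made up for illustration; the rest mirrors the code above.</p>

```rust
// Calls the closure twice, exactly as in the post.
fn call_twice_sync(mut op: impl FnMut(&str)) {
    op("Hello");
    op("Rustaceans");
}

// Hypothetical wrapper (not from the post) that returns the buffer
// so the result can be inspected.
fn collect_greeting() -> String {
    let mut buf = String::new();
    // The closure mutably borrows `buf` only for the duration of the call.
    call_twice_sync(|s| buf.push_str(s));
    buf
}
```

<p>Calling <code>collect_greeting()</code> yields the concatenation of the two strings, because the closure&rsquo;s borrow of <code>buf</code> ends as soon as <code>call_twice_sync</code> returns.</p>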
<h2 id="async-closures-as-closures-that-return-futures">Async closures as closures that return futures</h2>
<p>Suppose we want to allow the closure to do async operations, though. That won&rsquo;t work with <code>call_twice_sync</code> because the closure is a synchronous function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">call_twice_sync</span><span class="p">(</span><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="n">buf</span><span class="p">.</span><span class="n">push_str</span><span class="p">(</span><span class="n">receive_message</span><span class="p">().</span><span class="k">await</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                                 ----- ERROR
</span></span></span></code></pre></div><p>Given that an async function is just a sync function that returns a future, perhaps we can model an async closure as a sync closure that returns a future? Let&rsquo;s try it.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">call_twice_async</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">(</span><span class="s">&#34;Hello&#34;</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">(</span><span class="s">&#34;Rustaceans&#34;</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=ce12829f92ffcd55b9582db42f2172b5">This compiles</a>. So far so good. Now let&rsquo;s try using it. For now we won&rsquo;t even use an await, just the same sync code we tried before:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Hint: won&#39;t compile
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">use_it</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">call_twice_async</span><span class="p">(</span><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">buf</span><span class="p">.</span><span class="n">push_str</span><span class="p">(</span><span class="n">s</span><span class="p">);</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                   ----- Return a future
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Wait, what&rsquo;s this? Lo and behold, we get an error, and a kind of intimidating one:</p>
<pre tabindex="0"><code>error: captured variable cannot escape `FnMut` closure body
  --&gt; src/lib.rs:13:26
   |
12 |     let mut buf = String::new();
   |         ------- variable defined here
13 |     call_twice_async(|s| async { buf.push_str(s); });
   |                        - ^^^^^^^^---^^^^^^^^^^^^^^^
   |                        | |       |
   |                        | |       variable captured here
   |                        | returns an `async` block that contains a reference to a captured variable, which then escapes the closure body
   |                        inferred to be a `FnMut` closure
   |
   = note: `FnMut` closures only have access to their captured variables while they are executing...
   = note: ...therefore, they cannot allow references to captured variables to escape
</code></pre><p>So what is this all about? The last two lines actually tell you, but to really see it you have to do a bit of desugaring.</p>
<h2 id="futures-capture-the-data-they-will-use">Futures capture the data they will use</h2>
<p>The closure tries to construct a future with an <code>async</code> block. This async block is going to capture a reference to all the variables it needs: in this case, <code>s</code> and <code>buf</code>. So the closure will become something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="n">MyAsyncBlockType</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">buf</span><span class="p">,</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>where <code>MyAsyncBlockType</code> implements <code>Future</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyAsyncBlockType</span><span class="o">&lt;</span><span class="na">&#39;b</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="na">&#39;b</span> <span class="nc">mut</span><span class="w"> </span><span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">s</span>: <span class="kp">&amp;</span><span class="na">&#39;b</span> <span class="kt">str</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Future</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyAsyncBlockType</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">poll</span><span class="p">(</span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>The key point here is that the closure is returning a struct (<code>MyAsyncBlockType</code>) and this struct is holding on to a reference to both <code>buf</code> and <code>s</code> so that it can use them when it is awaited.</strong></p>
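<p>To make that concrete, here is a hand-written sketch of such a future. The real type generated for an <code>async</code> block is anonymous and its <code>poll</code> is compiler-generated, so the body below, the <code>NoopWake</code> helper, and the <code>poll_once</code> function are all assumptions made for illustration.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Stand-in for the anonymous type the compiler generates for the async block.
struct MyAsyncBlockType<'b> {
    buf: &'b mut String,
    s: &'b str,
}

impl Future for MyAsyncBlockType<'_> {
    type Output = ();

    // The captured references are only *used* here, when the future is polled,
    // which is why they must stay valid after the closure has returned.
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut(); // fine: the struct only holds references, so it is Unpin
        this.buf.push_str(this.s);
        Poll::Ready(())
    }
}

// Hypothetical waker that does nothing; enough to poll a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical driver: builds the future, polls it once, returns the buffer.
fn poll_once() -> String {
    let mut buf = String::new();
    let mut fut = MyAsyncBlockType { buf: &mut buf, s: "Hello" };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let done = Pin::new(&mut fut).poll(&mut cx).is_ready();
    assert!(done);
    buf
}
```

<p>Constructing the struct does no work by itself; the push happens inside <code>poll</code>, long after the point where the references were captured.</p>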
<h2 id="closure-signature-promises-to-be-finished">Closure signature promises to be finished</h2>
<p>The problem is that the <code>FnMut</code> closure signature actually promises something different than what the body does. The <em>signature</em> says that it takes an <code>&amp;str</code> &ndash; this means that the closure is allowed to use the string while it executes, but it cannot hold on to a reference to the string and use it later. The same is true for <code>buf</code>, which will be accessible through the implicit <code>self</code> argument of the closure. But when the closure returns the future, it is trying to create references to <code>buf</code> and <code>s</code> that outlive the closure itself! This is why the error message says:</p>
<pre tabindex="0"><code>= note: `FnMut` closures only have access to their captured variables while they are executing...
= note: ...therefore, they cannot allow references to captured variables to escape
</code></pre><p>This is a problem!</p>
<h2 id="add-some-lifetime-arguments">Add some lifetime arguments?</h2>
<p>So maybe we can declare the fact that we hold on to the data? It turns out you <em>almost</em> can, but not quite, and making an async closure be &ldquo;just&rdquo; a sync closure that returns a future would require some rather fundamental extensions to Rust&rsquo;s trait system. There are two variables to consider, <code>buf</code> and <code>s</code>. Let&rsquo;s begin with the argument <code>s</code>.</p>
<h2 id="an-aside-impl-trait-capture-rules">An aside: impl Trait capture rules</h2>
<p>Before we dive more deeply into the closure case, let&rsquo;s back up and imagine a top-level function that returns a future:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">push_buf</span><span class="p">(</span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">s</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">buf</span><span class="p">.</span><span class="n">push_str</span><span class="p">(</span><span class="n">s</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If you try to compile this code, you&rsquo;ll find that it does not build (<a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=75fe203778735be39418daa3d1b1eb0b">playground</a>):</p>
<pre tabindex="0"><code>error[E0700]: hidden type for `impl Future&lt;Output = ()&gt;` captures lifetime that does not appear in bounds
 --&gt; src/lib.rs:4:5
  |
3 |   fn push_buf(buf: &amp;mut String, s: &amp;str) -&gt; impl Future&lt;Output = ()&gt; {
  |                    ----------- hidden type `[async block@src/lib.rs:4:5: 6:6]` captures the anonymous lifetime defined here
4 | /     async move {
5 | |         buf.push_str(s);
6 | |     }
  | |_____^
  |
help: to declare that `impl Future&lt;Output = ()&gt;` captures `&#39;_`, you can introduce a named lifetime parameter `&#39;a`
  |
3 | fn push_buf&lt;&#39;a&gt;(buf: &amp;&#39;a mut String, s: &amp;&#39;a str) -&gt; impl Future&lt;Output = ()&gt; + &#39;a  {
  |            ++++       ++                 ++                                  ++++
</code></pre><p><code>impl Trait</code> values can only capture borrowed data if they explicitly name the lifetime. This is why the suggested fix is to use a named lifetime <code>'a</code> for <code>buf</code> and <code>s</code> and declare that the <code>Future</code> captures it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">push_buf</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">(</span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="nc">mut</span><span class="w"> </span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">s</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="w"> 
</span></span></span></code></pre></div><p>If you desugar this return position impl trait into an explicit type alias impl trait, you can see the captures more clearly, as they become parameters to the <code>type</code>. The original (no captures) would be:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">PushBuf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">push_buf</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">(</span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="nc">mut</span><span class="w"> </span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">s</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">PushBuf</span><span class="w">
</span></span></span></code></pre></div><p>and the fixed version would be:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">PushBuf</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">push_buf</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">(</span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="nc">mut</span><span class="w"> </span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">s</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">PushBuf</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><h2 id="from-functions-to-closures">From functions to closures</h2>
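<p>Before turning to closures, here is the fixed <code>push_buf</code> signature exercised end to end, as a minimal sketch. The <code>block_on</code> helper, the <code>NoopWake</code> type, and <code>demo_push_buf</code> are hypothetical names added for illustration; a real program would use an executor from an async runtime instead of this busy-polling loop.</p>

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The fixed signature: the returned future is declared to capture `'a`.
fn push_buf<'a>(buf: &'a mut String, s: &'a str) -> impl Future<Output = ()> + 'a {
    async move {
        buf.push_str(s);
    }
}

// Hypothetical no-op waker, only needed to build a `Context`.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical minimal executor: polls in a loop until the future is ready.
// Adequate here because these futures never actually suspend.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn demo_push_buf() -> String {
    let mut buf = String::new();
    // Each future borrows `buf` and the string only until it completes.
    block_on(push_buf(&mut buf, "Hello"));
    block_on(push_buf(&mut buf, "Rustaceans"));
    buf
}
```

<p>Because each returned future is tied to <code>'a</code>, the borrow of <code>buf</code> ends when that future is dropped, which is what lets the second call borrow it again.</p>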
<p>OK, so we just saw how we can define a function that returns an <code>impl Future</code>, how that future will wind up capturing the arguments, and how that is made explicit in the return type by references to a named lifetime <code>'a</code>. We could do something similar for closures, although Rust&rsquo;s rather limited support for explicit closure syntax makes it awkward. I&rsquo;ll use the unimplemented syntax from <a href="https://github.com/rust-lang/rfcs/pull/3216">RFC 3216</a>, you can <a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2021&amp;gist=7a06bc923e23d187fc1cf8db3af50af1">see the workaround on the playground</a> if that&rsquo;s your thing:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">PushBuf</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">test</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">|</span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="nc">mut</span><span class="w"> </span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">s</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">str</span><span class="o">|</span><span class="w"> </span>-&gt; <span class="nc">PushBuf</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">buf</span><span class="p">.</span><span class="n">push_str</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">c</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="p">,</span><span class="w"> </span><span class="s">&#34;foo&#34;</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>(Side note that this is an interesting case for the <a href="https://github.com/rust-lang/rust/issues/107645">&ldquo;currently under debate&rdquo; rules around defining type alias impl trait</a>.)</p>
<h2 id="now-for-the-hammer">Now for the HAMMER</h2>
<p>OK, so far so grody, but we&rsquo;ve shown that indeed you <em>could</em> define a closure that returns a future and it seems like things would work. But now comes the problem. Let&rsquo;s take a look at the <code>call_twice_async</code> function &ndash; i.e., instead of looking at where the closure is defined, we look at the function that takes the closure as argument. That&rsquo;s where things get tricky.</p>
<p>Here is <code>call_twice_async</code>, but with the anonymous lifetime given an explicit name <code>'a</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">call_twice_async</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">F</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span></code></pre></div><p>Now the problem is this: we need to declare that the future which is returned (<code>F</code>) might capture <code>'a</code>. But <code>F</code> is declared in an outer scope, and it can&rsquo;t name <code>'a</code>. In other words, right now, the return type <code>F</code> of the closure <code>op</code> must be the same each time the closure is called, but to get the semantics we want, we need the return type to include a different value for <code>'a</code> each time.</p>
<p>If Rust had higher-kinded types (HKT), you could do something a bit wild, like this&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">call_twice_async</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                  ----- HKT
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">F</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span></code></pre></div><p>but, of course, we <em>don&rsquo;t</em> have HKT (and, cool as they are, I don&rsquo;t think that&rsquo;s a good fit for Rust right now, it would bust our complexity barrier in my opinion and then some without near enough payoff).</p>
<p>Short of adding HKT or some equivalent, I believe the only workaround is to use a <code>dyn</code> type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">call_twice_async</span><span class="p">(</span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>This works today (and it is, for example, what <a href="https://github.com/nikomatsakis/moro/blob/6aa675e4b1676e21291296687f1d9ff40984b866/src/lib.rs#L145">moro does</a> to resolve exactly this problem). Of course, that means that the closure has to allocate a box, instead of just returning an <code>async move</code> block. As a general solution, that&rsquo;s a non-starter.</p>
<p>So we&rsquo;re kind of stuck. As far as I can tell, modeling async closures as &ldquo;normal closures that happen to return futures&rdquo; requires one of two unappealing options:</p>
<ul>
<li>extend the language with HKT, or with some syntactic sugar that ultimately desugars to HKT</li>
<li>use <code>Box&lt;dyn&gt;</code> everywhere, giving up on zero cost futures, embedded use cases, etc.</li>
</ul>
<h2 id="more-traits-less-problems">More traits, less problems</h2>
<p>But wait, there is another way. Instead of modeling async closures using the normal <code>Fn</code> traits, we could define some <em>async</em> closure traits. To keep our life simple, let&rsquo;s just look at one, for <code>FnMut</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncFnMut</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">args</span>: <span class="nc">A</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is identical to the sync <code>FnMut</code> trait, except that <code>call</code> is an <code>async fn</code>. But that&rsquo;s a pretty important difference. If we desugar the <code>async fn</code> to one using <code>impl Trait</code>, and then to GATs, we can start to see why:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncFnMut</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Call</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Output</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;a</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">args</span>: <span class="nc">A</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Call</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Notice the Generic Associated Type (GAT) <code>Call</code>. GATs are basically the Rusty way to do HKTs (if you want to go deeper, I <a href="https://smallcultfollowing.com/babysteps/blog/2016/11/02/associated-type-constructors-part-1-basic-concepts-and-introduction/">wrote</a> <a href="https://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/">a</a> <a href="https://smallcultfollowing.com/babysteps/blog/2016/11/04/associated-type-constructors-part-3-what-higher-kinded-types-might-look-like/">comparison</a> <a href="https://smallcultfollowing.com/babysteps/blog/2016/11/09/associated-type-constructors-part-4-unifying-atc-and-hkt/">series</a> which may help; back then we called them associated type constructors, not GATs). <strong>Essentially what has happened here is that we moved the &ldquo;HKT&rdquo; into the trait definition itself, instead of forcing the caller to have it.</strong></p>
<p>Given this definition, when we try to write the &ldquo;call twice async&rdquo; function, things work out more smoothly:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">call_twice_async</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">AsyncFnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="kt">str</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s">&#34;Hello&#34;</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s">&#34;World&#34;</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2021&amp;gist=082006281e7b30f112c16e2a4b1d334c">Try it out on the playground, though note that we don&rsquo;t actually support the <code>()</code> sugar for arbitrary traits, so I wrote <code>impl for&lt;'a&gt; AsyncFnMut&lt;&amp;'a str, Output = ()&gt;</code> instead.</a></p>
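<p>To make the GAT version concrete, here is a minimal sketch that should compile on stable Rust, with the &ldquo;closure&rdquo; written out by hand as a struct. The names <code>Appender</code>, <code>Append</code>, <code>noop_waker</code>, and <code>block_on</code> are hypothetical helpers made up for this illustration, and I have dropped the <code>+ 'a</code> bound on <code>Call</code> so that the returned future can borrow from the argument as well as from <code>self</code>. Treat it as an illustration of the shape, not as the playground code linked above.</p>

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hand-desugared async closure trait: the GAT `Call` carries the borrow of `self`.
trait AsyncFnMut<A> {
    type Output;
    type Call<'s>: Future<Output = Self::Output>
    where
        Self: 's;

    fn call(&mut self, args: A) -> Self::Call<'_>;
}

/// Hypothetical stand-in for the closure `async |s| buf.push_str(s)`.
struct Appender<'b> {
    buf: &'b mut String,
}

/// The future returned by `call`; it borrows the buffer and the argument.
struct Append<'s, 'a> {
    buf: &'s mut String,
    arg: &'a str,
}

impl Future for Append<'_, '_> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        // `Append` only holds references, so it is `Unpin` and `get_mut` is allowed.
        let this = self.get_mut();
        this.buf.push_str(this.arg);
        Poll::Ready(())
    }
}

impl<'a, 'b> AsyncFnMut<&'a str> for Appender<'b> {
    type Output = ();
    type Call<'s> = Append<'s, 'a> where Self: 's;

    fn call(&mut self, arg: &'a str) -> Self::Call<'_> {
        // Reborrow the captured buffer for as long as the future lives.
        Append { buf: &mut *self.buf, arg }
    }
}

/// Same shape as in the post, spelled without the `()` sugar.
async fn call_twice_async(mut op: impl for<'a> AsyncFnMut<&'a str, Output = ()>) {
    op.call("Hello").await;
    op.call("World").await;
}

/// Builds a waker that does nothing; enough for futures that never park.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // SAFETY: every vtable function ignores the (null) data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

/// Tiny busy-polling executor, only meant for this example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

<p>Driving it looks like <code>block_on(call_twice_async(Appender { buf: &amp;mut buf }))</code>. The point to notice is that no <code>Box</code> appears anywhere: the future type is named through <code>Call</code> in the trait, so the caller never has to write a higher-kinded bound.</p>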
<h2 id="connection-to-trait-transformers">Connection to trait transformers</h2>
<p>The translation between the normal <code>FnMut</code> trait and the <code>AsyncFnMut</code> trait was pretty automatic. The only thing we did was change the &ldquo;call&rdquo; function to <code>async</code>. So what if we had an <a href="https://smallcultfollowing.com/babysteps/blog/2023/03/03/trait-transformers-send-bounds-part-3/">async trait transformer</a>, as was discussed earlier? Then we only have one &ldquo;maybe async&rdquo; trait, <code>FnMut</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">FnMut</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">args</span>: <span class="nc">A</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we can write <code>call_twice</code> either sync or async, as we like, and the code is virtually identical. The only difference is that I write <code>impl FnMut</code> for sync or <code>impl async FnMut</code> for async:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">call_twice_sync</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="kt">str</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s">&#34;Hello&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s">&#34;World&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">call_twice_async</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="kt">str</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s">&#34;Hello&#34;</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">op</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s">&#34;World&#34;</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Of course, with a more general maybe-async design, we might just write this function once, but that&rsquo;s a separate concern. Right now I&rsquo;m only concerned with the idea of authoring traits that can be used in two modes, but not necessarily with writing code that is generic over which mode is being used.</p>
<h2 id="final-note-creating-the-closure-in-a-maybe-async-world">Final note: creating the closure in a maybe-async world</h2>
<p>When calling <code>call_twice</code>, we could write <code>|s| buf.push_str(s)</code> or <code>async |s| buf.push_str(s)</code> to indicate which traits it implements, but we could also infer this from context. We already do similar inference to decide the type of <code>s</code> for example. In fact, we could have some blanket impls, so that every <code>F: FnMut</code> also implements <code>F: async FnMut</code>; I guess this is generally true for any trait.</p>
<h2 id="conclusion">Conclusion</h2>
<p>My conclusions:</p>
<ul>
<li>Nothing in this discussion required or even suggested any changes to the underlying design of async fn in trait. Stabilizing the statically dispatched subset of async fn in trait should be forwards compatible with supporting async closures. &#x1f389;</li>
<li>The &ldquo;higher-kinded-ness&rdquo; of async closures has to go somewhere. In stabilizing GATs, in my view, we&rsquo;ve committed to the path that it should go into the trait definition (vs HKT, which would push it to the use site). The standard &ldquo;def vs use site&rdquo; tradeoffs apply here, I think: def sites often feel simpler and easier to understand, but are less flexible. I think that&rsquo;s fine.</li>
<li>Async trait transformers feel like a great option here that makes async closures work just like you would expect.</li>
</ul>
]]></content></entry><entry><title type="html">Must move types</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/03/16/must-move-types/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/03/16/must-move-types/</id><published>2023-03-16T00:00:00+00:00</published><updated>2023-03-16T18:32:00-04:00</updated><content type="html"><![CDATA[<p>Rust has lots of mechanisms that prevent you from doing something bad. But, right now, it has NO mechanisms that force you  to do something <em>good</em><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. I’ve been thinking lately about what it would mean to add “must move” types to the language. This is an idea that I’ve long resisted, because it represents a fundamental increase to complexity. But lately I’m seeing more and more problems that it would help to address, so I wanted to try and think what it might look like, so we can better decide if it&rsquo;s a good idea.</p>
<h2 id="must-move">Must move?</h2>
<p>The term ‘must move’ type is not standard. I made it up. The more usual name in PL circles is a “linear” type, which means a value that must be used exactly once. The idea of a <em>must move</em> type <code>T</code> is that, if some function <code>f</code> has a value <code>t</code> of type <code>T</code>, then <code>f</code> <em>must move</em> <code>t</code> before it returns (modulo panic, which I discuss below). Moving <code>t</code> can mean either calling some other function that takes ownership of <code>t</code>, returning it, or — as we’ll see later — destructuring it via pattern matching.</p>
<p>Here are some examples of functions that <em>move</em> the value <code>t</code>. You can return it…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">return_it</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">t</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…call a function that takes ownership of it…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">send_it</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">channel</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">t</span><span class="p">);</span><span class="w"> </span><span class="c1">// takes ownership of `t`
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…or maybe call a constructor function that takes ownership of it (which would usually mean you must “recursively” move the result)…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">return_opt</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Some</span><span class="p">(</span><span class="n">t</span><span class="p">)</span><span class="w"> </span><span class="c1">// moves t into the option
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="doesnt-rust-have-linear-types-already">Doesn’t Rust have “linear types” already?</h2>
<p>You may have heard that Rust’s ownership and borrowing is a form of “linear types”. That’s not really true. Rust has <em>affine types</em>, which means a value that can be moved <em>at most</em> once. But we have nothing that forces you to move a value. For example, I can write the <code>consume</code> function in Rust today:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* look ma, no .. nothin&#39; */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This function takes a value <code>t</code> of (almost, see below) any type <code>T</code> and…does nothing with it. This is not possible with linear types. If <code>T</code> were <em>linear</em>, we would have to do <em>something</em> with <code>t</code> — e.g., move it somewhere. This is why I call linear types <em>must move</em>.</p>
<h2 id="what-about-the-destructor">What about the destructor?</h2>
<p>“Hold up!”, you’re thinking, “<code>consume</code> doesn’t actually do <em>nothing</em> with <code>t</code>. It drops <code>t</code>, executing its destructor!” Good point. That’s true. But <code>consume</code> isn’t actually required to execute the destructor; you can always use <code>forget</code> to avoid it<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">forget</span><span class="p">(</span><span class="n">t</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If it weren’t possible to “forget” values, destructors would mean that Rust had a linear system, but even then, it would only be in a technical sense. In particular, destructors would be a required action, but of a limited form: they can’t, for example, take arguments. Nor can they be async.</p>
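<p>Here is a small sketch of that difference in today’s Rust. The <code>Noisy</code> type and the two function names are made up for the illustration; the only library call involved is <code>std::mem::forget</code>. One version of <code>consume</code> lets the value drop, so the destructor runs; the other forgets it, so the destructor never runs.</p>

```rust
use std::cell::Cell;

/// Hypothetical type whose destructor bumps a shared counter.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Does "nothing" with `t`, which in practice means dropping it on return.
fn consume<T>(_t: T) {}

/// Takes ownership of `t` but skips the destructor entirely.
fn consume_and_forget<T>(t: T) {
    std::mem::forget(t);
}
```

<p>Calling <code>consume</code> on a <code>Noisy</code> increments the counter once, while calling <code>consume_and_forget</code> leaves it untouched. Both are accepted today, which is exactly why affine is the better word for what Rust has.</p>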
<h2 id="what-about-sized">What about <code>Sized</code>?</h2>
<p>There is one other detail about the <code>consume</code> function worth mentioning. When I write <code>fn consume&lt;T&gt;(t: T)</code>, that is actually <em>shorthand</em> for saying “any type <code>T</code> that is <code>Sized</code>”. In other words, the fully elaborated “do nothing with a value” function looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Sized</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">forget</span><span class="p">(</span><span class="n">t</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If you don’t want this default <code>Sized</code> bound, you write <code>T: ?Sized</code>. The leading <code>?</code> means “maybe Sized”; i.e., now <code>T</code> can be any type, whether it be sized (e.g., <code>u32</code>) or unsized (e.g., <code>[u32]</code>).</p>
<p><strong>This is important:</strong> a where-clause like <code>T: Foo</code> <em>narrows</em> the set of types that <code>T</code> can be, since now it <em>must</em> be a type that implements <code>Foo</code>. The “maybe” where-clause <code>T: ?Sized</code> (we don’t accept other traits here) <em>broadens</em> the set of types that <code>T</code> can be, by removing default bounds.</p>
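<p>A quick sketch of the narrowing and broadening at work, using two made-up helper functions around the real <code>std::mem::size_of_val</code>. The first has the default <code>Sized</code> bound; the second opts out with <code>?Sized</code> and therefore accepts strictly more types.</p>

```rust
use std::mem::size_of_val;

/// Default bounds: `T` is implicitly `Sized`.
fn size_of_sized<T>(t: &T) -> usize {
    size_of_val(t)
}

/// `?Sized` removes the default bound, so `T` may also be `[u32]` or `str`.
fn size_of_maybe_sized<T: ?Sized>(t: &T) -> usize {
    size_of_val(t)
}

// `size_of_sized::<[u32]>` would be rejected by the compiler, because a
// slice type does not implement `Sized`.
```

<p>Both functions accept a <code>&amp;u32</code>, but only <code>size_of_maybe_sized</code> accepts a <code>&amp;[u32]</code> or a <code>&amp;str</code>.</p>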
<h2 id="so-how-would-must-move-work">So how would “must move” work?</h2>
<p>You might imagine that we could encode “must move” types via a new kind of bound, e.g., <code>T: MustMove</code>. But that’s actually backwards. The problem is that “must move” types are actually a superset of ordinary types — after all, if you have an ordinary type, it’s still ok to write a function that always moves it. But it’s <em>also</em> ok to have a function that drops it or forgets it. In contrast, with a “must move” type, the only option is to move it. <strong>This implies that what we want is a <code>?</code> bound, not a normal bound.</strong></p>
<p>The notation I propose is <code>?Drop</code>. The idea is that, by default, every type parameter <code>D</code> is assumed to be <em>droppable</em>, meaning that you can always choose to drop it at any point. But a <code>M: ?Drop</code> parameter is <em>not necessarily droppable</em>. You must ensure that a value of type <code>M</code> is moved somewhere else.</p>
<p>Let’s see a few examples to get the idea of it. To start, the <code>identity</code> function, which just returns its argument, could be declared with <code>?Drop</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">identity</span><span class="o">&lt;</span><span class="n">M</span>: <span class="o">?</span><span class="nb">Drop</span><span class="o">&gt;</span><span class="p">(</span><span class="n">m</span>: <span class="nc">M</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">M</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">m</span><span class="w"> </span><span class="c1">// OK — moving `m` to the caller
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But the <code>consume</code> function could not:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume</span><span class="o">&lt;</span><span class="n">M</span>: <span class="o">?</span><span class="nb">Drop</span><span class="o">&gt;</span><span class="p">(</span><span class="n">m</span>: <span class="nc">M</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ERROR: `m` is not moved.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You might think that the version of <code>consume</code> which calls <code>mem::forget</code> is sound; after all, <code>forget</code> is declared like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">forget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* compiler magic to avoid dropping */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Therefore, if <code>consume</code> were to call <code>forget(m)</code>, wouldn’t that count as a move? The answer is yes, it would, but we <em>still</em> get an error. This is because <code>forget</code> is not declared with <code>?Drop</code>, and therefore there is an implicit <code>T: Drop</code> where-clause:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume</span><span class="o">&lt;</span><span class="n">M</span>: <span class="o">?</span><span class="nb">Drop</span><span class="o">&gt;</span><span class="p">(</span><span class="n">m</span>: <span class="nc">M</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">forget</span><span class="p">(</span><span class="n">m</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR: `forget` requires `M: Drop`, which isn’t known to hold.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="declaring-types-to-be-drop">Declaring types to be <code>?Drop</code></h2>
<p>Under this scheme, all structs and types you declare would be droppable by default. If you don’t implement <code>Drop</code> explicitly, the compiler adds an automatic <code>Drop</code> impl for you that just recursively drops your fields. But you could explicitly declare your type to be <code>?Drop</code> by using a <a href="https://github.com/rust-lang/rust/issues/68318">negative impl</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Guard</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="kt">u32</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="o">!</span><span class="nb">Drop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Guard</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When you do this, the type becomes “must move” and any function which has a value of type <code>Guard</code> must move it somewhere else. You might wonder, then, how you ever terminate: the answer is that one way to “move” the value is to unpack it with a pattern. For example, <code>Guard</code> might declare a <code>log</code> method:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Guard</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">log</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">message</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">Guard</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">;</span><span class="w"> </span><span class="c1">// moves `self`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{value} = {message}&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This plays nicely with privacy: if your type has private fields, only functions within its module will be able to destructure it; everyone else must (eventually) discharge their obligation to move by invoking some function within your module.</p>
<h2 id="interactions-between-must-move-and-control-flow">Interactions between “must move” and control-flow</h2>
<p>Must move values interact with control-flow like <code>?</code>. Consider the <code>Guard</code> type from the previous section, and imagine I have a function like this one…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">execute</span><span class="p">(</span><span class="n">t</span>: <span class="nc">Guard</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">std</span>::<span class="n">io</span>::<span class="n">Error</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">s</span>: <span class="nb">String</span> <span class="o">=</span><span class="w"> </span><span class="n">read_file</span><span class="p">(</span><span class="s">&#34;message.txt&#34;</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w">  </span><span class="c1">// ERROR: `t` is not moved on error
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">t</span><span class="p">.</span><span class="n">log</span><span class="p">(</span><span class="o">&amp;</span><span class="n">s</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Ok</span><span class="p">(())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This code would not compile. The problem is that the <code>?</code> applied to the result of <code>read_file</code> may return early with an <code>Err</code> result, in which case the call to <code>t.log</code> would not execute! This is a good error, in the sense that it is helping us ensure that the <code>log</code> method on <code>Guard</code> is invoked, but you can imagine similar friction with other forms of early exit. To fix the error, you should do something like this…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">execute</span><span class="p">(</span><span class="n">t</span>: <span class="nc">Guard</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">std</span>::<span class="n">io</span>::<span class="n">Error</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">read_file</span><span class="p">(</span><span class="s">&#34;message.txt&#34;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Ok</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">t</span><span class="p">.</span><span class="n">log</span><span class="p">(</span><span class="o">&amp;</span><span class="n">s</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Ok</span><span class="p">(())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Err</span><span class="p">(</span><span class="n">e</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">t</span><span class="p">.</span><span class="n">log</span><span class="p">(</span><span class="s">&#34;error&#34;</span><span class="p">);</span><span class="w"> </span><span class="c1">// now `t` is moved
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Err</span><span class="p">(</span><span class="n">e</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Of course, you could also opt to pass back the <code>t</code> value to the caller, making it their problem.</p>
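<p>As a rough sketch of that last option in today’s Rust (where <code>Guard</code> is an ordinary struct and nothing is enforced by the compiler), the error type can carry the guard back to the caller. The <code>path</code> parameter, the stubbed <code>read_file</code>, and the <code>String</code> returned from <code>log</code> are assumptions made so the example runs on its own:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::io;

/// Stand-in for the `Guard` type; `impl !Drop` is not expressible today,
/// so this is an ordinary struct and the obligation is only a convention.
pub struct Guard {
    value: String,
}

impl Guard {
    pub fn new(value: &amp;str) -&gt; Self {
        Guard { value: value.to_string() }
    }

    /// Consumes the guard by destructuring it. Returns the message instead
    /// of printing it (an assumption, to keep the example observable).
    pub fn log(self, message: &amp;str) -&gt; String {
        let Guard { value } = self;
        format!(&#34;{value} = {message}&#34;)
    }
}

/// Hypothetical stub standing in for `read_file`.
fn read_file(path: &amp;str) -&gt; io::Result&lt;String&gt; {
    if path == &#34;message.txt&#34; {
        Ok(&#34;hello&#34;.to_string())
    } else {
        Err(io::Error::new(io::ErrorKind::NotFound, &#34;no such file&#34;))
    }
}

/// On failure the guard is handed back, so the caller decides how to move it.
fn execute(t: Guard, path: &amp;str) -&gt; Result&lt;String, (Guard, io::Error)&gt; {
    match read_file(path) {
        Ok(s) =&gt; Ok(t.log(&amp;s)),
        Err(e) =&gt; Err((t, e)),
    }
}

fn main() {
    match execute(Guard::new(&#34;g&#34;), &#34;missing.txt&#34;) {
        Ok(line) =&gt; println!(&#34;{line}&#34;),
        // The caller now owns the guard and must discharge it.
        Err((t, e)) =&gt; println!(&#34;{} ({e})&#34;, t.log(&#34;error&#34;)),
    }
}
</code></pre></div>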
<h2 id="conditional-must-move-types">Conditional “must move” types</h2>
<p>Speaking of types like <code>Option</code> and <code>Result</code>, it’s clear that we will want types that are <em>conditionally</em> must move, i.e., must move only if their type parameter is “must move”. That’s easy enough to do:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Drop</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Some</span><span class="p">(</span><span class="n">T</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">None</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Some of the methods on <code>Option</code> work just fine:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Drop</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">map</span><span class="o">&lt;</span><span class="n">U</span>: <span class="o">?</span><span class="nb">Drop</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="p">(</span><span class="n">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">U</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="n">t</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">op</span><span class="p">(</span><span class="n">t</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">None</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Other methods would require a <code>Drop</code> bound, such as <code>unwrap_or</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Drop</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">unwrap_or</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">default</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">T</span>: <span class="nb">Drop</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">default</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// Without the `T: Drop` bound, we are not allowed to drop `default` here.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">v</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="must-move-and-panic">“Must move” and panic</h2>
<p>One very interesting question is what to do in the case of panic. This is tricky! Ordinarily, a <code>panic</code> will unwind all stack frames, executing destructors. But what should we do for a <code>?Drop</code> type that doesn’t <em>have</em> a destructor?</p>
<p>I see a few options:</p>
<ul>
<li>Force an abort. Seems bad.</li>
<li>Deprecate and remove unwinding, limit to panic=abort. A more honest version of the previous one. Still seems bad, though dang would it make life easier.</li>
<li>Provide some kind of fallback option.</li>
</ul>
<p>The last one is most appealing, but I’m not 100% sure how it works. It may mean that we don’t want to have the “must move” opt-in be to <code>impl !Drop</code> but rather to <code>impl MustMove</code>, or something like that, which would provide a method that is invoked in the case of panic (this method could, of course, choose to abort). The idea of fallback might also be used to permit cancellation with the <code>?</code> operator or other control-flow drops (though I think we definitely want types that don’t permit cancellation in those cases).</p>
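<p>To make the fallback idea a little more concrete, here is a runtime approximation in today’s Rust. It is only a sketch: the <code>discharged</code> flag, the <code>on_panic</code> method, and the <code>FALLBACKS</code> counter are all made up for illustration, and the check that would ideally be a compile-time error is a runtime panic here. The one real API it leans on is <code>std::thread::panicking()</code>, which lets a destructor tell unwinding apart from an ordinary drop:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts fallback invocations; exists only to make the example observable.
static FALLBACKS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical guard whose obligation is tracked at runtime.
pub struct Guard {
    discharged: bool,
}

impl Guard {
    pub fn new() -&gt; Self {
        Guard { discharged: false }
    }

    /// The intended way to get rid of the guard.
    pub fn finish(mut self) {
        self.discharged = true;
    }

    /// Stand-in for the method a `MustMove`-style trait might provide.
    fn on_panic(&amp;mut self) {
        FALLBACKS.fetch_add(1, Ordering::SeqCst);
    }
}

impl Drop for Guard {
    fn drop(&amp;mut self) {
        if self.discharged {
            return;
        }
        if std::thread::panicking() {
            // Unwinding: run the fallback instead of the normal protocol.
            self.on_panic();
        } else {
            // With real &#34;must move&#34; types this would be a compile-time error.
            panic!(&#34;guard dropped without being moved&#34;);
        }
    }
}

fn fallback_count() -&gt; usize {
    FALLBACKS.load(Ordering::SeqCst)
}

/// Runs a small task that owns a guard; returns true if the task panicked.
fn run(should_panic: bool) -&gt; bool {
    panic::catch_unwind(move || {
        let guard = Guard::new();
        if should_panic {
            panic!(&#34;boom&#34;);
        }
        guard.finish();
    })
    .is_err()
}

fn main() {
    let before = fallback_count();
    assert!(run(true)); // panics, so the fallback runs during unwinding
    assert_eq!(fallback_count(), before + 1);
    assert!(!run(false)); // normal path, no fallback
    assert_eq!(fallback_count(), before + 1);
}
</code></pre></div>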
<h2 id="must-move-and-trait-objects">“Must move” and trait objects</h2>
<p>What do we do with <code>dyn</code>? I think the answer is that <code>dyn Foo</code> defaults to <code>dyn Foo + Drop</code>, and hence requires that the type be droppable. To create a “must move” dyn, we could permit <code>dyn Foo + ?Drop</code>. To make that really work out, we’d have to have <code>self</code> methods to consume the dyn (though today you can do that via <code>self: Box&lt;Self&gt;</code> methods).</p>
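<p>For reference, this is what consuming a trait object via <code>self: Box&lt;Self&gt;</code> looks like in today’s Rust. The <code>Task</code> trait and <code>Upload</code> type are hypothetical names; nothing here is “must move”, the sketch only shows that a <code>dyn</code> value can already be moved into a method and destructured there:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">/// Hypothetical trait whose only way out is a consuming method.
trait Task {
    fn finish(self: Box&lt;Self&gt;) -&gt; String;
}

struct Upload {
    name: String,
}

impl Task for Upload {
    fn finish(self: Box&lt;Self&gt;) -&gt; String {
        // Moving out of the box lets us destructure the concrete type.
        let Upload { name } = *self;
        format!(&#34;finished {name}&#34;)
    }
}

/// Consumes every trait object by calling its `finish` method.
fn finish_all(tasks: Vec&lt;Box&lt;dyn Task&gt;&gt;) -&gt; Vec&lt;String&gt; {
    tasks.into_iter().map(|task| task.finish()).collect()
}

fn main() {
    let tasks: Vec&lt;Box&lt;dyn Task&gt;&gt; = vec![Box::new(Upload { name: &#34;a.txt&#34;.to_string() })];
    for line in finish_all(tasks) {
        println!(&#34;{line}&#34;);
    }
}
</code></pre></div>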
<h2 id="uses-for-must-move">Uses for “must move”</h2>
<p>Contrary to best practices, I suppose, I’ve purposefully kept this blog post focused on the mechanism of must move and not talked much about the motivation. This is because I’m not really trying to sell anyone on the idea, at least not yet; I just wanted to sketch some thoughts about how we might achieve it. That said, let me indicate why I am interested in “must move” types.</p>
<p>First, async drop: right now, you cannot have destructors in async code that perform awaits. But this means that async code is not able to manage cleanup in the same way that sync code does. Take a look at the <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/alan_finds_database_drops_hard.html">status quo story about dropping database handles</a> to get an idea of the kinds of problems that arise. Adding async drop itself isn’t that hard, but what’s really hard is guaranteeing that types with async drop are not dropped in sync code, as documented at length in <a href="https://sabrinajewson.org/blog/async-drop">Sabrina Jewson&rsquo;s blog post</a>. This is precisely because we currently assume that <em>all</em> types are droppable. The simplest way to achieve “async drop” then would be to define a trait <code>trait AsyncDrop { async fn async_drop(self); }</code> and then make the type “must move”. This will force callers to eventually invoke <code>async_drop(x).await</code>. We might want some syntactic sugar to handle <code>?</code> more easily, but that could come later.</p>
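<p>Here is roughly what the trait half of that looks like in today’s Rust, minus the part that matters (nothing forces the call). The <code>Connection</code> type, its shared log, and the tiny busy-polling <code>block_on</code> are assumptions so the sketch runs without an async runtime; it also assumes a toolchain recent enough to accept <code>async fn</code> in traits:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// The trait from the text; today nothing obliges anyone to call it.
trait AsyncDrop {
    async fn async_drop(self);
}

/// Hypothetical resource that records its cleanup in a shared log.
struct Connection {
    id: u32,
    log: Arc&lt;Mutex&lt;Vec&lt;String&gt;&gt;&gt;,
}

impl AsyncDrop for Connection {
    async fn async_drop(self) {
        // Destructuring consumes `self`, mirroring the `Guard::log` pattern.
        let Connection { id, log } = self;
        log.lock().unwrap().push(format!(&#34;closed connection {id}&#34;));
    }
}

/// Waker that does nothing; enough for futures that never actually wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc&lt;Self&gt;) {}
}

/// Minimal busy-polling executor so the sketch needs no external crates.
fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&amp;waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&amp;mut cx) {
            return value;
        }
    }
}

/// Opens a connection, cleans it up via `async_drop`, and returns the log.
fn close_connection(id: u32) -&gt; Vec&lt;String&gt; {
    let log = Arc::new(Mutex::new(Vec::new()));
    let conn = Connection { id, log: Arc::clone(&amp;log) };
    block_on(conn.async_drop());
    let entries = log.lock().unwrap().clone();
    entries
}

fn main() {
    println!(&#34;{:?}&#34;, close_connection(7));
}
</code></pre></div>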
<p>Second, parallel structured concurrency. As Tyler Mandry <a href="https://tmandry.gitlab.io/blog/posts/2023-03-01-scoped-tasks/">elegantly documented</a>, if we want to mix parallel scopes and async, we need some way to have futures that cannot be forgotten. The way I think of it is like this: in <em>sync</em> code, when you create a local variable <code>x</code> on your stack, you have a guarantee from the language that its destructor will eventually run, unless you move it. In async code, you have no such guarantee, as your entire future could just be forgotten by a caller. “Must move” types (with some kind of callback for panic) give us a tool to solve this problem, by having the future type be <code>?Drop</code>. This is effectively a principled way to integrate completion-style futures that must be fully polled.</p>
<p>Finally, “liveness conditions writ large”. As I noted in the beginning, Rust’s type system today is pretty good at letting you guarantee “safety” properties (“nothing bad happens”), but it’s much less useful for <em>liveness</em> properties (“something good eventually happens”). Destructors let you get close, but they can be circumvented. And yet I see liveness properties cropping up all over the place, often in the form of guards or cleanup that really ought to happen. Any time you’ve ever wanted to have a destructor that takes an argument, that applies. This comes up a lot in unsafe code, in particular. Being able to “log” those obligations via “must move” types feels like a really powerful tool that will be used in many different ways.</p>
<h2 id="parting-thoughts">Parting thoughts</h2>
<p>This post sketches out one way to get “true linear” types in Rust, which I’ve dubbed “must move” types. I think I would call this the <code>?Drop</code> approach, because the basic idea is to allow types to “opt out” from being “droppable” (in which case they must be moved). This is not the only approach we could use. One of my goals with this blog post is to start collecting ideas for different ways to add linear capabilities, so that we can compare them with one another.</p>
<p>I should also address the obvious “elephant in the room”. The Rust type system is already complex, and adding “must move” types will unquestionably make it more complex. I’m not sure yet whether the tradeoff is worth it: it’s hard to judge without trying the system out. I think there’s a good chance that “must move” types live “on the edges” of the type system, through things like guards and so forth that are rarely abstracted over. I think that when you are dealing with concrete types, like the <code>Guard</code> example, must move types won’t feel particularly complicated. It will just be a helpful lint saying “oh, by the way, you are supposed to clean this up properly”. But where pain will arise is when you are trying to build up generic functions — and of course just in the sense of making the Rust language that much bigger. Things like <code>?Sized</code> definitely make the language feel more complex, even if you never have to interact with them directly.</p>
<p>On the other hand, “must move” types definitely add value in the form of preventing very real failure modes. I continue to feel that Rust’s goal, above all else, is “productive reliability”, and that we should double down on that strength. Put another way, I think that the complexity that comes from reasoning about “must move” types is, in large part, <em>inherent complexity</em>, and I feel ok about extending the language with new tools for that. We saw this with the interaction with the <code>?</code> operator: no doubt it’s annoying to have to account for moves and cleanup when an error occurs, but it’s also a key part of building a robust system, and destructors don’t always cut it.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Well, apart from the &ldquo;must use&rdquo; lint.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Or create a Rc-cycle, if that’s more your speed.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/must-move" term="must-move" label="Must move"/></entry><entry><title type="html">Temporary lifetimes</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/03/15/temporary-lifetimes/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/03/15/temporary-lifetimes/</id><published>2023-03-15T00:00:00+00:00</published><updated>2023-03-15T14:22:00-04:00</updated><content type="html"><![CDATA[<p>In <a href="https://github.com/rust-lang/lang-team/blob/master/design-meeting-minutes/2023-03-15-temporary-lifetimes.md">today&rsquo;s lang team design meeting</a>, we reviewed a doc I wrote about temporary lifetimes in Rust. The current rules were <a href="http://smallcultfollowing.com/babysteps/blog/2014/01/09/rvalue-lifetimes-in-rust/">established in a blog post I wrote in 2014</a>. Almost a decade later, we&rsquo;ve seen that they have some rough edges, and in particular can be a common source of bugs for people. The Rust 2024 Edition gives us a chance to address some of those rough edges. This blog post is a copy of the document that the lang team reviewed. It&rsquo;s not a <em>proposal</em>, but it covers some of what works well and what doesn&rsquo;t, and includes a few sketchy ideas towards what we could do better.</p>
<h2 id="summary">Summary</h2>
<p>Rust&rsquo;s rules on temporary lifetimes often work well but have some sharp edges. The 2024 edition offers us a chance to adjust these rules. Since those adjustments change the times when destructors run, they must be done over an edition.</p>
<h2 id="design-principles">Design principles</h2>
<p>I propose the following design principles to guide our decision.</p>
<ul>
<li><strong>Independent from borrow checker:</strong> We need to be able to figure out when destructors run without consulting the borrow checker. This is a slight weakening of the original rules, which required that we knew when destructors would run without consulting results from name resolution or type check.</li>
<li><strong>Shorter is more reliable and predictable:</strong> In general, we should prefer shorter temporary lifetimes, as that results in more reliable and predictable programs.
<ul>
<li><em>Editor&rsquo;s note:</em> A number of people on the lang team questioned this point. The reasoning is as follows. First, a lot of the problems in practice come from locks that are held longer than expected. Second, problems that come from temporaries being dropped too <em>early</em> tend to manifest as borrow check errors. Therefore, they don&rsquo;t cause reliability issues, but rather ergonomic ones.</li>
</ul>
</li>
<li><strong>Longer is more convenient:</strong> Extending temporary lifetimes where we can do so safely gives more convenience and is key for some patterns.
<ul>
<li><em>Editor&rsquo;s note:</em> As noted in the previous bullet, our current rules sometimes give temporary lifetimes that are shorter than what the code requires, but these generally surface as borrow check errors.</li>
</ul>
</li>
</ul>
<h3 id="equivalences-and-anti-equivalences">Equivalences and anti-equivalences</h3>
<p>The rules should ensure that <code>E</code> and <code>(E)</code>, for any expression <code>E</code>, result in temporaries with the same lifetimes.</p>
<p>Today, the rules <em>also</em> ensure that <code>E</code> and <code>{E}</code>, for any expression <code>E</code>, result in temporaries with the same lifetimes, but this document proposes dropping that equivalence as of Rust 2024.</p>
<h2 id="current-rules">Current rules</h2>
<h3 id="when-are-temporaries-introduced">When are temporaries introduced?</h3>
<p>Temporaries are introduced when there is a borrow of a <em>value-producing expression</em> (often called an &ldquo;rvalue&rdquo;). Consider an example like <code>&amp;foo()</code>; in this case, the compiler needs to produce a reference to some memory somewhere, so it stores the result of <code>foo()</code> into a temporary local variable and returns a reference to that.</p>
<p>Often the borrows are implicit. Consider a function <code>get_data()</code> that returns a <code>Vec&lt;T&gt;</code> and a call <code>get_data().is_empty()</code>; because <code>is_empty()</code> is declared with <code>&amp;self</code> on <code>[T]</code>, this will store the result of <code>get_data()</code> into a temporary, invoke <code>deref</code> to get a <code>&amp;[T]</code>, and then call <code>is_empty</code>.</p>
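<p>A minimal sketch of that desugaring, written out by hand. It uses <code>first()</code>, which is defined only on slices, so the call has to go through <code>Deref</code>; the body of <code>get_data</code> and the element type are assumptions for illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::ops::Deref;

/// Hypothetical data source returning an owned vector.
fn get_data() -&gt; Vec&lt;i32&gt; {
    vec![3, 1, 4]
}

/// The compiler introduces the temporary and the deref for us.
fn implicit() -&gt; Option&lt;i32&gt; {
    let first = get_data().first().copied();
    first
}

/// Roughly the same steps with the temporary spelled out.
fn explicit() -&gt; Option&lt;i32&gt; {
    let tmp: Vec&lt;i32&gt; = get_data(); // the temporary
    let slice: &amp;[i32] = tmp.deref(); // implicit borrow plus deref
    let first = slice.first().copied();
    first
}

fn main() {
    assert_eq!(implicit(), explicit());
}
</code></pre></div>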
<h3 id="default-temporary-lifetime">Default temporary lifetime</h3>
<p>Whenever a temporary is introduced, the default rule is that the temporary is dropped at the end of the innermost enclosing statement; this rule is sometimes summarized as &ldquo;at the next semicolon&rdquo;. But the definition of <em>statement</em> involves some subtlety.</p>
<p><strong>Block tail expressions.</strong> Consider a Rust block:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">stmt</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">stmt</span><span class="p">[</span><span class="n">n</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tail_expression</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Any temporaries created in a statement <code>stmt[i]</code> will be dropped once that statement completes. But the tail expression is not considered a statement, so temporaries produced <em>there</em> are dropped at the end of the statement that encloses the block. For example, given <code>get_data</code> and <code>is_empty</code> as defined in the previous section, and a statement <code>let x = foo({get_data().is_empty()});</code>, the vector will be freed at the end of the <code>let</code>.</p>
<p><strong>Conditional scopes for <code>if</code> and <code>while</code>.</strong> <code>if</code> and <code>while</code> expressions and <code>if guards</code> (but not <code>match</code> or <code>if let</code>) introduce a temporary scope around the condition. So any temporaries from <code>expr</code> in <code>if expr { ... }</code> would be dropped before the <code>{ ... }</code> executes. The reasoning here is that all of these contexts produce a boolean and hence it is not possible to have a reference into the temporary that is still live. For example, given <code>if get_data().is_empty()</code>, the vector must be safe to drop before entering the body of the <code>if</code>. This is not true for a case like <code>match get_data().last() { Some(x) =&gt; ..., None =&gt; ... }</code>, where the <code>x</code> would be a reference into the vector returned by <code>get_data()</code>.</p>
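<p>The difference between <code>if</code> and <code>match</code> can be observed directly by giving the temporary a destructor that records when it runs. This is a small sketch; the <code>Noisy</code> type and the log strings are made up for illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::cell::RefCell;

/// Hypothetical type that records its own drop in a shared log.
struct Noisy&lt;&#39;a&gt; {
    log: &amp;&#39;a RefCell&lt;Vec&lt;&amp;&#39;static str&gt;&gt;,
}

impl Noisy&lt;&#39;_&gt; {
    fn is_ready(&amp;self) -&gt; bool {
        true
    }
}

impl Drop for Noisy&lt;&#39;_&gt; {
    fn drop(&amp;mut self) {
        self.log.borrow_mut().push(&#34;drop temporary&#34;);
    }
}

/// `if` condition: the temporary is dropped before the body runs.
fn order_in_if() -&gt; Vec&lt;&amp;&#39;static str&gt; {
    let log = RefCell::new(Vec::new());
    if (Noisy { log: &amp;log }).is_ready() {
        log.borrow_mut().push(&#34;body&#34;);
    }
    log.into_inner()
}

/// `match` scrutinee: the temporary lives until the whole `match` is done.
fn order_in_match() -&gt; Vec&lt;&amp;&#39;static str&gt; {
    let log = RefCell::new(Vec::new());
    match (Noisy { log: &amp;log }).is_ready() {
        true =&gt; {
            log.borrow_mut().push(&#34;body&#34;);
        }
        false =&gt; {}
    }
    log.into_inner()
}

fn main() {
    assert_eq!(order_in_if(), vec![&#34;drop temporary&#34;, &#34;body&#34;]);
    assert_eq!(order_in_match(), vec![&#34;body&#34;, &#34;drop temporary&#34;]);
}
</code></pre></div>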
<p><strong>Function scope.</strong> The tail expression of a function block (e.g., the expression <code>E</code> in <code>fn foo() { E }</code>) is not contained by <em>any</em> statement. In this case, we drop temporaries from <code>E</code> just before returning from the function, and thus <code>fn last() -&gt; Option&lt;&amp;Datum&gt; { get_data().last() }</code> fails the borrow check (because the temporary returned by <code>get_data()</code> is dropped before the function returns). Importantly, this function scope ends <em>after</em> local variables in the function are dropped. Therefore, this function&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">vec!</span><span class="p">[].</span><span class="n">is_empty</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;is effectively desugared to this&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">tmp</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">{</span><span class="w"> </span><span class="n">tmp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w"> </span><span class="o">&amp;</span><span class="n">tmp</span><span class="w"> </span><span class="p">}.</span><span class="n">is_empty</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="c1">// x dropped here
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// tmp dropped here
</span></span></span></code></pre></div><h3 id="lifetime-extension">Lifetime extension</h3>
<p>In some cases, temporary lifetimes are extended from the innermost <em>statement</em> to the innermost <em>block</em>. The rules for this are currently defined <em>syntactically</em>, meaning that they do not consider types or name resolution. The intuition is that we extend the lifetime of the temporary for an expression <code>E</code> if it is evident that this temporary will be stored into a local variable. Consider the trivial example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">foo</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>Here, <code>foo()</code> is a value expression, and hence <code>&amp;foo()</code> needs to create a temporary so that we can have a reference. But the resulting <code>&amp;T</code> is going to be stored in the local variable <code>t</code>. If we were to free the temporary at the next <code>;</code>, this local variable would be immediately invalid. That doesn&rsquo;t seem to match the user intent. Therefore, we <em>extend</em> the lifetime of the temporary so that it is dropped at the end of the innermost block. This is the equivalent of:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">tmp</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">tmp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">foo</span><span class="p">();</span><span class="w"> </span><span class="o">&amp;</span><span class="n">tmp</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>We can extend this same logic to compound expressions. Consider:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">(),</span><span class="w"> </span><span class="o">&amp;</span><span class="n">bar</span><span class="p">());</span><span class="w">
</span></span></span></code></pre></div><p>We will expand this to:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">tmp1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">tmp2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">tmp1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">foo</span><span class="p">();</span><span class="w"> </span><span class="n">tmp2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">bar</span><span class="p">();</span><span class="w"> </span><span class="p">(</span><span class="o">&amp;</span><span class="n">tmp1</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">tmp2</span><span class="p">)</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>The exact rules are given by a grammar in the code and also <a href="https://doc.rust-lang.org/nightly/reference/destructors.html#drop-scopes">covered in the reference</a>. Rather than define them here I&rsquo;ll just give some examples. In each case, the <code>&amp;foo()</code> temporary is extended:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">foo</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Aggregates containing a reference that is stored into a local:
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">foo</span><span class="p">()</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">(),</span><span class="w"> </span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">()];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Patterns that create a reference, rather than `&amp;`:
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">ref</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">foo</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>Here are some cases where temporaries are NOT extended:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">some_function</span><span class="p">(</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">());</span><span class="w"> </span><span class="c1">// could be `fn some_function(x: &amp;Vec&lt;T&gt;) -&gt; bool`, may not need extension
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">SomeTupleStruct</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">T</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">SomeTupleStruct</span><span class="p">(</span><span class="o">&amp;</span><span class="n">foo</span><span class="p">());</span><span class="w"> </span><span class="c1">// looks like a function call
</span></span></span></code></pre></div><h2 id="patterns-that-work-well-in-the-current-rules">Patterns that work well in the current rules</h2>
<h3 id="storing-temporary-into-a-local">Storing temporary into a local</h3>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Data</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="p">[</span><span class="kt">u32</span><span class="p">]</span><span class="w"> </span><span class="c1">// use a slice to permit subslicing later
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">initialize</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w"> </span><span class="mi">3</span><span class="p">]</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                        ^^^^^^^^^ extended temporary
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">d</span><span class="p">.</span><span class="n">process</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Data</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="mi">1</span><span class="o">..</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="reading-values-out-of-a-lockrefcell">Reading values out of a lock/refcell</h3>
<p>The current rules allow you to do atomic operations on locks/refcells conveniently, so long as they don&rsquo;t return references to the data. This works great in a <code>let</code> statement (there are other cases below where it works less well).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cell</span><span class="p">.</span><span class="n">borrow_mut</span><span class="p">().</span><span class="n">do_something</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// `cell` is not borrowed here
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span></code></pre></div><h2 id="error-prone-cases-with-todays-rules">Error-prone cases with today&rsquo;s rules</h2>
<p>Today&rsquo;s rules sometimes give lifetimes that are <strong>too long</strong>, resulting in bugs at runtime.</p>
<h3 id="deadlocks-because-of-temporary-lifetimes-in-matches">Deadlocks because of temporary lifetimes in matches</h3>
<p>One very common problem is deadlocks (or panics, for ref-cell) when mutex locks occur in a match scrutinee:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">match</span><span class="w"> </span><span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">data</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//     ------ returns a temporary guard
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span><span class="w"> </span><span class="c1">// deadlock
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;-- lock() temporary dropped here
</span></span></span></code></pre></div><h2 id="ergonomic-problems-with-todays-rules">Ergonomic problems with today&rsquo;s rules</h2>
<p>Today&rsquo;s rules sometimes give lifetimes that are <strong>too short</strong>, resulting in ergonomic failures or confusing error messages.</p>
<h3 id="call-parameter-temporary-lifetime-is-too-short-rfc66">Call parameter temporary lifetime is too short (RFC66)</h3>
<p>Somewhat surprisingly, the following code <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=e7dde5d5b85500cd00af06c9b5d5acaf">does not compile</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_data</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w"> </span><span class="mi">3</span><span class="p">]</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">last_elem</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_data</span><span class="p">().</span><span class="n">last</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">drop</span><span class="p">(</span><span class="n">last_elem</span><span class="p">);</span><span class="w"> </span><span class="c1">// just a dummy use
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This fails because the <code>Vec</code> returned by <code>get_data()</code> is stored into a temporary so that we can invoke <code>last</code>, which requires <code>&amp;self</code>, but that temporary is dropped at the <code>;</code> (as this case doesn&rsquo;t fall under the lifetime extension rules).</p>
<p><a href="https://rust-lang.github.io/rfcs/0066-better-temporary-lifetimes.html">RFC 66</a> proposed a rather underspecified extension to the temporary lifetime rules to cover this case; loosely speaking, the idea was to also extend the lifetime of temporaries that appear in function arguments when the function&rsquo;s signature indicates that it returns a reference derived from that argument. So, in this case, the signature of <code>last</code> indicates that it returns a reference from <code>self</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">last</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and therefore, since <code>E.last()</code> is being assigned to <code>last_elem</code>, we would extend the lifetime of any temporaries in <code>E</code> (the value for <code>self</code>). Ding Xiang Fei has been exploring how to actually implement <a href="https://rust-lang.github.io/rfcs/0066-better-temporary-lifetimes.html">RFC 66</a> and has made some progress, but it&rsquo;s clear that we need to settle on the <em>exact</em> rules for when temporary lifetime extension should happen.</p>
<p>Even assuming we created some rules for RFC 66, there can be confusing cases that wouldn&rsquo;t be covered. Consider this statement:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_data</span><span class="p">().</span><span class="n">last</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nb">drop</span><span class="p">(</span><span class="n">l</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR
</span></span></span></code></pre></div><p>Here, the <code>unwrap</code> call has a signature <code>fn(Option&lt;T&gt;) -&gt; T</code>, which doesn&rsquo;t contain any references. Therefore, it does not extend the lifetimes of temporaries in its arguments. The argument here is the expression <code>get_data().last()</code>, which creates a temporary to store <code>get_data()</code>. This temporary is then dropped at the end of the statement, and hence <code>l</code> winds up pointing to dead memory.</p>
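<p>For comparison, here is a minimal sketch of the workaround that compiles under today&rsquo;s rules: bind the <code>Vec</code> to a named local so that it lives until the end of the enclosing block. The helper names <code>last_via_local</code> and <code>last_unwrapped_via_local</code> are hypothetical and exist only for illustration.</p>

```rust
fn get_data() -> Vec<u32> {
    vec![1, 2, 3]
}

// Hypothetical helper: the Vec is stored in a named local, so the
// reference returned by `last` stays valid until the end of the block.
fn last_via_local() -> Option<u32> {
    let data = get_data();
    let last_elem = data.last(); // borrows from `data`, not from a temporary
    last_elem.copied()
}

// Hypothetical helper: the same idea applied to the `.last().unwrap()` chain.
fn last_unwrapped_via_local() -> u32 {
    let data = get_data();
    let l = data.last().unwrap();
    *l
}

fn main() {
    println!("{:?} {}", last_via_local(), last_unwrapped_via_local());
}
```

<p>The cost is an extra <code>let</code> and a name for something that is conceptually a temporary, which is the ergonomic friction this section describes.</p>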
<h3 id="statement-like-expressions-in-tail-position">Statement-like expressions in tail position</h3>
<p>The original rules assumed that changing <code>E</code> to <code>{E}</code> should not change when temporaries are dropped. This has the counterintuitive consequence, though, that introducing a block doesn&rsquo;t constrain the stack lifetime of temporaries. It is also surprising for blocks that have tail expressions that are &ldquo;statement-like&rdquo; (e.g., <code>match</code>), because these can be used as statements without a <code>;</code>, and thus users may not have a clear picture of whether they are an expression producing a value or a statement.</p>
<p><strong>Example.</strong> The following code <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=f16625bcd6e35c988c5b9399b821b98b">does not compile</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Identity</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">(</span><span class="n">A</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Drop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Identity</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">Identity</span><span class="p">(</span><span class="o">&amp;</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//------------ creates a temporary that can be matched
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;-- this is considered a trailing expression by the compiler
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;-- temporary is dropped after this block executes
</span></span></span></code></pre></div><p>Because of the way that the implicit function scope works, and the fact that this match is actually the tail expression in the function body, this is effectively desugared to something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Identity</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">(</span><span class="n">A</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Drop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Identity</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">tmp</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="p">{</span><span class="n">tmp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Identity</span><span class="p">(</span><span class="o">&amp;</span><span class="n">x</span><span class="p">);</span><span class="w"> </span><span class="n">tmp</span><span class="p">}</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="lack-of-equivalence-between-if-and-match">Lack of equivalence between if and match</h3>
<p>The current rules distinguish temporary behavior for if/while from match/if-let. As a result, code like this compiles and executes fine:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">something</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// grab lock, then release
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span><span class="w"> </span><span class="c1">// OK to grab lock again
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>but very similar code using an if-let or a match gives a deadlock:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="kc">true</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">something</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">();</span><span class="w"> </span><span class="c1">// Deadlock
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// or
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">match</span><span class="w"> </span><span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">something</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kc">true</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">lock</span><span class="p">.</span><span class="n">lock</span><span class="p">(),</span><span class="w"> </span><span class="c1">// Deadlock
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kc">false</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Partly as a result of this lack of equivalence, we have had a lot of trouble doing desugarings for things like let-else and if-let expressions.</p>
<h3 id="named-block">Named block</h3>
<p>Tail expressions aren&rsquo;t the only way to &ldquo;escape&rdquo; a value from a block: the same applies to breaking with a named label. However, values escaped via <code>break</code> don&rsquo;t benefit from lifetime extension. The following example, therefore, <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=3fe08dd173a4bbf18a4321b36bc74d50">fails to compile</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="na">&#39;a</span>: <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">break</span><span class="w"> </span><span class="nl">&#39;a</span><span class="w"> </span><span class="o">&amp;</span><span class="fm">vec!</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w"> </span><span class="c1">// ERROR
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">drop</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Note that a tail-expression based version <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=aacec04924b890d994abc2d1db67627b">does compile today</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;</span><span class="fm">vec!</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">drop</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="proposed-properties-to-focus-discussion">Proposed properties to focus discussion</h2>
<p>To focus discussion, here are some named examples we can use that capture key patterns.</p>
<p>Examples of behaviors we would ideally <em>preserve</em>:</p>
<ul>
<li><strong>read-locked-field</strong>: <code>let x: Event = ref_cell.borrow().get_event();</code> releases borrow at the end of the <em>statement</em> (as today)</li>
<li><strong>obvious aggregate construction</strong>: <code>let x: Event = Event { x: &amp;[1, 2, 3] }</code> stores <code>[1, 2, 3]</code> in a temporary with block scope</li>
</ul>
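<p>To make <em>read-locked-field</em> concrete, here is a minimal sketch using a shared <code>borrow()</code>. The <code>Event</code> and <code>Source</code> types and the <code>read_then_write</code> helper are assumptions made up for this example; the point is only that the borrow guard is gone by the time the next statement runs:</p>

```rust
use std::cell::RefCell;

// Hypothetical types, invented for illustration.
#[derive(Clone, Debug, PartialEq)]
struct Event {
    id: u32,
}

struct Source {
    latest: Event,
}

impl Source {
    fn get_event(&self) -> Event {
        self.latest.clone()
    }
}

// Returns the event plus whether a mutable borrow succeeds on the next statement.
fn read_then_write(ref_cell: &RefCell<Source>) -> (Event, bool) {
    // The temporary `Ref` guard is dropped at the end of this statement.
    let x: Event = ref_cell.borrow().get_event();
    // So a mutable borrow is possible here without panicking.
    let can_write = ref_cell.try_borrow_mut().is_ok();
    (x, can_write)
}
```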
<p>Examples of behavior that we would like, but which we don&rsquo;t have today, resulting in bugs/confusion:</p>
<ul>
<li><strong>match-locked-field</strong>: <code>match data.lock().unwrap().data { ... }</code> releases lock before match body executes</li>
<li><strong>if-match-correspondence</strong>: <code>if &lt;expr&gt; {}</code>, <code>if let true = &lt;expr&gt; {}</code>, and <code>match &lt;expr&gt; { true =&gt; .. }</code> all behave the same with respect to temporaries in <code>&lt;expr&gt;</code> (unlike today)</li>
<li><strong>block containment</strong>: <code>{&lt;expr&gt;}</code> must not create any temporaries that extend past the end of the block (unlike today)</li>
<li><strong>tail-break-correspondence</strong>: <code>{&lt;expr&gt;}</code> and <code>'a: { break 'a &lt;expr&gt; }</code> should be equivalent</li>
</ul>
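<p>For comparison, <em>match-locked-field</em> can be approximated today by hoisting the scrutinee into an explicit <code>let</code>. The sketch below is illustrative only: <code>State</code> and both helper functions are hypothetical, and <code>try_lock</code> is used so that the example reports whether the lock is still held instead of deadlocking:</p>

```rust
use std::sync::Mutex;

// Hypothetical payload type, invented for illustration.
struct State {
    data: u32,
}

// Scrutinee written inline: the guard temporary is still alive inside the arm.
fn lock_free_in_arm_direct(m: &Mutex<State>) -> bool {
    match m.lock().unwrap().data {
        _ => m.try_lock().is_ok(),
    }
}

// Scrutinee hoisted into a `let`: the guard is dropped at the end of the `let`.
fn lock_free_in_arm_hoisted(m: &Mutex<State>) -> bool {
    let data = m.lock().unwrap().data;
    match data {
        _ => m.try_lock().is_ok(),
    }
}
```

<p>With today&rsquo;s rules the first function reports that the lock is still held inside the arm; the second reports that it is free, which is the behavior <em>match-locked-field</em> asks for by default.</p>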
<p>Examples of behavior that we would like, but which we don&rsquo;t have today, resulting in ergonomic pain (these cases may not be achievable without violating the previous ones):</p>
<ul>
<li><strong>last</strong>: <code>let x = get_data().last();</code> (the canonical RFC66 example) will extend lifetime of data to end of block; also covers (some) <code>new</code> methods like <code>let x: Event&lt;'_&gt; = Event::new(&amp;[1, 2, 3])</code></li>
<li><strong>last-unwrap</strong>: <code>let x = get_data().last().unwrap();</code> (extended form of the above) will extend lifetime of data to end of block</li>
<li><strong>tuple struct construction</strong>: <code>let x = Event(&amp;[1, 2, 3])</code></li>
</ul>
<h2 id="tightest-proposal">Tightest proposal</h2>
<p>The proposal with minimal confusion would be to remove syntactic lifetime extension and tighten default lifetimes in two ways:</p>
<p><em>Tighten block tail expressions.</em> Have temporaries in the tail expression of a block be dropped when returning from the block. This ensures <em>block containment</em> and <em>tail-break-correspondence</em>.</p>
<p><em>Tighten match scrutinees.</em> Drop temporaries from match/if-let scrutinees once the scrutinee has been evaluated, before entering the body of the match. This avoids footguns such as holding a lock across the match arms, and ensures <em>match-locked-field</em> and <em>if-match-correspondence</em>.</p>
<p>In short, temporaries would always be dropped at the innermost statement, match/if/if-let/while scrutinee, or block.</p>
<h3 id="things-that-no-longer-build">Things that no longer build</h3>
<p>There are three cases that build today which will no longer build with this minimal proposal:</p>
<ul>
<li><code>let x = &amp;vec![]</code> no longer builds, nor does <code>let x = Foo { x: &amp;[1, 2, 3] }</code>. Both of them create temporaries that are dropped at the end of the let.</li>
<li><code>match &amp;foo.borrow_mut().parent { Some(ref p) =&gt; .., None =&gt; ... }</code> no longer builds, since temporary from <code>borrow_mut()</code> is dropped before entering the match arms.</li>
<li><code>{let x = {&amp;vec![0]}; ...}</code> no longer builds, as a result of tightening block tail expressions. Note however that other examples, e.g. the one from the section <a href="#Statement-like-expressions-in-tail-position">&ldquo;statement-like expressions in tail position&rdquo;</a>, would now build successfully.</li>
</ul>
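<p>Each of these cases can be rewritten with an explicit <code>let</code> so that it builds under both today&rsquo;s rules and the tightened ones. The following is a sketch under assumed names (<code>Node</code>, <code>first_elem</code>, and <code>parent_name</code> are hypothetical):</p>

```rust
use std::cell::RefCell;

// Hypothetical type, invented for illustration.
struct Node {
    parent: Option<String>,
}

fn first_elem() -> i32 {
    // Instead of `let x = &vec![7, 8, 9];`, name the vector explicitly.
    let v = vec![7, 8, 9];
    let x = &v;
    x[0]
}

fn parent_name(foo: &RefCell<Node>) -> String {
    // Instead of `match &foo.borrow_mut().parent { .. }`, keep the guard
    // in a named variable so it clearly outlives the match.
    let node = foo.borrow_mut();
    match &node.parent {
        Some(p) => p.clone(),
        None => String::from("<root>"),
    }
}
```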
<p>The core proposal also does nothing to address RFC66-like patterns, tuple struct construction, etc.</p>
<h3 id="extension-option-a-do-what-i-mean">Extension option A: Do What I Mean</h3>
<p>One way to overcome the concerns of the core proposal would be to extend with more &ldquo;DWIM&rdquo;-like options. For example, we could extend &ldquo;lifetime extension rules&rdquo; to cover match expressions.</p>
<p><em>Lifetime extension for <code>let</code> statements, as today</em>. To allow <code>let x = &amp;vec![]</code> to build, we can restore today&rsquo;s lifetime extension rules.</p>
<ul>
<li>Pro: things like this will build</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">get_data</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//     ---------- stored in a temporary that outlives `x`
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><ul>
<li>Con: the following example would build again, which leads to a (perhaps surprising) <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=2e9e4077c3558c3da5d9df9ba717c8af">panic</a> &ndash; that said, I&rsquo;ve never seen a case like this in the wild, the confusion <em>always</em> occurs with match</li>
</ul>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">cell</span>::<span class="n">RefCell</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Foo</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">u32</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">cell</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">RefCell</span>::<span class="n">new</span><span class="p">(</span><span class="mi">22</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span>: <span class="nc">Foo</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="o">*</span><span class="n">cell</span><span class="p">.</span><span class="n">borrow_mut</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">cell</span><span class="p">.</span><span class="n">borrow_mut</span><span class="p">()</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- panic
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">drop</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><em>Scope extension for match scrutinees</em>. To allow <code>match &amp;foo.borrow_mut().parent { Some(ref x) =&gt; ... }</code> to work, we could include scope extension rules similar to the ones used with <code>let</code> initializers (i.e., if we can see that a ref is taken into the temporary, then extend its lifetime, but otherwise do not).</p>
<ul>
<li>Pro: <code>match &amp;foo.borrow_mut().parent { .. }</code> works as it does today.</li>
<li>Con: Syntactic extension rules can be approximate, so e.g. <code>match (foo(), bar().baz()) { (Some(ref x), y) =&gt; .. }</code> would likely keep the temporary returned by <code>bar()</code>, even though it is not referenced.</li>
</ul>
<p><em>RFC66-like rules.</em> Use some heuristic rules to determine, from a function signature, when the return type includes data from the arguments. If the return type of a function <code>f</code> references a generic type or lifetime parameter that also appears in some argument <code>i</code>, and the function call <code>f(a0, ..., ai, ..., an)</code> appears in some position with an extended temporary lifetime, then <code>ai</code> will also have an extended temporary lifetime (i.e., any temporaries created in <code>ai</code> will persist until end of enclosing block / match expression).</p>
<ul>
<li>Pro: Patterns like <code>let x = E</code> where <code>E</code> is <code>get_data().last()</code>, <code>get_data().last().unwrap()</code>, <code>TupleStruct(&amp;get_data())</code>, or <code>SomeStruct::new(&amp;get_data())</code> would all allocate a temporary for <code>get_data()</code> that persists until the end of the enclosing block. This occurs because, in each case, the return type references a lifetime or generic type parameter that also appears in the argument containing <code>get_data()</code>.</li>
<li>Con: Complex rules imply that <code>let x = locked_vec.lock().last()</code> would also extend lock lifetime to end-of-block, which users may not expect.</li>
</ul>
<h3 id="extension-option-b-anonymous-lets-for-extended-temporary-lifetimes">Extension option B: &ldquo;Anonymous lets&rdquo; for extended temporary lifetimes</h3>
<p>Allow <code>expr.let</code> as an operator that means &ldquo;introduce a let to store this value inside the innermost block but before the current statement and replace this statement with a reference to it&rdquo;. So for example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_data</span><span class="p">().</span><span class="kd">let</span><span class="p">.</span><span class="n">last</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>would be equivalent to</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">tmp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_data</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tmp</span><span class="p">.</span><span class="n">last</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p><em>Question:</em> Do we keep some amount of implicit extension? For example, should <code>let x = &amp;vec![]</code> keep compiling, or do you have to do <code>let x = &amp;vec![].let</code>?</p>
<h2 id="parting-notes">Parting notes</h2>
<p><em>Editor&rsquo;s note:</em> As I wrote at the start, this was an early document to prompt discussion in a meeting (<a href="https://github.com/rust-lang/lang-team/blob/master/design-meeting-minutes/2023-03-15-temporary-lifetimes.md#discussion-comments">you can see notes from the meeting here</a>). It&rsquo;s not a full proposal. That said, my position when I started writing was different than where I landed. Initially I was going to propose more of a &ldquo;DWIM&rdquo;-approach, tweaking the rules to be tighter in some places, more flexible in others. I&rsquo;m still interested in exploring that, but I am worried that the end-result will just be people having very little idea when their destructors run. For the most part, you shouldn&rsquo;t have to care about that, but it is sometimes quite important. That leads me to: let&rsquo;s have some simple rules that can be explained on a postcard and work &ldquo;pretty well&rdquo;, and some convenient way to extend lifetimes when you want it. The <code>.let</code> syntax is interesting but ultimately probably too confusing to play this role.</p>
<p>Oh, and a note on the edition: I didn&rsquo;t say it explicitly, but we can make changes to temporary lifetime rules over an edition by rewriting where necessary to use explicit lets, or (if we add one) some other explicit notation. The result would be code that runs on all editions with the same semantics.</p>
]]></content></entry><entry><title type="html">To async trait or just to trait</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/03/12/to-async-trait-or-just-to-trait/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/03/12/to-async-trait-or-just-to-trait/</id><published>2023-03-12T00:00:00+00:00</published><updated>2023-03-12T19:33:00-04:00</updated><content type="html"><![CDATA[<p>One interesting question about async fn in traits is whether or not we should label the <em>trait itself</em> as async. Until recently, I didn’t see any need for that. But as we discussed the question of how to enable “maybe async” code, we realized that there would be some advantages to distinguishing “async traits” (which could contain async functions) from sync traits (which could not). However, as I’ve thought about the idea more, I’m more and more of the mind that we should not take this step, at least not now. I wanted to write a blog post diving into the considerations as I see them now.</p>
<h2 id="what-is-being-proposed">What is being proposed?</h2>
<p>The specific proposal I am discussing is to require that traits which include async functions are declared as async traits…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// The &#34;async trait&#34; (vs just &#34;trait&#34;) would be required
</span></span></span><span class="line"><span class="cl"><span class="c1">// to have an &#34;async fn&#34; (vs just a &#34;fn&#34;).
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">HttpEngine</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">fetch</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">url</span>: <span class="nc">Url</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u8</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…and when you reference them, you use the <code>async</code> keyword as well…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">load_data</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">h</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="n">HttpEngine</span><span class="p">,</span><span class="w"> </span><span class="n">urls</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="n">Url</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                       ----- just writing `impl HttpEngine`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                             would be an error
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This would be a change from the support implemented in nightly today, where any trait can have async functions.</p>
<h2 id="why-have-async-traits-vs-normal-traits">Why have “async traits” vs “normal” traits?</h2>
<p>When authoring an async application, you’re going to define traits like <code>HttpEngine</code> that inherently involve async operations. In that case, having to write <code>async trait</code> seems like pure overhead. So why would we ever want it?</p>
<p>The answer is that not all traits are like <code>HttpEngine</code>. We can call <code>HttpEngine</code> an “always async” trait — it will always involve an async operation. <strong>But a lot of traits are “maybe async” — they sometimes involve async operations and sometimes not.</strong> In fact, we can probably break these down further: you have traits like <code>Read</code>, which involve I/O but have a sync and async equivalent, and then you have traits like <code>Iterator</code>, which are orthogonal from I/O.</p>
<p>Particularly for traits like <code>Iterator</code>, the current trajectory will result in two nearly identical traits in the stdlib: <code>Iterator</code> and <code>AsyncIterator</code>. These will be mostly the same apart from <code>AsyncIterator</code> having an async <code>next</code> function, and perhaps some more combinators. It’s not the end of the world, but it’s also not ideal, particularly when you consider that we likely want more “modes”, like a <code>const</code> Iterator, a “sendable” iterator, perhaps a fallible iterator (one that returns results), etc. This is of course the problem often referred to as the “color problem”, from Bob Nystrom&rsquo;s well-known <a href="https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/">“What color is your function?”</a> blog post, and it’s precisely what the <a href="https://blog.rust-lang.org/inside-rust/2022/07/27/keyword-generics.html">“keyword generics” initiative</a> is looking to solve.</p>
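<p>To give a feel for the duplication, here is a small sketch of what two parallel traits can look like. It assumes a toolchain where <code>async fn</code> in traits is available; <code>SyncIter</code>, <code>AsyncIter</code>, <code>Countdown</code>, and the tiny busy-polling <code>block_on</code> are all hypothetical stand-ins written for this example, not the stdlib definitions:</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical simplified traits: identical apart from `async`.
trait SyncIter {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}

trait AsyncIter {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

struct Countdown(u32);

impl SyncIter for Countdown {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 { None } else { self.0 -= 1; Some(self.0 + 1) }
    }
}

impl AsyncIter for Countdown {
    type Item = u32;
    async fn next(&mut self) -> Option<u32> {
        if self.0 == 0 { None } else { self.0 -= 1; Some(self.0 + 1) }
    }
}

// Consumers end up duplicated too.
fn sum_sync(mut it: impl SyncIter<Item = u32>) -> u32 {
    let mut total = 0;
    while let Some(n) = it.next() { total += n; }
    total
}

async fn sum_async(mut it: impl AsyncIter<Item = u32>) -> u32 {
    let mut total = 0;
    while let Some(n) = it.next().await { total += n; }
    total
}

// Minimal executor for futures that never actually wait: it just polls in a loop.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}
```

<p>Both the trait definitions and the <code>sum_*</code> consumers differ only in the <code>async</code> and <code>.await</code> keywords, which is the repetition a &ldquo;maybe async&rdquo; mechanism would aim to remove.</p>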
<h2 id="requiring-an-async-keyword-ensures-consistency-between-maybe-and-always-async-traits">Requiring an async keyword ensures consistency between “maybe” and “always” async traits…</h2>
<p>It’s not really clear what a full solution to the “color problem” looks like. But whatever it is, it’s going to involve having traits with multiple modes. So instead of <code>Iterator</code> and <code>AsyncIterator</code>, we’ll have the base definition of <code>Iterator</code> and then a way to derive an async version, <code>async Iterator</code>. We can then call an <code>Iterator</code> a “maybe async” trait, because it might be sync but it might be async. We might declare a “maybe async” trait using an attribute, like this<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Because of the #[maybe(async)] attribute,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// the async keyword on this function means “if
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// this trait is in async mode, then this is an
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// async function”:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now imagine I have a function that reads urls from some kind of input stream. This might be an <code>async fn</code> that takes an <code>impl async Iterator</code> as argument:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">read_urls</span><span class="p">(</span><span class="n">urls</span>: <span class="nc">impl</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Url</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                        ----- specify async mode
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">u</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">urls</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                          ----- needed because this is an async iterator
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But now let’s say I want to combine this (async) iterator of urls and use an <code>HttpEngine</code> (our “always async” trait) to fetch them:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">fetch_urls</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">urls</span>: <span class="nc">impl</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Url</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">engine</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">HttpEngine</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">u</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">urls</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="kd">let</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">engine</span><span class="p">.</span><span class="n">fetch</span><span class="p">(</span><span class="n">u</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>There’s nothing wrong with this code, but it might be a bit surprising that I have to write <code>impl async Iterator</code> but I just write <code>impl HttpEngine</code>, even though both traits involve async functions. I can imagine that it would sometimes be hard to remember which traits are “always async” versus which ones are only “maybe async”.</p>
<h2 id="which-also-means-traits-can-go-from-always-to-maybe-async-without-a-major-version-bump">…which also means traits can go from “always” to “maybe” async without a major version bump.</h2>
<p>There is another tricky bit: imagine that I am authoring a library and I create an “always async” <code>HttpEngine</code> trait to start:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HttpEngine</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">fetch</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">url</span>: <span class="nc">Url</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u8</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>but then later I want to issue a new version that offers a sync <em>and</em> an async version of <code>HttpEngine</code>. I can’t add a <code>#[maybe(async)]</code> to the trait declaration because, if I do so, then code using <code>impl HttpEngine</code> would suddenly be getting the <em>sync</em> version of the trait, whereas before they were getting the <em>async</em> version.</p>
<p>In other words, unless we force people to declare async traits up front, changing a trait from “always async” to “maybe async” is a breaking change.</p>
<h2 id="but-writing-async-trait-for-traits-that-are-always-async-is-annoying">But writing <code>async Trait</code> for traits that are <em>always</em> async is annoying…</h2>
<p>The points above are solid. But there are some flaws. The most obvious is that having to write <code>async</code> for every trait that uses an async function is likely to be pretty tedious. I can easily imagine that people writing async applications are going to use a lot of “always async” traits and I imagine that, each time they write <code>impl async HttpEngine</code>, they will think to themselves, “How many times do I have to tell the compiler this is async already?! We get it, we get it!!”</p>
<p>Put another way, the consistency argument (“how will I remember which traits need to be declared async?”) may not hold water in practice. I can imagine that for many applications the only “maybe async” traits are the core abstractions coming from libraries, like <code>Iterator</code>, and most of the other code is just “always async”. So actually it’s not that hard to remember which is which.</p>
<h2 id="and-its-not-clear-that-traits-will-go-from-always-to-maybe-async-anyway">…and it’s not clear that traits will go from “always” to “maybe” async anyway…</h2>
<p>But what about semver violations? Well, if my thesis above is correct, then it’s also true that there will be relatively few traits that need to go from “always async” to “maybe async”. Moreover, I imagine most libraries will know up front whether they expect to be sync or not. So maybe it’s not a big deal that this is a breaking change.</p>
<h2 id="and-trait-aliases-would-give-a-workaround-for-always---maybe-transitions-anyway">…and trait aliases would give a workaround for “always -&gt; maybe” transitions anyway…</h2>
<p>So, maybe it won’t happen in practice, but let’s imagine that we did define an always async <code>HttpEngine</code> and then later want to make the trait “maybe async”. Do we absolutely need a new major version of the crate? Not really; there is a workaround. We can define a new “maybe async” trait (let’s call it <code>HttpFetch</code>) and then redefine <code>HttpEngine</code> in terms of <code>HttpFetch</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// This is a trait alias. It’s an unstable feature that I would like to stabilize.
</span></span></span><span class="line"><span class="cl"><span class="c1">// Even without a trait alias, though, you could do this with a blanket impl.
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HttpEngine</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="n">HttpFetch</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HttpFetch</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This obviously isn’t ideal: you wind up with two names for the same underlying trait. Maybe you deprecate the old one. But it’s not the end of the world.</p>
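<p>For what it’s worth, the blanket-impl variant of this workaround can be sketched in today’s stable Rust. This is only an illustration under assumptions: stable Rust has no “maybe async” traits, so <code>HttpFetch</code> is a plain trait with an async method here, and <code>Url</code>, <code>EchoEngine</code>, <code>fetch_one</code>, and the tiny <code>block_on</code> helper are hypothetical stand-ins rather than part of any proposal:</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Minimal executor so the sketch runs without an async runtime.
// It busy-polls, which is only acceptable for futures that finish at once.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical stand-in for a real URL type.
struct Url(String);

// The new, more general trait. In the scenario above this would be the
// "maybe async" one; here it is just a trait with an async method.
trait HttpFetch {
    async fn fetch(&mut self, url: Url) -> Vec<u8>;
}

// The old name survives as an empty subtrait...
trait HttpEngine: HttpFetch {}

// ...and a blanket impl makes every `HttpFetch` type an `HttpEngine`.
impl<T: HttpFetch> HttpEngine for T {}

// Hypothetical engine that "fetches" by echoing the URL back as bytes.
struct EchoEngine;
impl HttpFetch for EchoEngine {
    async fn fetch(&mut self, url: Url) -> Vec<u8> {
        url.0.into_bytes()
    }
}

// Code written against the old name keeps compiling unchanged.
async fn fetch_one(mut engine: impl HttpEngine, url: Url) -> Vec<u8> {
    engine.fetch(url).await
}
```

<p>Callers only ever name <code>HttpEngine</code>, so a call like <code>block_on(fetch_one(EchoEngine, Url("abc".to_string())))</code> does not need to know that <code>HttpFetch</code> exists at all.</p>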
<h2 id="and-requiring-async-composes-poorly-with-supertraits-and-trait-aliases">…and requiring async composes poorly with supertraits and trait aliases…</h2>
<p>Actually, that last example brings up an interesting point. To truly ensure consistency, it’s not enough to say that “traits with async functions must be declared async”. We also need to be careful what we permit in trait aliases and supertraits. For example, imagine we have a trait <code>UrlIterator</code> that has an <code>async Iterator</code> as a supertrait…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">UrlIterator</span>: <span class="nc">async</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Url</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…now people could write functions that take an <code>impl UrlIterator</code>, but it will still require <code>await</code> when you invoke its methods. So we didn’t really achieve <em>consistency</em> after all. The same thing would apply with a trait alias like <code>trait UrlIterator = async Iterator&lt;Item = Url&gt;</code>.</p>
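<p>The effect can be approximated in stable Rust today. The sketch below assumes a hand-written <code>AsyncIterator</code> trait as a stand-in for <code>async Iterator</code> (with <code>next</code> returning an <code>Option</code>, as in the standard library); <code>Url</code>, <code>Urls</code>, <code>total_len</code>, and <code>block_on</code> are hypothetical helpers:</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Minimal busy-polling executor, good enough for immediately-ready futures.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical stand-in for a real URL type.
struct Url(String);

// Hand-written stand-in for `async Iterator`.
trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

// The subtrait's name no longer mentions `async` anywhere...
trait UrlIterator: AsyncIterator<Item = Url> {}
impl<T: AsyncIterator<Item = Url>> UrlIterator for T {}

// Hypothetical iterator that hands out URLs from a vector (last one first).
struct Urls(Vec<Url>);
impl AsyncIterator for Urls {
    type Item = Url;
    async fn next(&mut self) -> Option<Url> {
        self.0.pop()
    }
}

// ...yet every call to `next` through `impl UrlIterator` still needs `.await`.
async fn total_len(mut urls: impl UrlIterator) -> usize {
    let mut total = 0;
    while let Some(u) = urls.next().await {
        total += u.0.len();
    }
    total
}
```

<p>Nothing in the signature of <code>total_len</code> hints that its argument is async; that only shows up at the <code>.await</code> inside the body.</p>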
<p>It’s possible to imagine a requirement like “to have a supertrait that is async, the trait must be async”, but — to me — that feels non-compositional. I’d like to be able to declare a trait alias <code>trait A = …</code> and have the <code>…</code> be able to be any sort of trait bounds, whether they’re async or not. It feels funny to have the async propagate out of the <code>...</code> and onto the trait alias <code>A</code>.</p>
<h2 id="and-while-this-decision-is-hard-to-reverse-it-can-be-reversed">…and, while this decision is hard to reverse, it can be reversed.</h2>
<p>So, let’s say that we were to stabilize the ability to add async functions to any trait. And then later we find that we actually want to have maybe async traits and that we wish we had required people to write <code>async</code> explicitly all the time, because consistency and semver. Are we stuck?</p>
<p>Well, not really. There are options here. For example, we might make it <em>possible</em> to write <code>async</code> (but not required) and then lint and warn when people don’t. Perhaps in another edition, we would make it mandatory. This is basically what we did with the <code>dyn</code> keyword. Then we could declare that changing a trait from always-async to maybe-async is not considered worthy of a major version, because people’s code that follows the lints and warnings will not be affected. If we had transitioned so that all code in the new edition required an <code>async</code> keyword even for “always async” traits, we could let people declare a trait to be “maybe async but only in the new edition”, which would avoid all breakage entirely.</p>
<p>In any case, I don’t really want to do those things. It’d be embarrassing and confusing to stabilize SAFIT and then decide that “oh, no, you have to declare traits to be async”. I’d rather we just think through the arguments now and make a call. But it’s always good to know that, just in case you’re wrong, you have options.</p>
<h2 id="my-current-conclusion-yagni">My (current) conclusion: YAGNI</h2>
<p>So which way to go? I think the question hinges a lot on how common we expect “maybe async” code to be. My expectation is that, even if we do support it, “maybe async” will be fairly limited. It will mostly apply to (a) code like <code>Iterator</code> that is orthogonal to I/O and (b) core I/O primitives like the <code>Read</code> trait or the <code>File</code> type. If we’re especially successful, then crates like <code>reqwest</code> (which currently offers both a sync and async interface) would be able to unify those into one. But application code I expect to largely be written to be either sync or async.</p>
<p>I also think that it’ll be relatively unusual to go from “always async” to “maybe async”. Not impossible, but unusual <em>enough</em> that either making a new major version or using the “renaming” trick will be fine.</p>
<p><strong>For this reason, I lean towards NOT requiring <code>async trait</code>, and instead allowing <code>async fn</code> to be added to any trait.</strong> I am still hopeful we’ll add “maybe async” traits as well, but I think there won’t be a big problem of “always async” traits needing to change to maybe async. (Clearly we are going to want to go from “never async” to “maybe async”, since there are lots of traits like <code>Iterator</code> in the stdlib, but that’s a non-issue.)</p>
<p>The other argument in favor is that it’s closer to what we do today. There are lots of people using <code>#[async_trait]</code> and I’ve never heard anyone say “it’s so weird that you can write <code>T: HttpEngine</code> and don’t have to write <code>T: async HttpEngine</code>”. <strong>At minimum, if we were going to change to requiring the “async” keyword, I would want to give that change some time to bake on nightly before we stabilized it. This could well delay stabilization significantly.</strong></p>
<p>If, in contrast, you believed that lots of code was going to be “maybe async”, then I think you would probably want the async keyword to be mandatory on traits. After all, since most traits are maybe async anyway, you’re going to need to write it a lot of the time.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I can feel you fixating on the <code>#[maybe(async)]</code> syntax. Resist the urge! There is no concrete proposal yet.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Trait transformers (send bounds, part 3)</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/03/03/trait-transformers-send-bounds-part-3/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/03/03/trait-transformers-send-bounds-part-3/</id><published>2023-03-03T00:00:00+00:00</published><updated>2023-03-03T09:39:00-05:00</updated><content type="html"><![CDATA[<p>I previously introduced <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">the &ldquo;send bound&rdquo; problem</a>, which refers to the need to add a <code>Send</code> bound to the future returned by an async function. This post continues my tour of the available solutions, covering &ldquo;Trait Transformers&rdquo;. This proposal arose from a joint conversation between me, Eric Holk, Yoshua Wuyts, Oli Scherer, and Tyler Mandry. It&rsquo;s a variant of Eric Holk&rsquo;s <a href="https://blog.theincredibleholk.org/blog/2023/02/13/inferred-async-send-bounds/">inferred async send bounds</a> proposal as well as the work that Yosh/Oli have been doing in the <a href="https://blog.rust-lang.org/inside-rust/2023/02/23/keyword-generics-progress-report-feb-2023.html">keyword generics</a> group. Those posts are worth reading as well, lots of good ideas there.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<h2 id="core-idea-the-trait-transformer">Core idea: the trait transformer</h2>
<p>A <em>transformer</em> is a way for a single trait definition to define multiple variants of that trait. For example, where <code>T: Iterator</code> means that <code>T</code> implements the <code>Iterator</code> trait we know and love, <code>T: async Iterator</code> means that <code>T</code> implements the <em>async version</em> of <code>Iterator</code>. Similarly, <code>T: Send Iterator</code> means that <code>T</code> implements the <em>sendable version</em> of <code>Iterator</code> (we&rsquo;ll define both the &ldquo;sendable version&rdquo; and &ldquo;async version&rdquo; more precisely, don&rsquo;t worry).</p>
<p>Transformers can be combined, so you can write <code>T: async Send Iterator</code> to mean &ldquo;the async, sendable version&rdquo;. They can also be distributed, so you can write <code>T: async Send (Iterator + Factory)</code> to mean the &ldquo;async, sendable&rdquo; version of both <code>Iterator</code> and <code>Factory</code>.</p>
<p>There are 3 proposed transformers:</p>
<ul>
<li>async</li>
<li>const</li>
<li>any auto trait</li>
</ul>
<p>The set of transformers is defined by the language and is not user extensible. This could change in the future, as transformers can be seen as a kind of trait alias.</p>
<h2 id="the-async-transformer">The async transformer</h2>
<p>The async transformer is used to choose whether functions are sync or async. It can only be applied to traits that opt-in by specifying which methods should be made into sync or async. Traits can opt-in either by declaring the async transformer to be mandatory, as follows&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Fetch</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">fetch</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">url</span>: <span class="nc">Url</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Data</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;or by making it optional, in which case we call it a &ldquo;maybe-async&rdquo; trait&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">size_hint</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="p">(</span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, the trait <code>Iterator</code> is the same <code>Iterator</code> we&rsquo;ve always had, but <code>async Iterator</code> refers to the &ldquo;async version&rdquo; of <code>Iterator</code>, which means that it has an async <code>next</code> method (but still has a sync method <code>size_hint</code>).</p>
<p>(For the time being, maybe-async traits cannot have default methods, which avoids the need to deal with &ldquo;maybe-async&rdquo; code. This can change in the future.)</p>
<h3 id="trait-transformer-as-macros">Trait transformer as macros</h3>
<p>You can think of a trait transformer as being like a fancy kind of macro. When you write a maybe-async trait like <code>Iterator</code> above, you are effectively defining a <em>template</em> from which the compiler can derive a family of traits. You could think of the <code>#[maybe(async)]</code> annotation as a macro that derives two related traits, so that&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">size_hint</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="p">(</span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;would effectively expand into two traits, one with a sync <code>next</code> method and one with an <code>async</code> version&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Item</span><span class="p">;</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Item</span><span class="p">;</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;then, when you have a where-clause like <code>T: async Iterator</code>, the compiler would transform that to <code>T: AsyncIterator</code>. In fact, Oli and Yosh implemented a procedural macro crate that does more-or-less <em>exactly</em> this.</p>
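<p>To make the &ldquo;macro view&rdquo; concrete, here is what such an expansion might look like if written by hand in stable Rust. The names <code>Iter</code>, <code>AsyncIter</code>, <code>Countdown</code>, <code>sum_sync</code>, <code>sum_async</code>, and <code>block_on</code> are all hypothetical, and <code>next</code> returns an <code>Option</code> (as in the standard library) so the loops can terminate:</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Minimal busy-polling executor, good enough for immediately-ready futures.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Sync half of the hand-written "expansion".
trait Iter {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
    fn size_hint(&self) -> Option<(usize, usize)>;
}

// Async half: only `next` becomes async; `size_hint` stays sync.
trait AsyncIter {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
    fn size_hint(&self) -> Option<(usize, usize)>;
}

// Hypothetical type that counts down from its starting value to 1.
struct Countdown(u32);

impl Countdown {
    fn step(&mut self) -> Option<u32> {
        if self.0 == 0 {
            return None;
        }
        let n = self.0;
        self.0 -= 1;
        Some(n)
    }
}

impl Iter for Countdown {
    type Item = u32;
    fn next(&mut self) -> Option<u32> {
        self.step()
    }
    fn size_hint(&self) -> Option<(usize, usize)> {
        Some((self.0 as usize, self.0 as usize))
    }
}

impl AsyncIter for Countdown {
    type Item = u32;
    async fn next(&mut self) -> Option<u32> {
        self.step()
    }
    fn size_hint(&self) -> Option<(usize, usize)> {
        Some((self.0 as usize, self.0 as usize))
    }
}

// `T: Iter` plays the role of `T: Iterator`...
fn sum_sync<T: Iter<Item = u32>>(mut it: T) -> u32 {
    let mut sum = 0;
    while let Some(n) = it.next() {
        sum += n;
    }
    sum
}

// ...and `T: AsyncIter` plays the role of `T: async Iterator`.
async fn sum_async<T: AsyncIter<Item = u32>>(mut it: T) -> u32 {
    let mut sum = 0;
    while let Some(n) = it.next().await {
        sum += n;
    }
    sum
}
```

<p>The two impls and the two summing functions are nearly identical, which is the duplication a built-in transformer would be meant to remove.</p>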
<p>The idea with trait transformers though is not to literally do expansions like the ones above, but rather to build those mechanisms into the compiler. This makes them more efficient, and also paves the way for us to have code that is generic over whether or not it is async, or expand the list of modifiers. But the &ldquo;macro view&rdquo; is useful to have in mind.</p>
<h3 id="always-async-traits">Always async traits</h3>
<p>When a trait is declared like <code>async trait Fetch</code>, it only defines an async version, and it is an error to request the sync version like <code>T: Fetch</code>; you must write <code>T: async Fetch</code>.</p>
<p>Defining an async method without being always-async or maybe-async is disallowed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Fetch</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">fetch</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">url</span>: <span class="nc">Url</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Data</span><span class="p">;</span><span class="w"> </span><span class="c1">// ERROR
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Forbidding traits of this kind means that traits can move from &ldquo;always async&rdquo; to &ldquo;maybe async&rdquo; without a breaking change. See the frequently asked questions for more details.</p>
<h2 id="the-const-transformer">The const transformer</h2>
<p>The const transformer works similarly to <code>async</code>. One can write</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(const)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Compute</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cp">#[maybe(const)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">a</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">b</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and then if you write <code>T: const Compute</code> it means that <code>a</code> must be a <code>const fn</code> but <code>b</code> need not be. Similarly one could write <code>const trait Compute</code> to indicate that the <code>const</code> transformer is mandatory.</p>
<h2 id="the-auto-trait-transformer">The auto-trait transformer</h2>
<p>Auto-traits can be used as a transformer. This is permitted on any (maybe) async trait or on traits that explicitly opt-in by defining <code>#[maybe(Send)]</code> variants. The default behavior of <code>T: Send Foo</code> for some trait <code>Foo</code> is that&hellip;</p>
<ul>
<li><code>T</code> must be <code>Send</code></li>
<li>the future returned by any async method in <code>Foo</code> must be <code>Send</code></li>
<li>the value returned by any RPITIT method must be <code>Send</code><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></li>
</ul>
<p>Per these rules, given:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>writing <code>T: async Send Iterator</code> would be equivalent to:</p>
<ul>
<li><code>T: async Iterator&lt;next(): Send&gt; + Send</code></li>
</ul>
<p>using the <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/">return type notation</a>.</p>
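<p>As a rough point of comparison, the &ldquo;async, sendable&rdquo; variant can be spelled out by hand in stable Rust by writing the method with an explicit <code>-&gt; impl Future + Send</code> return type. This is a sketch under assumptions, not the proposed mechanism: <code>SendAsyncIter</code>, <code>Countdown</code>, <code>require_send</code>, <code>first_on_thread</code>, and <code>block_on</code> are hypothetical names, and <code>next</code> returns an <code>Option</code> as in the standard library:</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

// Minimal busy-polling executor, good enough for immediately-ready futures.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hand-written counterpart of `T: async Send Iterator`:
// the implementor must be `Send`, and so must the future from `next`.
trait SendAsyncIter: Send {
    type Item;
    fn next(&mut self) -> impl Future<Output = Option<Self::Item>> + Send;
}

// Hypothetical type that counts down from its starting value to 1.
struct Countdown(u32);

impl SendAsyncIter for Countdown {
    type Item = u32;
    // An `async fn` may implement the `-> impl Future + Send` signature;
    // the compiler checks that the resulting future really is `Send`.
    async fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            return None;
        }
        let n = self.0;
        self.0 -= 1;
        Some(n)
    }
}

// Compiles only if `value` is `Send`.
fn require_send<T: Send>(value: T) -> T {
    value
}

// Moves the iterator to another thread (needs `T: Send`) and checks that
// the future returned by `next` is `Send` before driving it there.
fn first_on_thread<T>(mut iter: T) -> Option<T::Item>
where
    T: SendAsyncIter + 'static,
    T::Item: Send + 'static,
{
    thread::spawn(move || block_on(require_send(iter.next())))
        .join()
        .expect("worker thread panicked")
}
```

<p>The cost of this hand-written form is that the <code>Send</code> requirement is baked into the trait itself, whereas the transformer would let each caller opt in.</p>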
<p>The <code>#[maybe(Send)]</code> annotation can be applied to associated types or functions&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(Send)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">IntoIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cp">#[maybe(Send)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">IntoIter</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;in which case writing <code>T: Send IntoIterator</code> would expand to <code>T: IntoIterator&lt;IntoIter: Send&gt; + Send</code>.</p>
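<p>The expanded bound is ordinary Rust that can already be written against the standard <code>IntoIterator</code> trait. A small sketch, where <code>count_on_thread</code> is a hypothetical helper and the bound is written in where-clause form (<code>T::IntoIter: Send</code>), which means the same thing as <code>IntoIterator&lt;IntoIter: Send&gt;</code>:</p>

```rust
use std::thread;

// Hand-expanded version of the hypothetical `T: Send IntoIterator` bound.
fn count_on_thread<T>(collection: T) -> usize
where
    T: IntoIterator + Send + 'static,
    T::IntoIter: Send + 'static,
{
    // Needs `T: Send`: the collection itself moves to the first thread.
    // Returning the iterator from that thread needs `T::IntoIter: Send`.
    let iter = thread::spawn(move || collection.into_iter())
        .join()
        .expect("first worker panicked");

    // Needs `T::IntoIter: Send` again: the iterator moves to a second thread.
    thread::spawn(move || iter.count())
        .join()
        .expect("second worker panicked")
}
```

<p>A call such as <code>count_on_thread(vec![1, 2, 3])</code> type-checks because both <code>Vec&lt;i32&gt;</code> and its iterator are <code>Send</code>; a <code>Vec&lt;Rc&lt;i32&gt;&gt;</code> would be rejected at compile time.</p>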
<h2 id="frequently-asked-questions">Frequently asked questions</h2>
<h3 id="how-is-this-different-from-eholks-inferred-async-send-bounds">How is this different from eholk&rsquo;s <a href="https://blog.theincredibleholk.org/blog/2023/02/13/inferred-async-send-bounds/">Inferred Async Send Bounds</a>?</h3>
<p>Eric&rsquo;s proposal was similar in that it permitted <code>T: async(Send) Foo</code> as a similar sort of &ldquo;macro&rdquo; to get a bound that included <code>Send</code> bounds on the resulting futures. In that proposal, though, the &ldquo;send bounds&rdquo; were tied to the use of async sugar, which means that you could no longer consider <code>async fn</code> to be sugar for a function with an <code>-&gt; impl Future</code> return type. That seemed like a bad thing, particularly since explicit <code>-&gt; impl Future</code> syntax is the only way to write an async fn that doesn&rsquo;t capture all of its arguments.</p>
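<p>To illustrate that last point about captures, here is a small sketch of the explicit form. It assumes a recent stable compiler (the <code>use&lt;&gt;</code> precise-capturing syntax postdates this post and is used only to state that the returned future captures no lifetimes); <code>len_later</code>, <code>demo</code>, and <code>block_on</code> are hypothetical names:</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Minimal busy-polling executor, good enough for immediately-ready futures.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// An `async fn len_later(s: &str) -> usize` would return a future that
// borrows `s` for as long as the future is alive.
//
// The explicit form can use `s` eagerly and return a future that holds only
// the computed length. `use<>` says the return type captures no lifetimes.
fn len_later(s: &str) -> impl Future<Output = usize> + use<> {
    let n = s.len();
    async move { n }
}

fn demo() -> usize {
    let owned = String::from("hello");
    let fut = len_later(&owned);
    drop(owned); // allowed: `fut` does not borrow `owned`
    block_on(fut)
}
```

<p>With an <code>async fn</code> in place of <code>len_later</code>, the <code>drop(owned)</code> line would be rejected, because the future would still be borrowing the string.</p>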
<h3 id="how-is-this-different-from-the-keyword-generics-post">How is this different from the <a href="https://blog.rust-lang.org/inside-rust/2023/02/23/keyword-generics-progress-report-feb-2023.html">keyword generics</a> post?</h3>
<p>Yosh and Oli posted a <a href="https://blog.rust-lang.org/inside-rust/2023/02/23/keyword-generics-progress-report-feb-2023.html">keyword generics update</a> that included notation for &ldquo;maybe async&rdquo; traits (they wrote <code>?async</code>) along with some other things. The ideas in this post are very similar to those; the main difference is treating <code>Send</code> as an independent transformer, similar to the previous question.</p>
<h3 id="should-the-auto-trait-transformer-be-specific-to-each-auto-trait-or-generic">Should the auto-trait transformer be specific to each auto-trait, or generic?</h3>
<p>As written, the auto-trait transformer is specific to a particular auto-trait, but it might be useful to be able to be generic over multiple (e.g., if you are maybe Send, you likely want to be maybe Send-Sync too, right?). You could imagine writing <code>#[maybe(auto)]</code> instead of <code>#[maybe(Send)]</code>, but that&rsquo;s kind of confusing, because an &ldquo;always-auto&rdquo; trait (i.e., an auto trait like Send) is quite a different thing from a &ldquo;maybe-auto&rdquo; trait (i.e., a trait that has a &ldquo;sendable version&rdquo;). OTOH users can&rsquo;t define their own auto traits and likely will never be able to. Unclear.</p>
<h3 id="why-make-auto-trait-transformer-be-opt-in">Why make auto-trait transformer be opt-in?</h3>
<p>You can imagine letting <code>T: Send Foo</code> mean <code>T: Foo + Send</code> for all traits <code>Foo</code>, without requiring <code>Foo</code> to be declared as <code>maybe(Send)</code>. The problem is that this would mean that customizing the <code>Send</code> version of a trait for the first time is a semver breaking change, and so must be done at the same time the trait is introduced. This implies that no existing trait in the ecosystem could customize its <code>Send</code> version. Seems bad.</p>
<h3 id="will-you-permit-async-methods-without-the-async-transformer-why-or-why-not">Will you permit <code>async</code> methods without the async transformer? Why or why not?</h3>
<p>No. The following trait&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Http</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">fetch</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;would get an error like &ldquo;cannot use <code>async</code> in a trait unless it is declared as <code>async</code> or <code>#[maybe(async)]</code>&rdquo;. Ensuring that people write <code>T: async Http</code> and not just <code>T: Http</code> means that the trait can become &ldquo;maybe async&rdquo; later without breaking those clients. It also means that people don&rsquo;t have to remember (when writing async code) whether a trait is &ldquo;maybe async&rdquo; or &ldquo;always async&rdquo; so they know whether to write <code>T: async Http</code> (for maybe-async traits) or <code>T: Http</code> (for always-async). This way, if the trait has async methods, you write <code>async</code>.</p>
<h3 id="why-did-you-label-methods-in-a-maybeasync-trait-as-maybeasync-instead-of-async">Why did you label methods in a <code>#[maybe(async)]</code> trait as <code>#[maybe(async)]</code> instead of <code>async</code>?</h3>
<p>In the examples, I wrote maybe(async) traits like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Personally, I rather prefer the idea that inside a <code>#[maybe(async)]</code> block, you define the trait as if it were <em>always</em> async&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(async)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;but then the async gets removed when used in a sync context. However, I changed it because I couldn&rsquo;t figure out the right way to permit <code>#[maybe(Send)]</code> in this scenario. I can also imagine that it&rsquo;s a bit confusing to write <code>async fn</code> when you mean &ldquo;maybe async&rdquo;.</p>
<h3 id="why-use-an-annotation--like-maybeasync-instead-of-a-keyword">Why use an annotation (<code>#[..]</code>) like <code>#[maybe(async)]</code> instead of a keyword?</h3>
<p>I don&rsquo;t know, because <code>?async</code> is hard to read, and we&rsquo;ve got enough keywords? I&rsquo;m open to bikeshedding here.</p>
<h3 id="do-we-still-want-return-type-notation">Do we still want <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/">return type notation</a>?</h3>
<p>Yes, <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/">RTN</a> is useful for giving more precise specification of which methods should return send-futures (you may not want to require that <em>all</em> async methods are send, for example). It&rsquo;s also needed internally by the compiler anyway as the &ldquo;desugaring target&rdquo; for the <code>Send</code> transformer.</p>
<h3 id="can-we-allow-maybe-on-typesfunctions">Can we allow <code>#[maybe]</code> on types/functions?</h3>
<p>Maybe!<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> That&rsquo;s basically full-on keyword generics. This proposal is meant as a stepping stone. It doesn&rsquo;t permit code or types to be generic over whether they are async/send/whatever, but it does permit us to define multiple versions of a trait. To the language, it&rsquo;s effectively a kind of <em>macro</em>, so that (e.g.) a single trait definition <code>#[maybe(async)] trait Iterator</code> effectively defines two traits, <code>Iterator</code> and <code>AsyncIterator</code>, and the <code>T: async Iterator</code> notation is being used to select the second one. (This is only an example; I don&rsquo;t mean that users would literally be able to reference an <code>AsyncIterator</code> trait.)</p>
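<p>To make the &ldquo;macro&rdquo; intuition concrete, here is a rough sketch of the two traits that one <code>#[maybe(async)]</code> definition could be thought of as producing, written out by hand in today&rsquo;s Rust (1.75 or later for <code>async fn</code> in traits). The names <code>SyncIter</code>, <code>AsyncIter</code>, <code>Countdown</code>, and the tiny <code>block_on</code> helper are all invented for this example, and <code>next</code> returns an <code>Option</code> so that the iterator can end; nothing here is the actual expansion.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical "sync" version selected by `T: Iterator`.
trait SyncIter {
    type Item;
    fn next(&amp;mut self) -&gt; Option&lt;Self::Item&gt;;
}

// Hypothetical "async" version selected by `T: async Iterator`.
trait AsyncIter {
    type Item;
    async fn next(&amp;mut self) -&gt; Option&lt;Self::Item&gt;;
}

struct Countdown(u32);

impl SyncIter for Countdown {
    type Item = u32;
    fn next(&amp;mut self) -&gt; Option&lt;u32&gt; {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0)
        }
    }
}

impl AsyncIter for Countdown {
    type Item = u32;
    async fn next(&amp;mut self) -&gt; Option&lt;u32&gt; {
        // Same logic as the sync version, just wrapped in a future.
        SyncIter::next(self)
    }
}

// Minimal executor for futures that never actually wait on anything.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc&lt;Self&gt;) {}
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&amp;waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&amp;mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut c = Countdown(2);
    assert_eq!(SyncIter::next(&amp;mut c), Some(1));
    assert_eq!(block_on(AsyncIter::next(&amp;mut c)), Some(0));
    assert_eq!(block_on(AsyncIter::next(&amp;mut c)), None);
}
</code></pre></div><p>The point of the transformer is that you would write the trait once and not maintain both copies yourself.</p>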
<h3 id="what-order-are-transformers-applied">What order are transformers applied?</h3>
<p>Transformers must be written according to this grammar:</p>
<pre tabindex="0"><code>Trait := async? const? Path* Path
</code></pre><p>where <code>x?</code> means optional <code>x</code>, <code>x*</code> means zero or more <code>x</code>, and the traits named in <code>Path*</code> must be auto-traits. The transformers (if present) are applied in order, so first things are made async, then const, then sendable. (I&rsquo;m not sure if both async and const make any sense?)</p>
<h3 id="can-auto-trait-transformers-let-us-genearlize-over-rcarc">Can auto-trait transformers let us generalize over rc/arc?</h3>
<p>Yosh at some point suggested that we could think of &ldquo;send&rdquo; or &ldquo;not send&rdquo; as another application of <a href="https://blog.rust-lang.org/inside-rust/2023/02/23/keyword-generics-progress-report-feb-2023.html">keyword generics</a>, and that got me very excited. It&rsquo;s a known problem that people have to define two versions of their structs (see e.g. the <a href="https://crates.io/crates/im">im</a> and <a href="https://crates.io/crates/im-rc">im-rc</a> crates). Maybe we could permit something like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[maybe(Send)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Shared</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* either Rc&lt;T&gt; or Arc&lt;T&gt;, depending */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and then permit variables of type <code>Shared&lt;u32&gt;</code> or <code>Send Shared&lt;u32&gt;</code>. The <a href="https://blog.rust-lang.org/inside-rust/2023/02/23/keyword-generics-progress-report-feb-2023.html">keyword generics</a> proposals already are exploring the idea of structs whose types vary depending on whether they are async or not, so this fits in.</p>
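<p>For a sense of what this would replace, here is one way to approximate a &ldquo;maybe Send&rdquo; struct on stable Rust today, using a generic associated type to pick between <code>Rc</code> and <code>Arc</code>. This is only a sketch: <code>PointerFamily</code>, <code>Local</code>, and <code>Threaded</code> are names I made up for the example, and the extra type parameter on <code>Shared</code> is exactly the noise a <code>Send</code> transformer might hide.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::ops::Deref;
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical "family" trait that chooses the pointer type.
trait PointerFamily {
    type Ptr&lt;T&gt;: Deref&lt;Target = T&gt; + Clone;
    fn new_ptr&lt;T&gt;(value: T) -&gt; Self::Ptr&lt;T&gt;;
}

struct Local; // plays the role of `Shared&lt;T&gt;`
struct Threaded; // plays the role of `Send Shared&lt;T&gt;`

impl PointerFamily for Local {
    type Ptr&lt;T&gt; = Rc&lt;T&gt;;
    fn new_ptr&lt;T&gt;(value: T) -&gt; Self::Ptr&lt;T&gt; {
        Rc::new(value)
    }
}

impl PointerFamily for Threaded {
    type Ptr&lt;T&gt; = Arc&lt;T&gt;;
    fn new_ptr&lt;T&gt;(value: T) -&gt; Self::Ptr&lt;T&gt; {
        Arc::new(value)
    }
}

struct Shared&lt;T, F: PointerFamily&gt; {
    inner: F::Ptr&lt;T&gt;, // either Rc&lt;T&gt; or Arc&lt;T&gt;, depending on `F`
}

impl&lt;T, F: PointerFamily&gt; Shared&lt;T, F&gt; {
    fn new(value: T) -&gt; Self {
        Shared { inner: F::new_ptr(value) }
    }

    fn get(&amp;self) -&gt; &amp;T {
        self.inner.deref()
    }
}

// Compile-time check that a value is `Send`.
fn assert_send&lt;X: Send&gt;(_: &amp;X) {}

fn main() {
    let local: Shared&lt;u32, Local&gt; = Shared::new(1);
    let threaded: Shared&lt;u32, Threaded&gt; = Shared::new(2);
    assert_send(&amp;threaded); // `assert_send(&amp;local)` would not compile
    assert_eq!(*local.get() + *threaded.get(), 3);
}
</code></pre></div>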
<h2 id="conclusion">Conclusion</h2>
<p>This post covered &ldquo;trait transformers&rdquo; as a possible solution to the <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">&ldquo;send bounds&rdquo;</a> problem. Trait transformers are not exactly an <em>alternative</em> to the <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/">return type notation</a> proposed earlier; they are more like a complement, in that they make the &ldquo;easy things easy&rdquo;, but effectively provide a convenient desugaring to uses of <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/">return type notation</a>.</p>
<p>The full set of solutions thus far are&hellip;</p>
<ul>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/">Return type notation (RTN)</a>
<ul>
<li><em>Example:</em> <code>T: Fetch&lt;fetch(): Send&gt;</code></li>
<li><em>Pros:</em> flexible and expressive</li>
<li><em>Cons:</em> verbose</li>
</ul>
</li>
<li>eholk&rsquo;s <a href="https://blog.theincredibleholk.org/blog/2023/02/13/inferred-async-send-bounds/">inferred async send bounds</a>
<ul>
<li><em>Example:</em> <code>T: async(Send) Fetch</code></li>
<li><em>Pros:</em> concise</li>
<li><em>Cons:</em> specific to async notation, doesn&rsquo;t support <code>-&gt; impl Future</code> functions; requires RTN for completeness</li>
</ul>
</li>
<li>trait transformers (this post)
<ul>
<li><em>Example:</em> <code>T: async Send Fetch</code></li>
<li><em>Pros:</em> concise</li>
<li><em>Cons:</em> requires RTN for completeness</li>
</ul>
</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I originally planned to have part 3 of this series simply summarize those posts, in fact, but I consider Trait Transformers an evolution of those ideas, and close enough that I&rsquo;m not sure separate posts are needed.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>It&rsquo;s unclear if <code>Send Foo</code> should always convert <a href="https://rust-lang.github.io/impl-trait-initiative/RFCs/rpit-in-traits.html">RPITIT</a> return values to be <code>Send</code>, but it <em>is</em> clear that we want some way to permit one to write <code>-&gt; impl Future</code> in a trait and have that be <code>Send</code> iff async methods are <code>Send</code>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>See what I did there?&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/send-bound-problem" term="send-bound-problem" label="Send bound problem"/></entry><entry><title type="html">Return type notation (send bounds, part 2)</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/02/13/return-type-notation-send-bounds-part-2/</id><published>2023-02-13T00:00:00+00:00</published><updated>2023-02-13T11:14:00-05:00</updated><content type="html"><![CDATA[<p>In the <a href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/">previous post</a>, I introduced the “send bound” problem, which refers to the need to add a <code>Send</code> bound to the future returned by an async function. I want to start talking about some of the ideas that have been floating around for how to solve this problem. I consider this a bit of an open problem, in that I think we know a lot of the ingredients, but there is a bit of a “delicate balance” to finding the right syntax and so forth. To start with, though, I want to introduce Return Type Notation, which is an idea that Tyler Mandry and I came up with for referring to the type returned by a trait method.</p>
<h3 id="recap-of-the-problem">Recap of the problem</h3>
<p>If we have a trait <code>HealthCheck</code> that has an async function <code>check</code>…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HealthCheck</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">check</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…and then a function that is going to call that method <code>check</code> but in a parallel task…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…we don’t currently have a way to say that the future returned by calling <code>H::check()</code> is send. The where clause <code>H: HealthCheck + Send</code> says that the type <code>H</code> must be send, but it says nothing about the future that gets returned from calling <code>check</code>.</p>
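<p>To see where the missing bound bites, here is a sketch of one workaround that I believe works on stable Rust (1.75 or later): write the trait method as <code>-&gt; impl Future + Send</code> by hand, so that every implementation promises a sendable future. The <code>Counter</code> type, the <code>spawn</code> function (a stand-in for a runtime&rsquo;s spawn), and the small <code>block_on</code> executor are all invented for this example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

struct Server;

trait HealthCheck {
    // Desugared by hand so that `Send` can be promised for every impl.
    fn check(&amp;mut self, server: Server) -&gt; impl Future&lt;Output = ()&gt; + Send;
}

// Hypothetical implementation that just counts how often it was called.
struct Counter(Arc&lt;AtomicUsize&gt;);

impl HealthCheck for Counter {
    async fn check(&amp;mut self, _server: Server) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

// Minimal executor for futures that never actually wait on anything.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc&lt;Self&gt;) {}
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&amp;waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&amp;mut cx) {
            return out;
        }
    }
}

// Stand-in for a runtime's `spawn`: the future itself must be `Send`.
fn spawn&lt;F&gt;(fut: F) -&gt; thread::JoinHandle&lt;()&gt;
where
    F: Future&lt;Output = ()&gt; + Send + 'static,
{
    thread::spawn(move || block_on(fut))
}

fn start_health_check&lt;H&gt;(mut health_check: H, server: Server) -&gt; thread::JoinHandle&lt;()&gt;
where
    H: HealthCheck + Send + 'static,
{
    // Accepted only because the trait guarantees the `check` future is `Send`.
    spawn(async move { health_check.check(server).await })
}

fn main() {
    let hits = Arc::new(AtomicUsize::new(0));
    start_health_check(Counter(hits.clone()), Server).join().unwrap();
    assert_eq!(hits.load(Ordering::SeqCst), 1);
}
</code></pre></div><p>The cost is that the decision is made once, in the trait, for all implementations and all callers; the rest of this post is about letting the caller ask for <code>Send</code> only where it is needed.</p>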
<h3 id="core-idea-a-way-to-name-the-type-returned-by-a-function">Core idea: A way to name “the type returned by a function”</h3>
<p>The core idea of return-type notation is to let you write where-clauses that apply to <code>&lt;H as HealthCheck&gt;::check(..)</code>, which means “any return type you can get by calling <code>check</code> as defined in the impl of <code>HealthCheck</code> for <code>H</code>”. This notation is meant to be reminiscent of the fully qualified notation for associated types, e.g. <code>&lt;T as Iterator&gt;::Item</code>. Just as we usually abbreviate associated types to <code>T::Item</code>, you would also typically abbreviate return type notation to <code>H::check(..)</code>. The trait name is only needed when there is ambiguity.</p>
<p>Here is an example of how <code>start_health_check</code> would look using this notation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>::<span class="n">check</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="c1">// &lt;-- return type notation
</span></span></span></code></pre></div><p>Here the where clause <code>H::check(..): Send</code> means “the type(s) returned when you call <code>H::check</code> must be <code>Send</code>”. Since async functions return a future, this means that future must implement <code>Send</code>.</p>
<h2 id="more-compact-notation">More compact notation</h2>
<p>Although it has not yet been stabilized, <a href="https://rust-lang.github.io/rfcs/2289-associated-type-bounds.html?highlight=associated#">RFC #2289</a> proposed a shorthand way to write bounds on associated types; something like <code>T: Iterator&lt;Item: Send&gt;</code> means “<code>T</code> implements <code>Iterator</code> and its associated type <code>Item</code> implements <code>Send</code>”. We can apply that same sugar to return-type notations:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="o">&lt;</span><span class="n">check</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//             ^^^^^^^^^
</span></span></span></code></pre></div><p>This is more concise, though also clearly kind of repetitive. (When I read it, I think “how many dang times do I have to write <code>Send</code>?” But for now we’re just trying to explore the idea, not evaluate its downsides, so let’s hold on that thought.)</p>
<h2 id="futures-capture-their-arguments">Futures capture their arguments</h2>
<p>Note that the where clause we wrote was</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">H</span>::<span class="n">check</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span>
</span></span></code></pre></div><p>and not</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">H</span>::<span class="n">check</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span> <span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="w">
</span></span></span></code></pre></div><p>Moreover, if we were to add a <code>'static</code> bound, the program would not compile. Why is that? The reason is that async functions in Rust desugar to returning a future that captures all of the function’s arguments:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HealthCheck</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// async fn check(&amp;mut self, server: Server);
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">check</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;s</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;s</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           ^^^^^^^^^^^^                                                ^^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         The future captures `self`, so it requires the lifetime bound `&#39;s` 
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Because the future being returned captures <code>self</code>, and <code>self</code> has type <code>&amp;'s mut Self</code>, the <code>Future</code> returned must capture <code>'s</code>. Therefore, it is not <code>'static</code>, and so the where-clause <code>H::check(..): Send + 'static</code> doesn’t hold for all possible calls to <code>check</code>, since you are not required to give an argument of type <code>&amp;'static mut Self</code>.</p>
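<p>The capture can be observed directly on stable Rust (1.75 or later) by writing the desugared signature by hand. In this sketch, <code>Counter</code> and the small <code>block_on</code> executor are made up for the example; the interesting part is that <code>counter</code> stays mutably borrowed for as long as the future is alive.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Server;

trait HealthCheck {
    // Hand-written equivalent of `async fn check(&amp;mut self, server: Server);`
    fn check&lt;'s&gt;(&amp;'s mut self, server: Server) -&gt; impl Future&lt;Output = ()&gt; + 's;
}

// Hypothetical implementation that counts completed checks.
struct Counter {
    checks: u32,
}

impl HealthCheck for Counter {
    fn check&lt;'s&gt;(&amp;'s mut self, _server: Server) -&gt; impl Future&lt;Output = ()&gt; + 's {
        // The async block holds on to `self`, so the future cannot outlive `'s`.
        async move {
            self.checks += 1;
        }
    }
}

// Minimal executor for futures that never actually wait on anything.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc&lt;Self&gt;) {}
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&amp;waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&amp;mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut counter = Counter { checks: 0 };
    let fut = counter.check(Server);
    // Reading `counter.checks` at this point would be a borrow error,
    // because `fut` still holds the `&amp;mut` borrow.
    block_on(fut);
    assert_eq!(counter.checks, 1);
}
</code></pre></div>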
<h2 id="rtn-with-specific-parameter-types">RTN with specific parameter types</h2>
<p>Most of the time, you would use RTN to bound all possible return values from the function. But sometimes you might want to be more specific, and talk just about the return value for some specific argument types. As a silly example, we could have a function like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">call_check_with_static</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">h</span>: <span class="kp">&amp;</span><span class="nb">&#39;static</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">H</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="n">H</span>::<span class="n">check</span><span class="p">(</span><span class="o">&amp;</span><span class="nb">&#39;static</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">H</span><span class="p">,</span><span class="w"> </span><span class="n">Server</span><span class="p">)</span>: <span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span></code></pre></div><p>This function has a generic parameter <code>H</code> that is <code>'static</code> and it gets a <code>&amp;'static mut H</code> as argument. The where clause <code>H::check(&amp;'static mut H, Server): 'static</code> then says: if I call <code>check</code> with the argument <code>&amp;'static mut H</code>, it will return a <code>'static</code> future. In contrast to the previous section, where we were talking about any possible return value from <code>check</code>, this where-clause is true and valid.</p>
<h2 id="desugaring-rtn-to-associated-types">Desugaring RTN to associated types</h2>
<p>To understand what RTN does, it’s best to think of the desugaring from async functions to associated types. This desugaring is exactly how Rust works internally, but we are not proposing to expose it to users directly, for reasons I’ll elaborate in a bit.</p>
<p>We saw earlier how an <code>async fn</code> desugars to a function that returns <code>impl Future</code>. Well, in a trait, returning <code>impl Future</code> can itself be desugared to a trait with a (generic) associated type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HealthCheck</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// async fn check(&amp;mut self, server: Server);
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Check</span><span class="o">&lt;</span><span class="na">&#39;t</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;t</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">check</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;s</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Check</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When we write a where-clause like <code>H::check(..): Send</code>, that is then effectively a bound on this hidden associated type <code>Check</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">H</span>::<span class="n">Check</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="o">&gt;</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="c1">// &lt;-- equivalent to `H::check(..): Send`
</span></span></span></code></pre></div><h2 id="generic-methods">Generic methods</h2>
<p>It is also possible to have generic async functions in traits. Imagine that instead of <code>HealthCheck</code> taking a specific <code>Server</code> type, we wanted to accept any type that implements the trait <code>ServerTrait</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HealthCheckGeneric</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">check_gen</span><span class="o">&lt;</span><span class="n">S</span>: <span class="nc">ServerTrait</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">S</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We can still think of this trait as desugaring to a trait with an associated type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HealthCheckGeneric</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// async fn check_gen&lt;S&gt;(&amp;mut self, server: S) where S: ServerTrait,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">CheckGen</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">t</span><span class="p">,</span><span class="w"> </span><span class="n">S</span>: <span class="nc">ServerTrait</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">&#39;</span><span class="n">t</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">check_gen</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">s</span><span class="p">,</span><span class="w"> </span><span class="n">S</span>: <span class="nc">ServerTrait</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="err">&#39;</span><span class="n">s</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">S</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">CheckGen</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">s</span><span class="p">,</span><span class="w"> </span><span class="n">S</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But if we want to write a where-clause like <code>H::check_gen(..): Send</code>, this would require us to support higher-ranked trait bounds over <em>types</em> and not just lifetimes:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheckGeneric</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">S</span><span class="o">&gt;</span><span class="w"> </span><span class="n">H</span>::<span class="n">CheckGen</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">S</span><span class="o">&gt;</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="c1">// &lt;--
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//     ^ for all types S…
</span></span></span></code></pre></div><p>As it happens, this sort of where-clause is something the <a href="https://blog.rust-lang.org/2023/01/20/types-announcement.html">types team</a> is working on in our new solver design. I’m going to skip over the details, as it’s kind of orthogonal to the topic of how to write <code>Send</code> bounds.</p>
<p>One final note: just as you can specify a particular value for the argument types, you should be able to use turbofish to specify the value for generic parameters. So something like <code>H::check_gen::&lt;MyServer&gt;(..): Send</code> would mean “whenever you call <code>check_gen</code> on <code>H</code> with <code>S = MyServer</code>, the return type is <code>Send</code>”.</p>
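<p>RTN itself is not available on stable Rust, but the turbofish half of this idea can be approximated today when the implementing type is concrete. The sketch below is only an illustration under stated assumptions: <code>MyServer</code>, <code>CountingCheck</code>, <code>require_send</code>, and the tiny <code>block_on</code> executor are hypothetical helpers, not part of the proposal. It calls <code>check_gen::&lt;MyServer&gt;</code> on a known type and asks the compiler to confirm that this particular future is <code>Send</code>.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for the post's `ServerTrait` and one concrete server.
trait ServerTrait {
    fn name(&self) -> String;
}

struct MyServer;

impl ServerTrait for MyServer {
    fn name(&self) -> String {
        String::from("my-server")
    }
}

trait HealthCheckGeneric {
    async fn check_gen<S: ServerTrait>(&mut self, server: S);
}

// Hypothetical implementor that counts how many checks have run.
struct CountingCheck {
    checks: u32,
}

impl HealthCheckGeneric for CountingCheck {
    async fn check_gen<S: ServerTrait>(&mut self, server: S) {
        let _name = server.name();
        self.checks += 1;
    }
}

// Compiles only when `T: Send`; used as a compile-time assertion.
fn require_send<T: Send>(value: T) -> T {
    value
}

// Waker that does nothing; enough for futures that never actually wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Tiny executor that polls in a loop. The futures in this sketch finish on
// their first poll, so the loop runs once.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn checks_after_one_call() -> u32 {
    let mut h = CountingCheck { checks: 0 };
    // Turbofish fixes `S = MyServer`; the type `H` is concrete here.
    let fut = <CountingCheck as HealthCheckGeneric>::check_gen::<MyServer>(&mut h, MyServer);
    // For this one choice of `H` and `S`, the returned future is `Send`.
    block_on(require_send(fut));
    h.checks
}
```

<p>This only works because <code>CountingCheck</code> is a concrete type. Inside a function that is generic over <code>H</code>, the same <code>require_send</code> call would be rejected, which is the gap a bound like <code>H::check_gen::&lt;MyServer&gt;(..): Send</code> is meant to fill.</p>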
<h2 id="using-rtn-outside-of-where-clauses">Using RTN outside of where-clauses</h2>
<p>So far, all the examples I’ve shown you for RTN involved a where-clause. That is the most important context, but it should be possible to write RTN types any place you write a type. For the most part, this is just fine, but using the <code>..</code> notation outside of a where-clause introduces some additional complications. Think of <code>H::check</code>: the precise type that is returned will depend on the lifetime of the first argument. So we could have one type <code>H::check(&amp;&#39;a mut H, Server)</code> and the return value would reference the lifetime <code>&#39;a</code>, but we could also have <code>H::check(&amp;&#39;b mut H, Server)</code>, and the return value would reference the lifetime <code>&#39;b</code>. The <code>..</code> notation really names a <em>range</em> of types. For the time being, I think we would simply say that <code>..</code> is not allowed outside of a where-clause, but there are ways that you could make it make sense (e.g., it might be valid only when the return type doesn’t depend on the types of the parameters).</p>
<h2 id="frequently-asked-questions">“Frequently asked questions”</h2>
<p>That sums up our tour of the “return-type-notation” idea. In short:</p>
<ul>
<li>You can write bounds like <code>&lt;T as Trait&gt;::method(..): Send</code> in a where-clause to mean “the method <code>method</code> from the impl of <code>Trait</code> for <code>T</code> returns a value that is <code>Send</code>, no matter what parameters I give it”.</li>
<li>Like an associated type, this would more commonly be written <code>T::method(..)</code>, with the trait automatically determined.</li>
<li>You could also specify precise types for the parameters and/or generic types, like <code>T::method(U, V)</code>.</li>
</ul>
<p>Let’s dive into some of the common questions about this idea.</p>
<h3 id="why-not-just-expose-the-desugared-associated-type-directly">Why not just expose the desugared associated type directly?</h3>
<p>Earlier I explained how <code>H::check(..)</code> would work by desugaring it to an associated type. So, why not just have users talk about that associated type directly, instead of adding a new notation for “the type returned by <code>check</code>”? The main reason is that it would require us to expose details about this desugaring that we don’t necessarily want to expose.</p>
<p>The most obvious detail is “what is the name of the associated type”. I think the only clear choice is to give it the same name as the method itself, which is slightly backwards incompatible (since a trait can already have an associated type and a method with the same name), but easy enough to do over an edition.</p>
<p>We would also have to expose what generic parameters this associated type has. This is not always so simple. For example, consider this trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">dump</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">impl</span><span class="w"> </span><span class="n">Debug</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we want to desugar this to an associated type, what generics should that type have?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Dump</span><span class="o">&lt;</span><span class="err">…</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">…</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        ^^^ how many generics go here?
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">dump</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">impl</span><span class="w"> </span><span class="n">Debug</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Dump</span><span class="o">&lt;</span><span class="err">…</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This function has two sources of “implicit” generic parameters: elided lifetimes and the <code>impl Trait</code> argument. One desugaring would be:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Dump</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="err">&#39;</span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">D</span>: <span class="nc">Debug</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">&#39;</span><span class="n">a</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">&#39;</span><span class="n">b</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">dump</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="err">&#39;</span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">D</span>: <span class="nc">Debug</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="err">&#39;</span><span class="n">a</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="err">&#39;</span><span class="n">b</span><span class="w"> </span><span class="n">D</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Dump</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="err">&#39;</span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">D</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But, in this case, we could also have a simpler desugaring that uses just one lifetime parameter (this isn’t always the case):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Dump</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">D</span>: <span class="nc">Debug</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">&#39;</span><span class="n">a</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">dump</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">D</span>: <span class="nc">Debug</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="err">&#39;</span><span class="n">a</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="err">&#39;</span><span class="n">a</span><span class="w"> </span><span class="n">D</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Dump</span><span class="o">&lt;</span><span class="err">&#39;</span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">D</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Regardless of how we expose the lifetimes, the <code>impl Trait</code> argument also raises interesting questions. In ordinary functions, the lang-team generally favors not including <code>impl Trait</code> arguments in the list of generics (i.e., they can’t be specified by turbofish, their values are inferred from the argument types), although we’ve not reached a final decision there. That seems inconsistent with exposing the type parameter <code>D</code>.</p>
<p>All in all, the appeal of the RTN is that it skips over these questions, leaving the compiler room to desugar in any of the various equivalent ways. It also means users don’t have to understand the desugaring, and can just think about the “return value of check”.</p>
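<p>To get a feel for what exposing the associated type would ask of users, here is a rough sketch of the hand-written version on stable Rust, using a generic associated type and a higher-ranked bound. Several things here are assumptions made for illustration: the <code>Server</code> field, the <code>Ping</code> implementor, the <code>require_send</code> and <code>block_on</code> helpers, and the choice to box the future (a compiler desugaring would not need to box). The <code>where Self: &#39;t</code> clause on the associated type is something the compiler requires for this signature.</p>

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the post's `Server` type.
struct Server {
    name: String,
}

// Hand-written form of the desugaring: the future type is a named GAT.
trait HealthCheck {
    type Check<'t>: Future<Output = ()> + 't
    where
        Self: 't;

    fn check<'s>(&'s mut self, server: Server) -> Self::Check<'s>;
}

// Hypothetical implementor that records which servers it has checked.
struct Ping {
    checked: Vec<String>,
}

impl HealthCheck for Ping {
    // Boxing is a simplification so the future type can be written down.
    type Check<'t> = Pin<Box<dyn Future<Output = ()> + Send + 't>>
    where
        Self: 't;

    fn check<'s>(&'s mut self, server: Server) -> Self::Check<'s> {
        Box::pin(async move {
            self.checked.push(server.name);
        })
    }
}

// Compiles only when `T: Send`; used as a compile-time assertion.
fn require_send<T: Send>(value: T) -> T {
    value
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Tiny executor; the future in this sketch finishes on its first poll.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Runs one check on another thread and hands the checker back.
fn start_health_check<H>(mut health_check: H, server: Server) -> H
where
    H: HealthCheck + Send + 'static,
    for<'a> H::Check<'a>: Send, // the bound that `H::check(..): Send` would express
{
    std::thread::spawn(move || {
        let fut = require_send(health_check.check(server));
        block_on(fut);
        health_check
    })
    .join()
    .unwrap()
}

fn checked_names() -> Vec<String> {
    let server = Server { name: String::from("alpha") };
    start_health_check(Ping { checked: Vec::new() }, server).checked
}
```

<p>Even in this small case, the author of the trait has to pick a name for the associated type, decide on its lifetime parameters, and add the <code>where Self: &#39;t</code> clause, all of which the <code>check(..)</code> notation leaves to the compiler.</p>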
<h3 id="should-hcheck-send-mean-that-the-future-is-send-or-the-result-of-the-future">Should <code>H::check(..): Send</code> mean that the <em>future</em> is <code>Send</code>, or the result of the future?</h3>
<p>Some folks have pointed out that <code>H::check(..): Send</code> seems like it refers to the value you get from <em>awaiting</em> <code>check</code>, and not the future itself. This is particularly true since our async function notation doesn’t write the future explicitly, unlike (say) C# or TypeScript (in those languages, an async function must declare a task or promise type as its return type). This seems true, and it <em>will</em> likely be a source of confusion, but it’s also consistent with how async functions work. For example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Get</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">get</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">bar</span><span class="o">&lt;</span><span class="n">G</span>: <span class="nc">Get</span><span class="o">&gt;</span><span class="p">(</span><span class="n">g</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">G</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">f</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">g</span><span class="p">.</span><span class="n">get</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In this code, even though <code>g.get()</code> is declared to return <code>u32</code>, <code>f</code> is a future, not an integer. Writing <code>G::get(..): Send</code> thus talks about the <em>future</em>, not the integer.</p>
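<p>The <code>let f: impl Future&lt;..&gt;</code> annotation above is illustrative; stable Rust does not accept <code>impl Trait</code> in that position today. Here is a minimal runnable sketch of the same point, assuming a hypothetical implementor <code>Fixed</code> and three helper functions of my own (<code>expect_future</code>, <code>require_send</code>, and a tiny <code>block_on</code> executor). A generic helper function plays the role of the type annotation.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait Get {
    async fn get(&mut self) -> u32;
}

// Hypothetical implementor that always yields the number it holds.
struct Fixed(u32);

impl Get for Fixed {
    async fn get(&mut self) -> u32 {
        self.0
    }
}

// Accepts only futures whose output is `u32`; stands in for the
// `let f: impl Future<Output = u32>` annotation.
fn expect_future<F: Future<Output = u32>>(f: F) -> F {
    f
}

// Compiles only when `T: Send`; used as a compile-time assertion.
fn require_send<T: Send>(value: T) -> T {
    value
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Tiny executor; the futures in this sketch finish on their first poll.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

async fn bar<G: Get>(g: &mut G) -> u32 {
    let f = expect_future(g.get()); // `f` is a future, not an integer
    let n: u32 = f.await; // awaiting the future produces the integer
    n
}

fn run() -> (u32, u32) {
    let mut fixed = Fixed(22);
    // With a concrete type, the compiler can see that the *future* is `Send`.
    let direct = block_on(require_send(fixed.get()));
    let through_bar = block_on(bar(&mut fixed));
    (direct, through_bar)
}
```

<p>The <code>require_send</code> call is applied to what <code>get</code> returns before any awaiting happens, which is the same object a bound like <code>G::get(..): Send</code> would describe.</p>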
<h3 id="isnt-rtn-kind-of-verbose">Isn’t RTN kind of verbose?</h3>
<p>Interesting fact: when I talk to people about what is confusing in Rust, the trait system ranks as high as or higher than the borrow checker. If we take another look at our motivating example, I think we can start to see why:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="o">&lt;</span><span class="n">check</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span></code></pre></div><p>That where-clause basically just says “<code>H</code> is safe to use from other threads”, but it requires a pretty dense bit of notation! (And, of course, it also demonstrates that the borrow checker and the trait system are not independent things, since <code>&#39;static</code> can be seen as a part of both, and is certainly a common source of confusion.) Wouldn’t it be nice if we had a more compact way to say that?</p>
<p>Now imagine you have a trait with a lot of methods:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncOps</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">op1</span><span class="p">(</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">op2</span><span class="p">(</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">op3</span><span class="p">(</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under the current proposal, to create an <code>AsyncOps</code> that can be (fully) used across threads, one would write:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">do_async_ops</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">(</span><span class="n">async_ops</span>: <span class="nc">A</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">A</span>: <span class="nc">AsyncOps</span><span class="o">&lt;</span><span class="n">op1</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="n">op2</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="n">op3</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span></code></pre></div><p>You could use a trait alias (if we stabilized them) to help here, but still, this seems like a problem!</p>
<h3 id="but-maybe-that-verbosity-is-useful">But maybe that verbosity is useful?</h3>
<p>Indeed! RTN is a very flexible notation. To continue with the <code>AsyncOps</code> example, we could write a function that says &ldquo;the future returned by <code>op1</code> must be <code>Send</code>, but the others need not be&rdquo;, which would be useful for a function like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">do_op1_in_parallel</span><span class="p">(</span><span class="n">a</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">AsyncOps</span><span class="o">&lt;</span><span class="n">op1</span><span class="p">(</span><span class="o">..</span><span class="p">)</span>: <span class="nb">Send</span> <span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                                       ^^^^^^^^^^^^^^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                                       Return value of `op1` must be Send, static
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tokio</span>::<span class="n">spawn</span><span class="p">(</span><span class="n">a</span><span class="p">.</span><span class="n">op1</span><span class="p">()).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="is-rtn-limited-to-async-fn-in-traits">Is RTN limited to async fn in traits?</h3>
<p>All my examples have focused on async fn in traits, but we can use RTN to name the return types of any function anywhere. For example, given a function like <code>get</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="mi">22</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>we could allow you to write <code>get()</code> to name the closure type that is returned:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">c</span>: <span class="nc">get</span><span class="p">()</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">d</span>: <span class="kt">u32</span> <span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This seems like it would be useful for things like iterator combinators, so that you can say things like “the iterator returned by calling <code>map</code> is <code>Send</code>”.</p>
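<p>Since <code>get()</code> cannot be written as a type today, the nearest stable equivalent is to let inference pick the type and check the property with a helper. The sketch below assumes two hypothetical helpers of my own, <code>require_send</code> and <code>doubled</code>; it shows the check for the closure returned by <code>get</code> and for an iterator built with <code>map</code>.</p>

```rust
fn get() -> impl FnOnce() -> u32 {
    move || 22
}

// Hypothetical iterator-returning function built on `map`.
fn doubled(values: Vec<u32>) -> impl Iterator<Item = u32> {
    values.into_iter().map(|v| v * 2)
}

// Compiles only when `T: Send`; used as a compile-time assertion.
fn require_send<T: Send>(value: T) -> T {
    value
}

fn call_closure() -> u32 {
    // The closure type has no name we can write, so inference fills it in.
    let c = require_send(get());
    c()
}

fn sum_on_other_thread() -> u32 {
    // "The iterator returned by calling `map` is `Send`", checked at the
    // call site rather than stated in a signature.
    let iter = require_send(doubled(vec![1, 2, 3]));
    std::thread::spawn(move || iter.sum::<u32>())
        .join()
        .unwrap()
}
```

<p>This works at a call site where the concrete function is known, but there is no way to state the same requirement in a signature or where-clause, which is where a type like <code>get()</code> would help.</p>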
<h3 id="why-do-we-have-to-write-">Why do we have to write <code>..</code>?</h3>
<p>OK, nobody asks this, but I do sometimes feel that writing <code>..</code> just seems silly. We could say that you just write <code>H::check(): Send</code> to mean &ldquo;for all parameters&rdquo;. (In the case where the method has no parameters, then &ldquo;for all parameters&rdquo; is satisfied trivially.) That doesn’t change anything fundamental about the proposal but it lightens the “line noise” aspect a tad:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="o">&lt;</span><span class="n">check</span><span class="p">()</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span></code></pre></div><p>It does introduce some ambiguity. Did the user mean “for all parameters” or did they forget that <code>check()</code> has parameters? I’m not sure how this confusion is harmful, though. The main way I can see it coming about is something like this:</p>
<ul>
<li><code>check()</code> initially has zero parameters, and the user writes <code>check(): Send</code>.</li>
<li>In a later version of the program, a parameter is added, and now the meaning of <code>check</code> changes to “for all parameters” (although, as we noted before, that was arguably the meaning before).</li>
</ul>
<p>There is a shift happening here, but what harm can it do? If the check still passes, then <code>check(T): Send</code> is true for any <code>T</code>. If it doesn’t, the user gets an error and has to add an explicit type for this new parameter.</p>
<h3 id="can-we-really-handle-this-in-our-trait-solver">Can we really handle this in our trait solver?</h3>
<p>As we saw when discussing generic methods, handling this feature in its full generality is a bit much for our trait solver today. But we could begin with a subset &ndash; for example, the notation can only be used in where-clauses and only for methods that are generic over lifetime parameters and not types. Tyler and I worked out a subset we believe would be readily implementable.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This post introduced return-type notation, an extension to the type grammar that allows you to refer to the return type of a trait method, and covered some of the pros/cons. Here is a rundown:</p>
<p><strong>Pros:</strong></p>
<ul>
<li>Extremely flexible notation that lets us say precisely which methods must return <code>Send</code> types, and even lets us go into detail about which argument types they will be called with.</li>
<li>Avoids having to specify a desugaring to associated types precisely. For example, we don’t have to decide how to name that type, nor do we have to decide how many lifetime parameters it has, or whether <code>impl Trait</code> arguments become type parameters.</li>
<li>Can be used to refer to return values of things beyond async functions.</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li>New concept for users to learn — now they have associated types as well as associated return types.</li>
<li>Verbose even for common cases; doesn’t scale up to traits with many methods.</li>
</ul>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/send-bound-problem" term="send-bound-problem" label="Send bound problem"/></entry><entry><title type="html">Async trait send bounds, part 1: intro</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/02/01/async-trait-send-bounds-part-1-intro/</id><published>2023-02-01T00:00:00+00:00</published><updated>2023-02-01T08:06:00-05:00</updated><content type="html"><![CDATA[<p>Nightly Rust now has <a href="https://blog.rust-lang.org/inside-rust/2022/11/17/async-fn-in-trait-nightly.html">support for async functions in traits</a>, so long as you limit yourself to static dispatch. That’s super exciting! And yet, for many users, this support won’t yet meet their needs. One of the problems we need to resolve is how users can conveniently specify when they need an async function to return a <code>Send</code> future. This post covers some of the background on send futures, why we don&rsquo;t want to adopt the solution from the <code>async_trait</code> crate for the language, and the general direction we would like to go. Follow-up posts will dive into specific solutions.</p>
<h2 id="why-do-we-care-about-send-bounds">Why do we care about Send bounds?</h2>
<p>Let’s look at an example. Suppose I have an async trait that performs some kind of periodic health check on a given server:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HealthCheck</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">check</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="kp">&amp;</span><span class="nc">Server</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now suppose we want to write a function that, given a <code>HealthCheck</code>, starts a parallel task that runs that check every second, logging failures. This might look like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">tokio</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">while</span><span class="w"> </span><span class="n">health_check</span><span class="p">.</span><span class="n">check</span><span class="p">(</span><span class="o">&amp;</span><span class="n">server</span><span class="p">).</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">tokio</span>::<span class="n">time</span>::<span class="n">sleep</span><span class="p">(</span><span class="n">Duration</span>::<span class="n">from_secs</span><span class="p">(</span><span class="mi">1</span><span class="p">)).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">emit_failure_log</span><span class="p">(</span><span class="o">&amp;</span><span class="n">server</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So far so good! So what happens if we try to compile this? <a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2021&amp;gist=a4a2cf7b541a4c7b89eac1a3ddd8596d">You can try it yourself if you use the <code>async_fn_in_trait</code> feature gate</a>; you should see a compilation error like so:</p>
<pre tabindex="0"><code>error: future cannot be sent between threads safely
   --&gt; src/lib.rs:15:18
    |
15  |       tokio::spawn(async move {
    |  __________________^
16  | |         while health_check.check(&amp;server).await {
17  | |             tokio::time::sleep(Duration::from_secs(1)).await;
18  | |         }
19  | |         emit_failure_log(&amp;server).await;
20  | |     });
    | |_____^ future created by async block is not `Send`
    |
    = help: within `[async block@src/lib.rs:15:18: 20:6]`, the trait `Send` is not implemented for `impl Future&lt;Output = bool&gt;`
</code></pre><p>The error is saying that the future for our task cannot be sent between threads. But why not? After all, the <code>health_check</code> value is both <code>Send</code> and <code>&#39;static</code>, so we know that <code>health_check</code> is safe to send over to the new thread. But the problem lies elsewhere. The error has an attached note that points it out to us:</p>
<pre tabindex="0"><code>note: future is not `Send` as it awaits another future which is not `Send`
   --&gt; src/lib.rs:16:15
    |
16  |         while health_check.check(&amp;server).await {
    |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^ await occurs here
</code></pre><p>The problem is that the call to <code>check</code> is going to return a future, and that future is not known to be <code>Send</code>. To see this more clearly, let’s desugar the <code>HealthCheck</code> trait slightly:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HealthCheck</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// async fn check(&amp;mut self, server: &amp;Server) -&gt; bool;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">check</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="kp">&amp;</span><span class="nc">Server</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">bool</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                           </span><span class="c1">// ^ Problem is here! This returns a future, but not necessarily a `Send` future.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem is that <code>check</code> returns an <code>impl Future</code>, but the trait doesn’t say whether this future is <code>Send</code> or not. The compiler therefore sees that our task is going to be awaiting a future, but that future might not be sendable between threads.</p>
<h2 id="what-does-the-async-trait-crate-do">What does the async-trait crate do?</h2>
<p>Interestingly, if you rewrite the above example to use the <code>async_trait</code> crate, <a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2021&amp;gist=c399a94d05e9e278ba7f6f97cd03afa7">it compiles</a>. What’s going on here? The answer is that the <code>async_trait</code> proc macro uses a different desugaring. Instead of creating a trait that yields <code>-&gt; impl Future</code>, it creates a trait that returns a <code>Pin&lt;Box&lt;dyn Future + Send&gt;&gt;</code>. This means that the future can be sent between threads; it also means that the trait is dyn-safe.</p>
<p>This is a good answer for the <code>async-trait</code> crate, but it’s not a good answer for a core language construct as it loses key flexibility. We want to support async in single-threaded executors, where the <code>Send</code> bound is irrelevant, and we also want to support async in no-std applications, where <code>Box</code> isn’t available. Moreover, we want to have key interop traits (e.g., <code>Read</code>) that can be used for all three of those applications at the same time. An approach like the one used in <code>async-trait</code> cannot support a trait that works for all three of those applications at once.</p>
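<p>To make the trade-off concrete, here is a minimal hand-written sketch of the boxed shape described above. It is an approximation written for illustration, not the exact code the <code>async_trait</code> macro generates; the <code>Ping</code> type, the <code>healthy</code> field, and the <code>poll_once</code>, <code>NoopWake</code>, and <code>demo</code> helpers are all hypothetical names. It uses only the standard library, so the future is polled by hand rather than by a real executor.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the post's `Server` type.
pub struct Server {
    pub healthy: bool,
}

// Boxed shape: the `Send` bound lives in the trait itself, so every
// implementation must return a heap-allocated, sendable future.
pub trait HealthCheck {
    fn check<'a>(
        &'a mut self,
        server: &'a Server,
    ) -> Pin<Box<dyn Future<Output = bool> + Send + 'a>>;
}

// Hypothetical implementation that just reports the server's flag.
pub struct Ping;

impl HealthCheck for Ping {
    fn check<'a>(
        &'a mut self,
        server: &'a Server,
    ) -> Pin<Box<dyn Future<Output = bool> + Send + 'a>> {
        Box::pin(async move { server.healthy })
    }
}

// Waker that does nothing; enough to poll a future that is ready at once.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls the boxed future a single time and returns its output if ready.
pub fn poll_once<T>(mut fut: Pin<Box<dyn Future<Output = T> + Send + '_>>) -> Option<T> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}

pub fn demo() -> Option<bool> {
    // Compiles only because the trait's return type promises `Send`.
    fn assert_send<T: Send>(_: &T) {}

    let server = Server { healthy: true };
    let mut ping = Ping;
    let fut = ping.check(&server);
    assert_send(&fut);
    poll_once(fut)
}
```

<p>Note what the signature costs: an implementation that wants to hold non-<code>Send</code> data across an await can no longer implement the trait, and every call allocates a <code>Box</code>.</p>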
<h2 id="how-would-we-like-to-solve-this">How would we like to solve this?</h2>
<p>Instead of having the trait specify whether the returned future is <code>Send</code> (or boxed, for that matter), our preferred solution is to have the <code>start_health_check</code> function declare that it requires <code>check</code> to return a sendable future. Remember that <code>start_health_check</code> already included a where clause specifying that the type <code>H</code> was sendable across threads:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_health_check</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="p">(</span><span class="n">health_check</span>: <span class="nc">H</span><span class="p">,</span><span class="w"> </span><span class="n">server</span>: <span class="nc">Server</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">H</span>: <span class="nc">HealthCheck</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;static</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// —————  ^^^^^^^^^^^^^^ “sendable to another disconnected thread”
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//     |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Implements the `HealthCheck` trait
</span></span></span></code></pre></div><p>Right now, this where clause says two independent things:</p>
<ul>
<li><code>H</code> implements <code>HealthCheck</code>;</li>
<li>values of type <code>H</code> can be sent to an independent task, which is really a combination of two things
<ul>
<li>type <code>H</code> can be sent between threads (<code>H: Send</code>)</li>
<li>type <code>H</code> contains no references to the current stack (<code>H: &#39;static</code>)</li>
</ul>
</li>
</ul>
<p>What we want is to add syntax to specify an additional condition:</p>
<ul>
<li><code>H</code> implements <code>HealthCheck</code> <strong>and its check method returns a <code>Send</code> future</strong></li>
</ul>
<p>In other words, we don’t want just any type that implements <code>HealthCheck</code>. We specifically want a type that implements <code>HealthCheck</code> and returns a <code>Send</code> future.</p>
<p>Note the contrast to the desugaring approach used in the <code>async_trait</code> crate: in that approach, we changed what it means to implement <code>HealthCheck</code> to always require a sendable future. In this approach, we allow the trait to be used in both ways, but allow the function to say when it needs sendability or not.</p>
<p>The approach of “let the function specify what it needs” is very much in line with Rust. In fact, the existing where-clause demonstrates the same pattern. We don’t say that implementing <code>HealthCheck</code> implies that <code>H</code> is <code>Send</code>; rather, we say that the trait can be implemented by any type, but allow the function to specify that <code>H</code> must be both <code>HealthCheck</code> <em>and</em> <code>Send</code>.</p>
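<p>The existing pattern can be seen in a small sketch that runs on stable Rust. To keep it self-contained it makes some simplifying assumptions: the trait is synchronous rather than async, <code>std::thread::spawn</code> stands in for <code>tokio::spawn</code>, and the <code>LocalCheck</code>, <code>PlainCheck</code>, <code>check_here</code>, and <code>check_on_thread</code> names are hypothetical.</p>

```rust
use std::rc::Rc;
use std::thread;

// Hypothetical stand-in for the post's `Server` type.
pub struct Server {
    pub healthy: bool,
}

// Synchronous simplification of the post's trait. It says nothing about `Send`.
pub trait HealthCheck {
    fn check(&mut self, server: &Server) -> bool;
}

// Holds an `Rc`, so this type is not `Send`, yet it can implement the trait.
pub struct LocalCheck {
    pub expected: Rc<bool>,
}

impl HealthCheck for LocalCheck {
    fn check(&mut self, server: &Server) -> bool {
        server.healthy == *self.expected
    }
}

// A plain unit struct, which is automatically `Send`.
pub struct PlainCheck;

impl HealthCheck for PlainCheck {
    fn check(&mut self, server: &Server) -> bool {
        server.healthy
    }
}

// Runs on the current thread, so any implementation is acceptable.
pub fn check_here<H: HealthCheck>(mut health_check: H, server: &Server) -> bool {
    health_check.check(server)
}

// Only this function asks for `Send + 'static`, because only it moves the
// value to another thread.
pub fn check_on_thread<H>(mut health_check: H, server: Server) -> bool
where
    H: HealthCheck + Send + 'static,
{
    thread::spawn(move || health_check.check(&server))
        .join()
        .expect("health check thread panicked")
}
```

<p>Passing a <code>LocalCheck</code> to <code>check_on_thread</code> is rejected at compile time, while <code>check_here</code> accepts it. The trait itself stays usable in both settings, which is the property we want to preserve for the returned future as well.</p>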
<h2 id="next-post-lets-talk-syntax">Next post: Let’s talk syntax</h2>
<p>I’m going to leave you on a cliffhanger. This blog post set up the problem we are trying to solve: for traits with async functions, <strong>we need some kind of syntax for declaring that you want an implementation that returns <code>Send</code> futures, and not just <em>any</em> implementation</strong>. In the next set of posts, I’ll walk through our proposed solution to this, and some of the other approaches we’ve considered and rejected.</p>
<h2 id="appendix-why-does-the-returned-future-have-to-be-send-anyway">Appendix: Why does the returned future have to be send anyway?</h2>
<p>Some of you may wonder why it matters that the future returned is not <code>Send</code>. After all, the only thing we are actually sending between threads is <code>health_check</code> — the future is being created on the new thread itself, when we call <code>check</code>. It <em>is</em> a bit surprising, but this is actually highlighting an area where async tasks are different from threads (and where we might consider future language extensions).</p>
<p>Async is intended to support a number of different task models:</p>
<ul>
<li>Single-threaded: all tasks run in the same OS thread. This is a great choice for embedded systems, or systems where you have lightweight processes (e.g., <a href="https://fuchsia.dev">Fuchsia</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>).</li>
<li><a href="https://dl.acm.org/doi/10.1145/564870.564900">Work-dealing</a>, sometimes called <a href="https://www.datadoghq.com/blog/engineering/introducing-glommio/">thread-per-core</a>: tasks run in multiple threads, but once a task starts in a thread, it never moves again.</li>
<li><a href="https://en.wikipedia.org/wiki/Work_stealing">Work-stealing</a>: tasks start in one thread, but can migrate between OS threads while they execute.</li>
</ul>
<p>Tokio’s <code>spawn</code> function supports the final mode (work-stealing). The key point here is that the future can move between threads at any <code>await</code> point. This means that it’s possible for the future to be moved between threads while awaiting the future returned by <code>check</code>. Therefore, <strong>any data in this future must be <code>Send</code></strong>.</p>
<p>This might be surprising. After all, the most common example of non-send data is something like a (non-atomic) <code>Rc</code>. It would be fine to create an <code>Rc</code> within one async task and then move that task to another thread, so long as the task is paused at the point of move. But there are other non-<code>Send</code> types that wouldn’t work so well. For example, you might make a type that relies on thread-local storage; such a type would not be <code>Send</code> because it’s only safe to use it on the thread in which it was created. If that type were moved between threads, the system could break.</p>
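<p>Here is a small sketch of how data held across an <code>await</code> decides whether a future is <code>Send</code>. The function and helper names (<code>sum_with_arc</code>, <code>sum_with_rc</code>, <code>assert_send</code>, <code>NoopWake</code>, <code>demo</code>) are hypothetical, and the future is polled by hand with a do-nothing waker so that the example needs only the standard library.</p>

```rust
use std::future::{ready, Future};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Compiles only when `T` is `Send`.
fn assert_send<T: Send>(_: &T) {}

// `shared` is alive across the await point, so it is stored inside the
// future. `Arc<u32>` is `Send`, and therefore so is the future.
pub async fn sum_with_arc() -> u32 {
    let shared = Arc::new(40u32);
    ready(()).await;
    *shared + 2
}

// The same body with `Rc` yields a future that is not `Send`:
//
// pub async fn sum_with_rc() -> u32 {
//     let shared = std::rc::Rc::new(40u32);
//     ready(()).await;
//     *shared + 2
// }
//
// `assert_send(&sum_with_rc())` fails with
// "future cannot be sent between threads safely".

// Waker that does nothing; enough for a future that never returns `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn demo() -> Option<u32> {
    let fut = sum_with_arc();
    assert_send(&fut);

    // Pin the future on the heap and poll it a single time.
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}
```

<p>The compiler rejects the <code>Rc</code> version even though, as discussed above, moving the whole paused task would be fine in that particular case. The <code>Send</code> check cannot tell that case apart from types that are truly tied to one thread.</p>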
<p>In the future, it might be useful to separate out types like <code>Rc</code> from other non-<code>Send</code> types. The distinguishing characteristic is that <code>Rc</code> can be moved between threads so long as all possible aliases are also moved at the same time. Other types are really tied to a <em>specific</em> thread. There’s no example in the stdlib that comes to mind, but it seems like a valid pattern for Rust today that I would like to continue supporting. I’m not sure yet the right way to think about that!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I have finally learned how to spell this word without having to look it up! 💪&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/send-bound-problem" term="send-bound-problem" label="Send bound problem"/></entry><entry><title type="html">Rust in 2023: Growing up</title><link href="https://smallcultfollowing.com/babysteps/blog/2023/01/20/rust-in-2023-growing-up/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2023/01/20/rust-in-2023-growing-up/</id><published>2023-01-20T00:00:00+00:00</published><updated>2023-01-20T08:08:00-05:00</updated><content type="html"><![CDATA[<p>When I started working on Rust in 2011, my daughter was about three months old. She’s now in sixth grade, and she’s started growing rapidly. Sometimes we wake up to find that her clothes don’t quite fit anymore: the sleeves might be a little too short, or the legs come up to her ankles. Rust is experiencing something similar. We’ve been growing tremendously fast over the last few years, and any time you experience growth like that, there are bound to be a few rough patches. Things that don’t work as well as they used to. This holds both in a technical sense — there are parts of the language that don’t seem to scale up to Rust’s current size — and in a social one — some aspects of how the project runs need to change if we’re going to keep growing the way I think we should. As we head into 2023, with two years to go until the Rust 2024 edition, this is the theme I see for Rust: <strong>maturation and scaling</strong>.</p>
<h2 id="tldr">TL;DR</h2>
<p>In summary, these are (some of) the things I think are most important for Rust in 2023:</p>
<ul>
<li>Implementing <strong><a href="https://smallcultfollowing.com/babysteps/blog/2022/09/22/rust-2024-the-year-of-everywhere/">“the year of everywhere”</a></strong> so that you can make any function async, write <code>impl Trait</code> just about anywhere, and fully utilize generic associated types; planning for the Rust 2024 edition.</li>
<li>Beginning work on a <strong>Rust specification</strong> and integrating it into our processes.</li>
<li>Defining rules for <strong>unsafe code</strong> and smooth tooling to check whether you’re following them.</li>
<li>Supporting efforts to <strong>teach Rust</strong> in universities and elsewhere.</li>
<li>Improving our <strong>product planning</strong> and <strong>user feedback</strong> processes.</li>
<li>Refining our <strong>governance structure</strong> with specialized teams for dedicated areas, more scalable structure for broad oversight, and more intentional onboarding.</li>
</ul>
<h2 id="the-year-of-everywhere-and-the-2024-edition">“The year of everywhere” and the 2024 edition</h2>
<p>What do async-await, impl Trait, and generic parameters have in common? They’re all essential parts of modern Rust, that’s one thing. They’re also all, in my opinion, in a “minimum viable product” state. Each of them has some key limitations that make them less useful and more confusing than they have to be. As I wrote in <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/22/rust-2024-the-year-of-everywhere/">“Rust 2024: The Year of Everywhere”</a>, there are currently a lot of folks working hard to lift those limitations through a number of extensions:</p>
<ul>
<li>Generic associated types (<a href="https://blog.rust-lang.org/2022/10/28/gats-stabilization.html">stabilized in October</a>, now undergoing various improvements!)</li>
<li>Type alias impl trait (<a href="https://github.com/rust-lang/rust/issues/63063#issuecomment-1354392317">proposed for stabilization</a>)</li>
<li>Async functions in traits and “return position impl Trait in traits” (<a href="https://blog.rust-lang.org/inside-rust/2022/11/17/async-fn-in-trait-nightly.html">static dispatch available on nightly</a>, but more work is needed)</li>
<li>Polonius (under active discussion)</li>
</ul>
<p>None of these features are “new”. They just take something that exists in Rust and let you use it more broadly. Nonetheless, I think they’re going to have a big impact, on experienced and new users alike. Experienced users can express more patterns more easily and avoid awkward workarounds. New users never have to experience the confusion that comes from typing something that feels like it <em>should</em> work, but doesn’t.</p>
<p>One other important point: <strong>Rust 2024 is just around the corner!</strong> Our goal is to get any edition changes landed on master this year, so that we can spend the next year doing finishing touches. This means we need to put some effort into thinking ahead and planning what we can achieve.</p>
<h2 id="towards-a-rust-specification">Towards a Rust specification</h2>
<p>As Rust grows, there is increasing need for a specification. Mara had a <a href="https://blog.m-ou.se/rust-standard/">recent blog post</a> outlining some of the considerations — and especially the distinction between a <em>specification</em> and <em>standardization</em>. I don’t see the need for Rust to get involved in any standards bodies — our existing RFC and open-source process works well. But I do think that for us to continue growing out the set of people working on Rust, we need a central definition of what Rust should do, and that we need to integrate that definition into our processes more thoroughly.</p>
<p>In addition to long-standing docs like the <a href="https://doc.rust-lang.org/reference/">Rust Reference</a>, the last year has seen a number of notable efforts towards a Rust specification. The <a href="https://spec.ferrocene.dev/">Ferrocene language specification</a> is the most comprehensive, covering the grammar, name resolution, and overall functioning of the compiler. Separately, I’ve been working on a project called <a href="https://github.com/nikomatsakis/a-mir-formality">a-mir-formality</a>, which aims to be a “formal model” of Rust’s type system, including the borrow checker. And Ralf Jung has <a href="https://www.ralfj.de/blog/2022/08/08/minirust.html">MiniRust</a>, which is targeting the rules for unsafe code.</p>
<p>So what would an official Rust specification look like? Mara opened <a href="https://github.com/rust-lang/rfcs/pull/3355">RFC 3355</a>, which lays out some basic parameters. I think there are still a lot of questions to work out. Most obviously, how can we combine the existing efforts and documents? Each of them has a different focus and — as a result — a somewhat different structure. I’m hopeful that we can create a complementary whole.</p>
<p>Another important question is how to integrate the specification into our project processes. We’ve already got a rule that new language features can’t be stabilized until the reference is updated, but we’ve not always followed it, and the <a href="https://www.rust-lang.org/governance/teams/lang#lang-docs%20team">lang docs team</a> is always in need of support. There are hopeful signs here: both the Foundation and Ferrocene are interested in supporting this effort.</p>
<h2 id="unsafe-code">Unsafe code</h2>
<p>In my experience, most production users of Rust don’t touch unsafe code, which is as it should be. But almost every user of Rust relies on dependencies that do, and those dependencies are often the most critical systems.</p>
<p>At first, the idea of unsafe code seems simple. By writing <code>unsafe</code>, you gain access to new capabilities, but you take responsibility for using them correctly. But the more you look at unsafe code, the more questions come up. <a href="https://doc.rust-lang.org/nomicon/">What does it mean to use those capabilities <em>correctly</em>?</a> These questions are not just academic, they have a real impact on optimizations performed by the Rust compiler, LLVM, and even the hardware.</p>
<p>Eventually, we want to get to a place where those who author unsafe code have clear rules to follow, as well as simple tooling to test if their code violates those rules (think <code>cargo test --unsafe</code>). Authors who want more assurance than dynamic testing can provide should have access to static verifiers that can prove their crate is safe — and we should start by proving the standard library is safe.</p>
<p>We’ve been trying for some years to build that world but it’s been ridiculously hard. Lately, though, there have been some breakthroughs. Gankra’s <a href="https://github.com/rust-lang/rust/issues/95228">experiments with  <code>strict_provenance</code> APIs</a> have given some hope that we can define a relatively simple <a href="https://www.ralfj.de/blog/2022/04/11/provenance-exposed.html">provenance model</a> that will support both arbitrary unsafe code trickery and aggressive optimization, and Ralf Jung’s aforementioned <a href="https://www.ralfj.de/blog/2022/08/08/minirust.html">MiniRust</a> shows how a Rust operational semantics could look. More and more crates test with <a href="https://www.ralfj.de/blog/2022/07/02/miri.html">miri</a> to check their unsafe code, and for those who wish to go further, the <a href="https://model-checking.github.io/kani/">kani</a> verifier can check unsafe code for UB (<a href="https://rust-formal-methods.github.io/tools.html">more formal methods tooling</a> here).</p>
<p>I think we need a renewed focus on unsafe code in 2023. The first step is already underway: we are <strong>creating the <a href="https://github.com/rust-lang/rfcs/pull/3346">opsem team</a></strong>. Led by <a href="https://github.com/RalfJung">Ralf Jung</a> and <a href="https://github.com/JakobDegen">Jakob Degen</a>, the opsem team has the job of defining “the rules governing unsafe code in Rust”. It’s been clear for some time that this area requires dedicated focus, and I am hopeful that the opsem team will help to provide that.</p>
<p>I would like to see progress on <strong>dynamic verification</strong>. In particular, I think we need a tool that can handle arbitrary binaries. <a href="https://www.ralfj.de/blog/2022/07/02/miri.html">miri</a> is great, but it can’t be used to test programs that call into C code. I’d like to see something more like <a href="https://valgrind.org/">valgrind</a> or <a href="https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html">ubsan</a>, where you can test your Rust project for UB even if it’s calling into other languages through FFI.</p>
<p>Dynamic verification is great, but it is limited by the scope of your tests. To get true reliability, we need a way for unsafe code authors to do static verification. Building static verification tools today is possible but extremely painful. The compiler’s APIs are unstable and a moving target. <strong>The <a href="https://github.com/rust-lang/project-stable-mir">stable MIR</a> project proposes to change that by providing a stable set of APIs that tool authors can build on</strong>.</p>
<p>Finally, the best unsafe code is the unsafe code you don’t have to write. Unsafe code provides infinite power, but people often have simpler needs that could be made safe with enough effort. Projects like <a href="https://cxx.rs/">cxx</a> demonstrate the power of this approach. For Rust the language, <a href="https://rust-lang.github.io/rfcs/2835-project-safe-transmute.html">safe transmute</a> is the most promising such effort, and I’d like to see more of that.</p>
<h2 id="teaching-rust-in-universities">Teaching Rust in universities</h2>
<p>More and more universities are offering classes that make use of Rust, and recently many of these educators have come together in the <a href="https://rust-edu.org">Rust Edu initiative</a> to form shared teaching materials. I think this is great, and a trend we should encourage. It’s helpful for the Rust community, of course, since it means more Rust programmers. I think it’s also helpful for the students: much like learning a functional programming language, learning Rust requires incorporating different patterns and structure than other languages. I find my programs tend to be broken into smaller pieces, and the borrow checker forces me to be more thoughtful about which bits of context each function will need. Even if you wind up building your code in other languages, those new patterns will influence the way you work.</p>
<p>Stronger connections to teachers can also be a great source of data for improving Rust. If we understand better how people learn Rust and what they find difficult, we can use that to guide our priorities and look for ways to make it better. This might mean changing the language, but it might also mean changing the tooling or error messages. I’d like to see us set up some mechanism to feed insights from Rust educators, both those in universities and trainers at companies like <a href="https://ferrous-systems.com/">Ferrous Systems</a> or <a href="https://www.integer32.com/">Integer32</a>, into the Rust teams.</p>
<p>One particularly exciting effort here is the research being done at Brown University<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> by Will Crichton and Shriram Krishnamurthi. Will and Shriram have published an <a href="https://rust-book.cs.brown.edu/">interactive version of the Rust book</a> that includes quizzes. As a reader, these quizzes help you check that you understood the section. But they also provide feedback to the book authors on which sections are effective. And they allow for “A/B testing”, where you change the content of the book and see whether the quiz scores improve. Will and Shriram are also looking at other ways to deepen our understanding of how people learn Rust.</p>
<h2 id="more-insight-and-data-into-the-user-experience">More insight and data into the user experience</h2>
<p>As Rust has grown, we no longer have the obvious gaps in our user experience that there used to be (e.g., “no IDE support”). At the same time, it’s clear that the experience of Rust developers could be a lot smoother. There are a lot of great ideas for changes to make, but it’s hard to know which ones would be most effective. <strong>I would like to see a more coordinated effort to gather data on the user experience and transform it into actionable insights.</strong> Currently, the largest source of data that we have is the annual Rust survey. This is a great resource, but it only gives a very broad picture of what’s going on.</p>
<p>A few years back, the async working group collected “status quo” stories as part of its vision doc effort. These stories were immensely helpful in understanding the “async Rust user experience”, and they are still helping to shape the priorities of the async working group today. At the same time, that was a one-time effort, and it was focused on async specifically. I think that kind of effort could be useful in a number of areas.</p>
<p>I’ve already mentioned that teachers can provide one source of data. Another is simply going out and having conversations with Rust users. But I think we also need fine-grained data about the user experience. In the compiler team’s <a href="https://blog.rust-lang.org/inside-rust/2022/08/08/compiler-team-2022-midyear-report.html#compiler-team-operations-aspirations-%EF%B8%8F">mid-year report</a>, they noted (emphasis mine):</p>
<blockquote>
<p>One more thing I want to point out: five of the ambitions checked the box in the survey that said &ldquo;some of our work has reached Rust programmers, but <strong>we do not know if it has improved Rust for them.”</strong></p>
</blockquote>
<p>Right now, it’s really hard to know even basic things, like how many users are encountering compiler bugs in the wild. We have to judge that by how many comments people leave on a GitHub issue. Meanwhile, <a href="https://github.com/estebank">Esteban</a> personally scours Twitter to find out which error messages are confusing to people.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> We should look into better ways to gather data here. I’m a fan of (opt-in, privacy-preserving) telemetry, but I think there’s a discussion to be had here about the best approach. All I know is that there has to be a better way.</p>
<h2 id="maturing-our-governance">Maturing our governance</h2>
<p>In 2015, shortly after 1.0, <a href="https://rust-lang.github.io/rfcs/1068-rust-governance.html?highlight=team#">RFC 1068</a> introduced the original Rust teams: libs, lang, compiler, infra, and moderation. Each team is an independent, decision-making entity, owning one particular aspect of Rust, and operating by consensus. The “Rust core team” was given the role of knitting them together and providing a unifying vision. This structure has been a great success, but as we’ve grown, it has started to hit some limits.</p>
<p>The first limiting point has been bringing the teams together. The original vision was that team leads—along with others—would be part of a <em>core team</em> that would provide a unifying technical vision and tend to the health of the project. It’s become clear over time though that these are really different jobs. Over this year, the various Rust teams, project directors, and existing core team have come together to define a new model for project-wide governance. This effort is being driven by a <a href="https://blog.rust-lang.org/inside-rust/2022/10/06/governance-update.html">dedicated working group</a> and I am looking forward to seeing that effort come to fruition this year.</p>
<p>The second limiting point has been the need for more specialized teams. One example near and dear to my heart is the new <a href="https://github.com/rust-lang/rfcs/pull/3254">types team</a>, which is focused on the type and trait system. This team has the job of diving into the nitty gritty on proposals like Generic Associated Types or impl Trait, and then surfacing up the key details for broader-based teams like lang or compiler where necessary. The aforementioned <a href="https://github.com/rust-lang/rfcs/pull/3346">opsem team</a> is another example of this sort of team. I suspect we’ll be seeing more teams like this.</p>
<p>There continues to be a need for us to grow teams that do <a href="https://smallcultfollowing.com/babysteps/blog/2019/04/15/more-than-coders/">more than coding</a>. The compiler team prioritization effort, under the leadership of <a href="https://github.com/apiraino">apiraino</a>, is a great example of a vital role that allows Rust to function but doesn’t involve landing PRs. I think there are a number of other “multiplier”-type efforts that we could use. One example would be “reporters”, i.e., people to help publish blog posts about the many things going on and spread information around the project. I am hopeful that as we get a new structure for top-level governance we can see some renewed focus and experimentation here.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Seven years since Rust 1.0 and we are still going strong. As Rust usage spreads, our focus is changing. Where once we had gaping holes to close, it’s now more a question of iterating to build on our success. But the more things change, the more they stay the same. Rust is still working to empower people to build reliable, performant programs. We still believe that building a supportive, productive tool for systems programming — one that brings more people into the “systems programming” tent — is also the best way to help the existing C and C++ programmers “hack without fear” and build the kind of systems they always wanted to build. So, what are you waiting for? Let’s get building!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In disclosure, AWS is a sponsor of this work.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>To be honest, Esteban will probably always do that, whatever we do.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Rust 2024...the year of everywhere?</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/09/22/rust-2024-the-year-of-everywhere/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/09/22/rust-2024-the-year-of-everywhere/</id><published>2022-09-22T00:00:00+00:00</published><updated>2022-09-22T15:51:00-04:00</updated><content type="html"><![CDATA[<p>I’ve been thinking about what “Rust 2024” will look like lately. I don’t really mean the edition itself — but more like, what will Rust feel like after we’ve finished up the next few years of work? I think the answer is that Rust 2024 is going to be the year of “everywhere”. Let me explain what I mean. Up until now, Rust has had a lot of nice features, but they only work <em>sometimes</em>. By the time 2024 rolls around, they’re going to work <em>everywhere</em> that you want to use them, and I think that’s going to make a big difference in how Rust feels.</p>
<h2 id="async-everywhere">Async <em>everywhere</em></h2>
<p>Let’s start with async. Right now, you can write async functions, but not in traits. You can’t write async closures. You can’t use async drop. This creates a real hurdle. You have to learn the workarounds (e.g., the <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> crate), and in some cases, there are no proper workarounds (e.g., for async-drop).</p>
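<p>To make the workaround concrete, here is a minimal hand-written sketch of the boxing pattern that crates like <code>async-trait</code> automate. The <code>Fetch</code> trait, the <code>Lengths</code> type, and the tiny <code>block_on</code> executor are hypothetical names invented for this illustration; only the standard library is assumed, and the real crate may generate something different in detail.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical trait: instead of `async fn fetch`, the method returns a
// boxed, pinned future so that it can be declared in a trait.
trait Fetch {
    fn fetch<'a>(&'a self, key: &'a str) -> Pin<Box<dyn Future<Output = usize> + 'a>>;
}

// Hypothetical implementor that "fetches" the length of the key.
struct Lengths;

impl Fetch for Lengths {
    fn fetch<'a>(&'a self, key: &'a str) -> Pin<Box<dyn Future<Output = usize> + 'a>> {
        // The async block is the body we would have liked to write directly.
        Box::pin(async move { key.len() })
    }
}

// A waker that does nothing; enough for futures that never actually wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal executor for this sketch: polls in a loop until the future is ready.
// Assumption: only suitable for futures that do not wait on external events.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

<p>Calling <code>block_on(Lengths.fetch("hello"))</code> drives the boxed future to completion. The pattern costs one heap allocation per call, which is part of why native support is appealing.</p>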
<p>Thanks to a recent PR by <a href="https://github.com/compiler-errors">Michael Goulet</a>, static async functions in traits <em>almost</em> work on nightly today! I’m confident we can work out the remaining kinks soon and start advancing the static subset (i.e., no support for dyn trait) towards stabilization.</p>
<p>The plans for dyn, meanwhile, are advancing rapidly. At this point I think we have two good options on the table and I’m hopeful we can get that nailed down and start planning what’s needed to make the implementation work.</p>
<p>Once async functions in traits work, the next steps for core Rust will be figuring out how to support async closures and async drop. Both of them add some additional challenges — particularly async drop, which has some complex interactions with other parts of the language, as Sabrina Jewson elaborated in a <a href="https://sabrinajewson.org/blog/async-drop">great, if dense, blog post</a> — but we’ve started to develop a crack team of people in the async working group and I’m confident we can overcome them.</p>
<p>There is also library work, most notably settling on some interop traits, and defining ways to write code that is portable across allocators. I would like to see more exploration of structured concurrency<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, as well, or other alternatives to <code>select!</code> like the <a href="https://blog.yoshuawuyts.com/futures-concurrency-3/#concurrent-stream-processing-with-stream-merge">stream merging pattern</a> Yosh has been advocating for.</p>
<p>Finally, for extra credit, I would love to see us integrate async/await keywords into other bits of the function body, permitting you to write common patterns more easily. Yoshua Wuyts has had a really interesting series of blog posts exploring these sorts of ideas. I think that being able to do <code>for await x in y</code> to iterate, or <code>(a, b).await</code> as a form of join, or <code>async let x = …</code> to create a future in a really lightweight way could be great.</p>
<h2 id="impl-trait-everywhere">Impl trait <em>everywhere</em></h2>
<p>The <code>impl Trait</code> notation is one of Rust’s most powerful conveniences, allowing you to omit specific types and instead talk about the interface you need. Like async, however, impl Trait can only be used in inherent functions and methods, and can’t be used for return types in traits, nor can it be used in type aliases, let bindings, or any number of other places it might be useful.</p>
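<p>As a reminder of what already works, here is a small sketch of <code>impl Trait</code> in argument and return position on stable Rust. The function names <code>describe</code> and <code>evens</code> are made up for this example.</p>

```rust
use std::fmt::Debug;

// Argument position: callers may pass any type that implements `Debug`.
fn describe(value: impl Debug) -> String {
    format!("{value:?}")
}

// Return position: the concrete iterator type stays hidden from callers,
// who only learn that it is "some iterator over u32".
fn evens(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}
```

<p>Both functions name an interface rather than a concrete type, which is exactly the convenience that the other positions discussed here would extend.</p>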
<p>Thanks to <a href="https://github.com/oli-obk">Oli Scherer</a>’s hard work over the last year, we are nearing stabilization for impl Trait in type aliases. Oli’s work has also laid the groundwork to support impl trait in let bindings, meaning that you will be able to do something like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">iter</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="mi">10</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//        ^^^^^^^^^^^^^ Declare type of `iter` to be “some iterator”.
</span></span></span></code></pre></div><p>Finally, the same PR that added support for async fns in traits also added initial support for return-position impl trait in traits. Put it all together, and we are getting very close to letting you use impl trait everywhere you might want to.</p>
<p>There is still at least one place where <code>impl Trait</code> is not accepted that I think it should be, which is nested in other positions. I’d like you to be able to write <code>impl Fn(impl Debug)</code>, for example, to refer to “some closure that takes an argument of type <code>impl Debug</code>” (i.e., can be invoked multiple times with different debug types).</p>
<h2 id="generics-everywhere">Generics <em>everywhere</em></h2>
<p>Generic types are a big part of how Rust libraries are built, but Rust doesn’t allow people to write generic parameters in all the places they would be useful, and limitations in the compiler prevent us from making full use of the annotations we do have.</p>
<p>Not being able to use generic types everywhere might seem abstract, particularly if you’re not super familiar with Rust. And indeed, for a lot of code, it’s not a big deal. But if you’re trying to write libraries, or to write one common function that will be used all over your code base, then it can quickly become a huge blocker. Moreover, given that Rust supports generic types in many places, the fact that we don’t support them in <em>some</em> places can be really confusing — people don’t realize that the reason their idea doesn’t work is not because the idea is wrong, it’s because the language (or, often, the compiler) is limited.</p>
<p>The biggest example of generics everywhere is <em>generic associated types</em>. Thanks to hard work by <a href="https://github.com/jackh726/">Jack Huey</a>, <a href="https://github.com/MatthewJasper/">Matthew Jasper</a>, and a <a href="https://blog.rust-lang.org/2021/08/03/GATs-stabilization-push.html#why-has-it-taken-so-long-to-implement-this">number of others</a>, this feature is very close to hitting stable Rust — in fact, it is in the current beta, and should be available in 1.65. One caveat, though: the upcoming support for GATs has a number of known limitations and shortcomings, and it gives some pretty confusing errors. It’s still really useful, and a lot of people are already using it on nightly, but it’s going to require more attention before it lives up to its full potential.</p>
<p>You may not wind up using GATs in your code, but they will definitely be used in some of the libraries you rely on. GATs directly enable common patterns like <code>Iterable</code> that have heretofore been inexpressible, but we’ve also seen a lot of examples where they are used internally to help libraries present a more unified, simpler interface to their users.</p>
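<p>A rough sketch of the <code>Iterable</code> pattern with generic associated types might look like the following. The trait shape and the <code>count_twice</code> helper are illustrative assumptions, not an API from any particular library, and the code needs a compiler with GAT support (1.65 or later).</p>

```rust
// Hypothetical `Iterable` trait: a collection that can be iterated by
// reference any number of times. The lifetime parameter on the associated
// types is what makes them "generic associated types".
trait Iterable {
    type Item<'a>
    where
        Self: 'a;
    type Iter<'a>: Iterator<Item = Self::Item<'a>>
    where
        Self: 'a;

    fn iter<'a>(&'a self) -> Self::Iter<'a>;
}

impl<T> Iterable for Vec<T> {
    type Item<'a> = &'a T
    where
        Self: 'a;
    type Iter<'a> = std::slice::Iter<'a, T>
    where
        Self: 'a;

    fn iter<'a>(&'a self) -> Self::Iter<'a> {
        // Go through the slice explicitly so this does not call itself.
        self.as_slice().iter()
    }
}

// Generic code can now borrow the collection more than once.
fn count_twice<C: Iterable>(collection: &C) -> usize {
    collection.iter().count() + collection.iter().count()
}
```

<p>Without the lifetime parameter on <code>Iter</code>, there is no way for the trait to say that the iterator borrows from <code>self</code> for a caller-chosen lifetime.</p>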
<p>Beyond GATs, there are a number of other places where we could support generics, but we don’t. In the previous section, for example, I talked about being able to have a function with a parameter like <code>impl Fn(impl Debug)</code> — this is actually an example of a “generic closure”. That is, a closure that itself has generic arguments. Rust doesn’t support this yet, but there’s no reason we can’t.</p>
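<p>Until generic closures exist, one common workaround is to replace the closure with a trait that has a generic method. The sketch below uses hypothetical names (<code>DebugVisitor</code>, <code>Collect</code>, <code>visit_pair</code>) to show the idea.</p>

```rust
use std::fmt::Debug;

// Hypothetical stand-in for the not-yet-expressible `impl Fn(impl Debug)`:
// a "callback" whose single method is generic over its argument type.
trait DebugVisitor {
    fn visit<T: Debug>(&mut self, value: T);
}

// A visitor that records the debug output of everything it sees.
struct Collect(Vec<String>);

impl DebugVisitor for Collect {
    fn visit<T: Debug>(&mut self, value: T) {
        self.0.push(format!("{value:?}"));
    }
}

// The callee invokes the same visitor with two different argument types,
// which an ordinary closure parameter cannot express today.
fn visit_pair(visitor: &mut impl DebugVisitor) {
    visitor.visit(1_u8);
    visitor.visit("two");
}
```

<p>This works, but it forces the caller to define a named type and a trait impl where a closure literal would have been more natural.</p>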
<p>Oftentimes, though, the work to realize “generics everywhere” is not so much a matter of extending the language as it is a matter of improving the compiler’s implementation. Rust’s current traits implementation works pretty well, but as you start to push the bounds of it, you find that there are lots of places where it could be smarter. A lot of the ergonomic problems in GATs arise exactly out of these areas.</p>
<p>One of the developments I’m most excited about in Rust is not any particular feature, it’s the formation of the new <a href="https://github.com/rust-lang/types-team">types team</a>. The goal of this team is to revamp the compiler’s trait system implementation into something efficient and extensible, and to build up a core set of contributors.</p>
<h2 id="making-rust-feel-simpler-by-making-it-more-uniform">Making Rust feel simpler by making it more uniform</h2>
<p>The topics in this post, of course, only scratch the surface of what’s going on in Rust right now. For example, I’m really excited about “everyday niceties” like let/else-syntax and if-let-pattern guards, or the scoped threads API that we got in 1.63. There are exciting conversations about ways to improve error messages. Cargo, the compiler, and rust-analyzer are all generally getting faster and more capable. And so on, and so on.</p>
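<p>For a taste of two of those niceties, here is a short sketch combining let/else with the scoped threads API. The helper names <code>parse_port</code> and <code>sum_halves</code> are invented for this example; let/else requires a compiler where it is stable (1.65 or later).</p>

```rust
use std::thread;

// let/else: bind on the happy path, bail out early otherwise.
fn parse_port(addr: &str) -> Option<u16> {
    let Some((_host, port)) = addr.rsplit_once(':') else {
        return None;
    };
    port.parse().ok()
}

// Scoped threads: the spawned threads may borrow `left` and `right`
// because `thread::scope` joins them before it returns.
fn sum_halves(data: &[u32]) -> u32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|scope| {
        let left_sum = scope.spawn(|| left.iter().sum::<u32>());
        let right_sum = scope.spawn(|| right.iter().sum::<u32>());
        left_sum.join().unwrap() + right_sum.join().unwrap()
    })
}
```

<p>Neither feature adds new capability in principle, but both remove a layer of ceremony (nested <code>match</code> arms, <code>Arc</code> clones) from everyday code.</p>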
<p>The pattern of having a feature that starts working <em>somewhere</em> and then extending it so that it works <em>everywhere</em> seems, though, to be a key part of how Rust development works. It’s inspiring also because it becomes a win-win for users. Newer users find Rust easier to use and more consistent; they don’t have to learn the “edges” of where one thing works and where it doesn’t. Experienced users gain new expressiveness and unlock patterns that were either awkward or impossible before.</p>
<p>One challenge with this iterative development style is that sometimes it takes a long time. Async functions, impl Trait, and generic reasoning are three areas where progress has been stalled for years, for a variety of reasons. That’s all started to shift this year, though. A big part of it is the formation of new Rust teams at many companies, allowing a lot more people to have a lot more time. It’s also just the accumulation of the hard work of many people over a long time, slowly chipping away at hard problems (to get a sense for what I mean, read <a href="https://jackh726.github.io/rust/2022/06/10/nll-stabilization.html#how-did-we-get-here">Jack’s blog post on NLL removal</a>, and take a look at the full list of contributors he cited there — just assembling the list was impressive work, not to mention the actual work itself).</p>
<p>It may have been a long time coming, but I’m really excited about where Rust is going right now, as well as the new crop of contributors that have started to push the compiler forward faster than it’s ever moved before. If things continue like this, Rust in 2024 is going to be pretty damn great.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Oh, my beloved <a href="https://github.com/nikomatsakis/moro">moro</a>! I will return to thee!&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Dyn async traits, part 9: call-site selection</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/09/21/dyn-async-traits-part-9-callee-site-selection/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/09/21/dyn-async-traits-part-9-callee-site-selection/</id><published>2022-09-21T00:00:00+00:00</published><updated>2022-09-21T17:35:00-04:00</updated><content type="html"><![CDATA[<p>After my last post on dyn async traits, some folks pointed out that I was overlooking a seemingly obvious possibility. Why not have the choice of how to manage the future be made at the call site? It&rsquo;s true, I had largely dismissed that alternative, but it&rsquo;s worth consideration. This post is going to explore what it would take to get call-site-based dispatch working, and what the ergonomics might look like. I think it&rsquo;s actually fairly appealing, though it has some limitations.</p>
<h2 id="if-we-added-support-for-unsized-return-values">If we added support for unsized return values&hellip;</h2>
<p>The idea is to build on the mechanisms proposed in <a href="https://github.com/rust-lang/rfcs/pull/2884">RFC 2884</a>. With that RFC, you would be able to have functions that returned a <code>dyn Future</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">return_dyn</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Normally, when you call a function, we can allocate space on the stack to store the return value. But when you call <code>return_dyn</code>, we don&rsquo;t know how much space we need at compile time, so we can&rsquo;t do that<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. This means you can&rsquo;t just write <code>let x = return_dyn()</code>. Instead, you have to choose how to allocate that memory. Using the APIs proposed in <a href="https://github.com/rust-lang/rfcs/pull/2884">RFC 2884</a>, the most common option would be to store it on the heap. A new method, <code>Box::new_with</code>, would be added to <code>Box</code>; it acts like <code>new</code>, but it takes a closure, and the closure can return values of any type, including <code>dyn</code> values:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">new_with</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">return_dyn</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// result has type `Box&lt;dyn Future&lt;Output = ()&gt;&gt;`
</span></span></span></code></pre></div><p>Invoking <code>new_with</code> would be ergonomically unpleasant, so we could also add a <code>.box</code> operator. Rust has had an unstable <code>box</code> operator since forever; this might finally provide enough motivation to make it worth adding:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">return_dyn</span><span class="p">().</span><span class="k">box</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// result has type `Box&lt;dyn Future&lt;Output = ()&gt;&gt;`
</span></span></span></code></pre></div><p>Of course, you wouldn&rsquo;t <em>have</em> to use <code>Box</code>. Assuming we have sufficient APIs available, people can write their own methods, such as something to do arena allocation&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">arena</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arena</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">arena</span><span class="p">.</span><span class="n">new_with</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">return_dyn</span><span class="p">());</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;or perhaps a hypothetical <code>maybe_box</code>, which would use a buffer if that&rsquo;s big enough, and use box otherwise:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">big_buf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="mi">1024</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">maybe_box</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">big_buf</span><span class="p">,</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="n">return_dyn</span><span class="p">()).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>If we add <a href="https://github.com/rust-lang/rfcs/pull/2442">postfix macros</a>, then we might even support something like <code>return_dyn.maybe_box!(&amp;mut big_buf)</code>, though I&rsquo;m not sure if the current proposal would support that or not.</p>
<h2 id="what-are-unsized-return-values">What are unsized return values?</h2>
<p>This idea of returning <code>dyn Future</code> is sometimes called &ldquo;unsized return values&rdquo;, as functions can now return values of &ldquo;unsized&rdquo; type (i.e., types whose size is not statically known). They&rsquo;ve been proposed in <a href="https://github.com/rust-lang/rfcs/pull/2884">RFC 2884</a> by <a href="https://github.com/PoignardAzur">Olivier Faure</a>, and I believe there were some earlier RFCs as well. The <code>.box</code> operator, meanwhile, has been a part of &ldquo;nightly Rust&rdquo; since approximately forever, though it&rsquo;s currently written in prefix form, i.e., <code>box foo</code><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>.</p>
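<p>To see why <code>dyn Future</code> counts as unsized, it may help to measure two futures hidden behind the same <code>dyn</code> type. This is a small sketch on stable Rust; the <code>sizes</code> function is a made-up name, and the exact byte counts are compiler-dependent, so the code only relies on one future being at least as large as the array it captures.</p>

```rust
use std::future::Future;
use std::mem::size_of_val;

// Returns the runtime sizes of two different futures that share the
// static type `dyn Future<Output = ()>`.
fn sizes() -> (usize, usize) {
    // An async block with no captured state.
    let small: Box<dyn Future<Output = ()>> = Box::new(async {});

    // An async block that moves a 64-byte array into its state.
    let buf = [0_u8; 64];
    let big: Box<dyn Future<Output = ()>> = Box::new(async move {
        let _ = buf.len();
    });

    // `size_of_val` reads the size from the vtable at runtime.
    (size_of_val(&*small), size_of_val(&*big))
}
```

<p>The two values have the same static type but different sizes, so a caller holding only the type cannot reserve stack space for the return value ahead of time.</p>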
<p>The primary motivation for both unsized-return-values and <code>.box</code> has historically been efficiency: they permit in-place initialization in cases where it is not possible today. For example, if I write <code>Box::new([0; 1024])</code> today, I am technically allocating a <code>[0; 1024]</code> buffer on the stack and then copying it into the box:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// First evaluate the argument, creating the temporary:
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">temp</span>: <span class="p">[</span><span class="kt">u8</span><span class="p">;</span><span class="w"> </span><span class="mi">1024</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Then invoke `Box::new`, which allocates a Box...
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">box</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">allocate_memory</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// ...and copies the memory in.
</span></span></span><span class="line"><span class="cl"><span class="n">std</span>::<span class="n">ptr</span>::<span class="n">write</span><span class="p">(</span><span class="k">box</span><span class="p">,</span><span class="w"> </span><span class="n">temp</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>The optimizer may be able to fix that, but it&rsquo;s not trivial. If you look at the order of operations, it requires making the allocation happen <em>before</em> the arguments are evaluated. LLVM considers calls to known allocators to be &ldquo;side-effect free&rdquo;, but promoting them is still risky, since it means that more memory is allocated earlier, which can lead to memory exhaustion. The point isn&rsquo;t to pin down exactly which optimizations LLVM performs in practice, but rather that optimizing away the temporary is not trivial: it requires some thoughtful heuristics.</p>
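<p>On stable Rust today, one way to sidestep the temporary for this particular case is to let <code>vec!</code> perform the heap allocation and then convert the result. The sketch below is an illustration of that workaround, not of the proposed in-place initialization feature; <code>zeroed_on_heap</code> is a hypothetical helper name.</p>

```rust
use std::convert::TryInto;

// Builds a zero-filled 1024-byte array on the heap without first
// constructing a `[u8; 1024]` value on the stack.
fn zeroed_on_heap() -> Box<[u8; 1024]> {
    // `vec!` with a fill value allocates its buffer directly on the heap.
    let slice: Box<[u8]> = vec![0_u8; 1024].into_boxed_slice();

    // Converting the boxed slice to a boxed array checks the length and
    // reuses the existing allocation.
    slice.try_into().expect("length is exactly 1024")
}
```

<p>This only helps for values that can be built through a <code>Vec</code>; it does not generalize to arbitrary types, which is the gap that in-place initialization aims to close.</p>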
<h2 id="how-would-unsized-return-values-work">How would unsized return values work?</h2>
<p>This merits a blog post of its own, and I won&rsquo;t dive into details. For our purposes here, the key point is that somehow when the callee goes to return its final value, it can use whatever strategy the caller prefers to get a return point, and write the return value directly in there. <a href="https://github.com/rust-lang/rfcs/pull/2884">RFC 2884</a> proposes one solution based on generators, but I would want to spend time thinking through all the alternatives before we settled on something.</p>
<h2 id="using-dynamic-return-types-for-async-fn-in-traits">Using dynamic return types for async fn in traits</h2>
<p>So, the question is, can we use <code>dyn</code> return types to help with async function in traits? Continuing with my example from my previous post, if you have an <code>AsyncIterator</code> trait&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;the idea is that calling <code>next</code> on a <code>dyn AsyncIterator</code> type would yield <code>dyn Future&lt;Output = Option&lt;Self::Item&gt;&gt;</code>. Therefore, one could write code like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">use_dyn</span><span class="p">(</span><span class="n">di</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">di</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">box</span><span class="p">.</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       ^^^^
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The expression <code>di.next()</code> by itself yields a <code>dyn Future</code>. This type is not sized and so it won&rsquo;t compile on its own. Adding <code>.box</code> produces a <code>Box&lt;dyn Future&gt;</code>, which you can then await.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
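<p>For comparison, here is a rough approximation in today&rsquo;s stable Rust, where <code>.box</code> does not exist. It assumes a hypothetical <code>DynAsyncIterator</code> trait whose <code>next</code> returns an already boxed and pinned future, so the boxing happens in the callee rather than at the call site. The names <code>DynAsyncIterator</code>, <code>Counter</code>, <code>NoopWake</code>, and <code>block_on</code> are illustrative only and not part of any proposal.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical dyn-compatible stand-in for `AsyncIterator`: the boxing that
/// `.box` would request at the call site is done inside `next` instead.
trait DynAsyncIterator {
    type Item;
    fn next(&mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + '_>>;
}

/// Illustrative iterator that counts down to zero.
struct Counter {
    remaining: u32,
}

impl DynAsyncIterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Pin<Box<dyn Future<Output = Option<u32>> + '_>> {
        Box::pin(async move {
            if self.remaining == 0 {
                None
            } else {
                self.remaining -= 1;
                Some(self.remaining)
            }
        })
    }
}

/// The caller only sees a sized `Pin<Box<dyn Future>>`, which can be awaited.
async fn use_dyn(di: &mut dyn DynAsyncIterator<Item = u32>) -> Option<u32> {
    di.next().await
}

/// Waker that does nothing; enough for futures that never return `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for demonstration purposes only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = std::pin::pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>The difference from the proposal is who decides: here every implementation must box, whereas with <code>.box</code> the caller picks the allocation strategy for each call.</p>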
<p>Compared to the <code>Boxing</code> adapter I discussed before, this is relatively straightforward to explain. I&rsquo;m not entirely sure which is more convenient to use in practice: it depends how many <code>dyn</code> values you create and how many methods you call on them. Certainly you can work around the problem of having to write <code>.box</code> at each call-site via wrapper types or helper methods that do it for you.</p>
<h2 id="complication-dyn-asynciterator-does-not-implement-asynciterator">Complication: <code>dyn AsyncIterator</code> does not implement <code>AsyncIterator</code></h2>
<p>There is one complication. Today in Rust, every <code>dyn Trait</code> type also implements <code>Trait</code>. But can <code>dyn AsyncIterator</code> implement <code>AsyncIterator</code>? In fact, it cannot! The problem is that the <code>AsyncIterator</code> trait defines <code>next</code> as returning <code>impl Future&lt;..&gt;</code>, which is actually shorthand for <code>impl Future&lt;..&gt; + Sized</code>, but we said that <code>next</code> would return <code>dyn Future&lt;..&gt;</code>, which is <code>?Sized</code>. So the <code>dyn AsyncIterator</code> type doesn&rsquo;t meet the bounds the trait requires. Hmm.</p>
<h2 id="butdoes-dyn-asynciterator-have-to-implement-asynciterator">But&hellip;does <code>dyn AsyncIterator</code> have to implement <code>AsyncIterator</code>?</h2>
<p>There is no &ldquo;hard and fixed&rdquo; reason that <code>dyn Trait</code> types have to implement <code>Trait</code>, and there are a few good reasons <em>not</em> to do it. The alternative to dyn safety is a design like this: you can <em>always</em> create a <code>dyn Trait</code> value for any <code>Trait</code>, but you may not be able to use all of its members. For example, given a <code>dyn Iterator</code>, you could call <code>next</code>, but you couldn&rsquo;t call generic methods like <code>map</code>. In fact, we&rsquo;ve kind of got this design in practice, thanks to the <a href="https://rust-lang.github.io/rfcs/0255-object-safety.html#adding-a-where-clause"><code>where Self: Sized</code> hack</a> that lets us exclude methods from being used on <code>dyn</code> values.</p>
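<p>As a small illustration of the <code>where Self: Sized</code> hack in today&rsquo;s Rust, the sketch below uses a made-up <code>Shape</code> trait (not from any library). The generic method is opted out of dynamic dispatch, so <code>dyn Shape</code> is still allowed but only <code>area</code> can be called through it.</p>

```rust
trait Shape {
    fn area(&self) -> f64;

    // Generic methods cannot be placed in a vtable. The `where Self: Sized`
    // clause excludes this method from `dyn Shape` while keeping the trait
    // usable as a trait object.
    fn scaled_area<T: Into<f64>>(&self, factor: T) -> f64
    where
        Self: Sized,
    {
        self.area() * factor.into()
    }
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

fn area_via_dyn(shape: &dyn Shape) -> f64 {
    // `shape.scaled_area(2.0_f32)` would be rejected here, because the
    // method requires `Self: Sized` and `dyn Shape` is unsized.
    shape.area()
}
```

<p>The error for the excluded method shows up at the call, not when the <code>dyn Shape</code> value is created.</p>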
<p>Why did we adopt object safety in the first place? If you look back at <a href="https://rust-lang.github.io/rfcs/0255-object-safety.html">RFC 255</a>, the primary motivation for this rule was ergonomics: clearer rules and better error messages. Although I argued for <a href="https://rust-lang.github.io/rfcs/0255-object-safety.html">RFC 255</a> at the time, I don&rsquo;t think these motivations have aged so well. Right now, for example, if you have a trait with a generic method, you get an error when you try to create a <code>dyn Trait</code> value, telling you that you cannot create a <code>dyn Trait</code> from a trait with a generic method. But it may well be clearer to get an error at the point where you try to call that generic method, telling you that you cannot call generic methods through <code>dyn Trait</code>.</p>
<p>Another motivation for having <code>dyn Trait</code> implement <code>Trait</code> was that one could write a generic function with <code>T: Trait</code> and have it work equally well for object types. That capability <em>is</em> useful, but because you have to write <code>T: ?Sized</code> to take advantage of it, it only really works if you plan carefully. In practice what I&rsquo;ve found works much better is to implement <code>Trait</code> for <code>&amp;dyn Trait</code>.</p>
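<p>A minimal sketch of that last pattern, using a made-up <code>Greet</code> trait: a forwarding impl for <code>&amp;dyn Greet</code> lets an ordinary generic function, written without <code>?Sized</code>, accept trait objects as well as concrete types.</p>

```rust
trait Greet {
    fn greet(&self) -> String;
}

struct English;

impl Greet for English {
    fn greet(&self) -> String {
        "hello".to_string()
    }
}

// Forwarding impl: a shared reference to any `dyn Greet` is itself a `Greet`.
impl<'a> Greet for &'a dyn Greet {
    fn greet(&self) -> String {
        // `**self` is the underlying `dyn Greet`, so this dispatches through
        // the vtable rather than recursing into this impl.
        (**self).greet()
    }
}

// An ordinary generic function: no `T: ?Sized` planning required.
fn greet_twice<T: Greet>(value: T) -> String {
    format!("{} {}", value.greet(), value.greet())
}
```

<p>Both <code>greet_twice(English)</code> and <code>greet_twice(obj)</code> with <code>obj: &amp;dyn Greet</code> type-check, because <code>&amp;dyn Greet</code> is a sized type that implements the trait.</p>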
<h2 id="what-would-it-mean-to-remove-the-rule-that-dyn-asynciterator-asynciterator">What would it mean to remove the rule that <code>dyn AsyncIterator: AsyncIterator</code>?</h2>
<p>I think the new system would be something like this&hellip;</p>
<ul>
<li>You can always<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> create a <code>dyn Foo</code> value. The <code>dyn Foo</code> type would define inherent methods based on the trait <code>Foo</code> that use dynamic dispatch, but with some changes:
<ul>
<li>Async functions and other methods defined with <code>-&gt; impl Trait</code> return <code>-&gt; dyn Trait</code> instead.</li>
<li>Generic methods, methods referencing <code>Self</code>, and other such cases are excluded. These cannot be handled with virtual dispatch.</li>
</ul>
</li>
<li>If <code>Foo</code> is <a href="https://doc.rust-lang.org/reference/items/traits.html#object-safety">object safe</a> using today&rsquo;s rules, <code>dyn Foo: Foo</code> holds. Otherwise, it does not.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>
<ul>
<li>On a related but orthogonal note, I would like to make a <code>dyn</code> keyword required to declare dyn safety.</li>
</ul>
</li>
</ul>
<h2 id="implications-of-removing-that-rule">Implications of removing that rule</h2>
<p>This implies that <code>dyn AsyncIterator</code> (or any trait with async functions/RPITIT<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>) will not implement <code>AsyncIterator</code>. So if I write this function&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">use_any</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">I</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;I cannot use it with <code>I = dyn AsyncIterator</code>. You can see why: it calls <code>next</code> and assumes the result is <code>Sized</code> (as promised by the trait), so it doesn&rsquo;t add any kind of <code>.box</code> directive (and it shouldn&rsquo;t have to).</p>
<p>What you <em>can</em> do is implement a wrapper type that encapsulates the boxing:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">BoxingAsyncIterator</span><span class="o">&lt;</span><span class="na">&#39;i</span><span class="p">,</span><span class="w"> </span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">iter</span>: <span class="kp">&amp;</span><span class="na">&#39;i</span> <span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;i</span><span class="p">,</span><span class="w"> </span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">BoxingAsyncIterator</span><span class="o">&lt;</span><span class="na">&#39;i</span><span class="p">,</span><span class="w"> </span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">iter</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">box</span><span class="p">.</span><span class="k">await</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;and then you can call <code>use_any(&amp;mut BoxingAsyncIterator::new(ai))</code>.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
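<p>The wrapper idea can be tried out in today&rsquo;s Rust (1.75 or later, for async functions in traits) under one big assumption: since <code>dyn AsyncIterator</code> and <code>.box</code> do not exist yet, the sketch substitutes a hypothetical dyn-compatible trait, <code>DynAsyncIterator</code>, whose <code>next_boxed</code> method returns a boxed future. <code>Counter</code>, <code>NoopWake</code>, and <code>block_on</code> are likewise illustrative helpers.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The statically dispatched trait from the post.
trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

// Hypothetical dyn-compatible twin, standing in for `dyn AsyncIterator` + `.box`.
trait DynAsyncIterator {
    type Item;
    fn next_boxed(&mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + '_>>;
}

// Wrapper that encapsulates the boxing and implements the static trait.
struct BoxingAsyncIterator<'i, T> {
    iter: &'i mut dyn DynAsyncIterator<Item = T>,
}

impl<'i, T> AsyncIterator for BoxingAsyncIterator<'i, T> {
    type Item = T;

    async fn next(&mut self) -> Option<T> {
        self.iter.next_boxed().await
    }
}

// Illustrative iterator that counts down to zero.
struct Counter {
    remaining: u32,
}

impl DynAsyncIterator for Counter {
    type Item = u32;

    fn next_boxed(&mut self) -> Pin<Box<dyn Future<Output = Option<u32>> + '_>> {
        Box::pin(async move {
            if self.remaining == 0 {
                return None;
            }
            self.remaining -= 1;
            Some(self.remaining)
        })
    }
}

// Generic code that assumes a sized future, as promised by `AsyncIterator`.
async fn use_any<I>(x: &mut I) -> Option<I::Item>
where
    I: ?Sized + AsyncIterator,
{
    x.next().await
}

// Waker that does nothing; enough for futures that never return `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, for demonstration purposes only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = std::pin::pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>With these definitions, <code>use_any</code> never learns that dynamic dispatch is involved: it sees only <code>BoxingAsyncIterator</code>, which is sized and implements <code>AsyncIterator</code>.</p>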
<h2 id="limitation-what-if-you-wanted-to-do-stack-allocation">Limitation: what if you wanted to do stack allocation?</h2>
<p>One of the goals with the <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/">previous proposal</a> was to allow you to write code that used <code>dyn AsyncIterator</code> which worked equally well in std and no-std environments. I would say that goal was partially achieved. The core idea was that the caller would choose the strategy by which the future got allocated, and so it could opt to use inline allocation (and thus be no-std compatible) or use boxing (and thus be simple).</p>
<p>In this proposal, the call-site has to choose. You might think then that you could just choose to use stack allocation at the call-site and thus be no-std compatible. But how does one choose stack allocation? It&rsquo;s actually quite tricky! Part of the problem is that async stack frames are stored in structs, and thus we cannot support something like <code>alloca</code> (at least not for values that will be live across an await, which includes any future that is awaited<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>). In fact, even outside of async, using alloca is quite hard! The problem is that a stack is, well, a stack. Ideally, you would do the allocation just before your callee returns, since that&rsquo;s when you know how much memory you need. But at that time, your callee is still using the stack, so your allocation is in the wrong spot.<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup> I personally think we should just rule out the idea of using alloca to do stack allocation.</p>
<p>If we can&rsquo;t use alloca, what can we do? We have a few choices. In the very beginning, I talked about the idea of a <code>maybe_box</code> function that would take a buffer and fall back to a box only for really large values. That&rsquo;s kind of nifty, but it still relies on a box fallback, so it doesn&rsquo;t really work for no-std.<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup> Might be a nice alternative to <a href="https://twitter.com/theinedibleholk/status/1557802452069388288">stackfuture</a> though!<sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup></p>
<p>You can also achieve inlining by writing wrapper types (<a href="https://github.com/nikomatsakis/dyner/blob/8086d4a16f68a2216ddff5c03c8c5b3d94ed93a2/src/dyn_async_iter.rs#L4-L6">something tmandry and I prototyped some time back</a>), but the challenge then is that your callee doesn&rsquo;t accept a <code>&amp;mut dyn AsyncIterator</code>, it accepts something like <code>&amp;mut DynAsyncIter</code>, where <code>DynAsyncIter</code> is a struct that you defined to do the wrapping.</p>
<p><strong>All told, I think the answer in reality would be: If you want to be used in a no-std environment, you don&rsquo;t use <code>dyn</code> in your public interfaces. Just use <code>impl AsyncIterator</code>. You can use hacks like the wrapper types internally if you really want dynamic dispatch.</strong></p>
<h2 id="question-how-much-room-is-there-for-the-compiler-to-get-clever">Question: How much room is there for the compiler to get clever?</h2>
<p>One other concern I had in thinking about this proposal was that it seemed like it was <em>overspecified</em>. That is, the vast majority of call-sites in this proposal will be written with <code>.box</code>, which thus specifies that they should allocate a box to store the result. But what about ideas like caching the box across invocations, or &ldquo;best effort&rdquo; stack allocation? Where do they fit in? From what I can tell, those optimizations are still possible, so long as the <code>Box</code> which would be allocated doesn&rsquo;t escape the function (which was the same condition we had before).</p>
<p>The way to think of it: by writing <code>foo().box.await</code>, the user told us to use the boxing allocator to box the return value of <code>foo</code>. But we can then see that this result is passed to await, which takes ownership and later frees it. We can thus decide to substitute a different allocator, perhaps one that reuses the box across invocations, or tries to use stack memory; this is fine so long as we modify the freeing code to match. Doing this relies on knowing that the allocated value is immediately returned to us and that it never leaves our control.</p>
<h2 id="conclusion">Conclusion</h2>
<p>To sum up, I think for most users this design would work like so&hellip;</p>
<ul>
<li>You can use <code>dyn</code> with traits that have async functions, but you have to write <code>.box</code> every time you call a method.</li>
<li>You get to use <code>.box</code> in other places too, and we gain at least <em>some</em> support for unsized return values.<sup id="fnref:12"><a href="#fn:12" class="footnote-ref" role="doc-noteref">12</a></sup></li>
<li>If you want to write code that is sometimes using dyn and sometimes using static dispatch, you&rsquo;ll have to write some awkward wrapper types.<sup id="fnref:13"><a href="#fn:13" class="footnote-ref" role="doc-noteref">13</a></sup></li>
<li>If you are writing no-std code, use <code>impl Trait</code>, not <code>dyn Trait</code>; if you must use <code>dyn</code>, it&rsquo;ll require wrapper types.</li>
</ul>
<p>Initially, I dismissed call-site allocation because it violated <code>dyn Trait: Trait</code> and it didn&rsquo;t allow code to be written with <code>dyn</code> that could work in both std and no-std. But I think that violating <code>dyn Trait: Trait</code> may actually be good, and I&rsquo;m not sure how important that latter constraint truly is. Furthermore, I think that <code>Boxing::new</code> and the various &ldquo;dyn adapters&rdquo; are probably going to be pretty confusing for users, but writing <code>.box</code> on a call-site is relatively easy to explain (&ldquo;we don&rsquo;t know what future you need, so you have to box it&rdquo;). So now it seems a lot more appealing to me, and I&rsquo;m grateful to <a href="https://github.com/PoignardAzur">Olivier Faure</a> for bringing it up again.</p>
<p>One possible extension would be to permit users to specify the type of each returned future in some way. As I was finishing up this post, I saw that <a href="https://internals.rust-lang.org/t/blog-series-dyn-async-in-traits-continues/17403/50?u=nikomatsakis">matthieum posted an intriguing idea</a> in this direction on the internals thread. In general, I do see a need for some kind of &ldquo;trait adapters&rdquo;, such that you can take a base trait like <code>Iterator</code> and &ldquo;adapt&rdquo; it in various ways, e.g. producing a version that uses async methods, or which is const-safe. This has some pretty heavy overlap with the whole <a href="https://blog.yoshuawuyts.com/announcing-the-keyword-generics-initiative/">keyword generics</a> initiative too. I think it&rsquo;s a good extension to think about, but it wouldn&rsquo;t be part of the &ldquo;MVP&rdquo; that we ship first.</p>
<h2 id="thoughts">Thoughts?</h2>
<p>Please leave comments in <a href="https://internals.rust-lang.org/t/blog-series-dyn-async-in-traits-continues/17403/40">this internals thread</a>, thanks!</p>
<h2 id="appendix-a-the-output-associated-type">Appendix A: the <code>Output</code> associated type</h2>
<p>Here is an interesting thing! The <code>FnOnce</code> trait, implemented by all callable things, defines its associated type <a href="https://doc.rust-lang.org/std/ops/trait.FnOnce.html#associatedtype.Output"><code>Output</code></a> as <code>Sized</code>! We have to change this if we want to allow unsized return values.</p>
<p>In theory, this could be a big backwards compatibility hazard. Code that writes <code>F::Output</code> can assume, based on the trait, that the return value is sized &ndash; so if we remove that bound, the code will no longer build!</p>
<p>Fortunately, I think this is ok. We&rsquo;ve deliberately restricted the fn types so you can only use them with the <code>()</code> notation, e.g., <code>where F: FnOnce()</code> or <code>where F: FnOnce() -&gt; ()</code>. Both of these forms expand to something which explicitly specifies <code>Output</code>, like <code>F: FnOnce&lt;(), Output = ()&gt;</code>. What this means is that even if you write really generic code&hellip;</p>
<pre tabindex="0"><code class="language-rust=" data-lang="rust=">fn foo&lt;F, R&gt;(f: F)
where
    F: FnOnce&lt;(), Output = R&gt;
{
    let value: F::Output = f();
    ...
}
</code></pre><p>&hellip;when you write <code>F::Output</code>, that is actually normalized to <code>R</code>, and the type <code>R</code> has its own (implicit) <code>Sized</code> bound.</p>
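<p>The same point can be checked on stable Rust, where only the <code>()</code> notation is available. In this sketch (the function name <code>call_and_store</code> is made up for illustration), <code>F::Output</code> is written explicitly and is accepted as the type of a local variable only because it normalizes to the implicitly sized <code>R</code>.</p>

```rust
// Stable-Rust version of the idea: `FnOnce() -> R` is the notation allowed
// on stable, and it always pins down `Output`.
fn call_and_store<F, R>(f: F) -> R
where
    F: FnOnce() -> R,
{
    // `F::Output` normalizes to `R`, and `R` carries an implicit `Sized`
    // bound, which is what lets `value` live in a local variable.
    let value: F::Output = f();
    value
}
```

<p>So code like this would keep compiling even if the <code>Sized</code> bound on <code>FnOnce::Output</code> itself were relaxed.</p>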
<p>(There was actually a recent unsoundness related to this bound, <a href="https://github.com/rust-lang/rust/pull/100096">closed by this PR</a>, and we <a href="https://rust-lang.zulipchat.com/#narrow/stream/326866-t-types.2Fnominated/topic/.23100096.3A.20a.20fn.20pointer.20doesn't.20implement.20.60Fn.60.2F.60FnMut.60.2F.60FnOnc.E2.80.A6/near/297797248">discussed exactly this forwards compatibility question on Zulip.</a>)</p>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I can hear you now: &ldquo;but what about alloca!&rdquo; I&rsquo;ll get there.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The <code>box foo</code> operator supported by the compiler has no current path to stabilization. There were earlier plans (see <a href="https://github.com/rust-lang/rfcs/pull/809">RFC 809</a> and <a href="https://rust-lang.github.io/rfcs/1228-placement-left-arrow.html">RFC 1228</a>), but we ultimately abandoned those efforts. Part of the problem, in fact, was that the precedence of <code>box foo</code> made for bad ergonomics: <code>foo.box</code> works much better.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>If you try to await a <code>Box&lt;dyn Future&gt;</code> today, you <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=b981b7eafee70cc39f70176f6b135023">get an error that it needs to be pinned</a>. I think we can solve that by implementing <code>IntoFuture</code> for <code>Box&lt;dyn Future&gt;</code> and having that convert it to <code>Pin&lt;Box&lt;dyn Future&gt;&gt;</code>.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Or almost always? I may be overlooking some edge cases.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Internally in the compiler, this would require modifying the definition of MIR to make &ldquo;dyn dispatch&rdquo; more first-class.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Don&rsquo;t know what RPITIT stands for?! &ldquo;Return position impl trait in traits!&rdquo; Get with the program!&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>This is basically what the &ldquo;magical&rdquo; <code>Boxing::new</code> would have done for you in the older proposal.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p><a href="https://internals.rust-lang.org/t/blog-series-dyn-async-in-traits-continues/17403/52?u=nikomatsakis">Brief explanation of why async and alloca don&rsquo;t mix here.</a>&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>I was told Ada compilers will allocate the memory at the top of the stack, copy it over to the start of the function&rsquo;s area, and then pop what&rsquo;s left. Theoretically possible!&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>You could imagine a version that aborted the code if the size is wrong, too, which would make it no-std safe, but not in a reliable way (aborts == yuck).&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p>Conceivably you could set the size to <code>size_of::&lt;SomeOtherType&gt;()</code> to automatically determine how much space is needed.&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:12">
<p>I say <em>at least some</em> because I suspect many details of the more general case would remain unstable until we gain more experience.&#160;<a href="#fnref:12" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:13">
<p>You have to write awkward wrapper types <em>for now</em>, anyway. I&rsquo;m intrigued by ideas about how we could make that more automatic, but I think it&rsquo;s way out of scope here.&#160;<a href="#fnref:13" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">What I meant by the "soul of Rust"</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/09/19/what-i-meant-by-the-soul-of-rust/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/09/19/what-i-meant-by-the-soul-of-rust/</id><published>2022-09-19T00:00:00+00:00</published><updated>2022-09-19T10:15:00-04:00</updated><content type="html"><![CDATA[<p>Re-reading my <a href="https://smallcultfollowing.com/babysteps/blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/">previous post</a>, I felt I should clarify why I called it the “soul of Rust”. The soul of Rust, to my mind, is definitely <strong>not</strong> being explicit about allocation. Rather, it’s about the struggle between a few key values, especially <em>productivity</em> and <em>versatility</em><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> in tension with <em>transparency</em>. Rust’s goal has always been to <em>feel</em> like a high-level language but with the performance and control of a <em>low-level</em> one. Oftentimes, we are able to find a <a href="https://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/">“third way” that removes the tradeoff</a>, solving both goals pretty well. But finding those “third ways” takes time, and sometimes we just have to accept a certain hit to one value or another for the time being to make progress. It’s exactly at these times, when we have to make a difficult call, that questions about the “soul of Rust” start to come into play. I’ve been thinking about this a lot, so I thought I would write a post that expands on the role of transparency in Rust, and some of the tensions that arise around it.</p>
<h2 id="why-do-we-value-transparency">Why do we value transparency?</h2>
<p>From the <a href="https://rustacean-principles.netlify.app/how_rust_empowers/transparent.html">draft Rustacean Principles</a>:</p>
<blockquote>
<p>🔧 Transparent: &ldquo;you can predict and control low-level details&rdquo;</p>
</blockquote>
<p>The C language, famously, maps quite closely to how machines typically operate. So much so that people have sometimes called it “portable assembly”.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> Both C++ and Rust are trying to carry on that tradition, but to add on higher levels of abstraction. Inevitably, this leads to tension. Operator overloading, for example, makes figuring out what <code>a + b</code> does more difficult.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<h2 id="transparency-gives-you-control">Transparency gives you control</h2>
<p>Transparency doesn’t automatically give high performance, but it does give control. This helps when crafting your system, since you can set it up to do what you want, but it also helps when analyzing its performance or debugging. There’s nothing more frustrating than staring at code for hours and hours only to realize that the source of your problem isn’t anywhere in the code you can see: it lies in some invisible interaction that wasn’t made explicit.</p>
<h2 id="transparency-can-cost-performance">Transparency can cost performance</h2>
<p>The flip-side of transparency is overspecification. The more directly your program maps to assembly, the less room the compiler and runtime have to do clever things, which can lead to lower performance. In Rust, we are always looking for places where we can be <em>less</em> transparent in order to gain performance — but only up to a point. One example is struct layout: the Rust compiler retains the freedom to reorder fields in a struct, enabling us to make more compact data structures. That’s less transparent than C, but usually not in a way that you care about. (And, of course, if you want to specify the order of your fields, we offer the <code>#[repr]</code> attribute.)</p>
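<p>The struct layout point is easy to observe. In the sketch below (struct and function names are made up), the two structs have identical fields, but only the <code>#[repr(C)]</code> one is guaranteed to keep declaration order. The sizes in the comments assume a typical target where <code>u32</code> has 4-byte alignment, and the default layout is unspecified, so its size is what current compilers tend to produce, not a promise.</p>

```rust
use std::mem::size_of;

// Default representation: the compiler may reorder fields to reduce padding
// (typically 8 bytes here, but the layout is unspecified).
#[allow(dead_code)]
struct Reordered {
    a: u8,
    b: u32,
    c: u8,
}

// `#[repr(C)]` pins declaration order, so padding follows the C rules:
// 1 byte + 3 padding + 4 bytes + 1 byte + 3 padding = 12 bytes.
#[allow(dead_code)]
#[repr(C)]
struct DeclaredOrder {
    a: u8,
    b: u32,
    c: u8,
}

// Returns (size with default layout, size with C layout).
fn sizes() -> (usize, usize) {
    (size_of::<Reordered>(), size_of::<DeclaredOrder>())
}
```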
<h2 id="transparency-hurts-versatility-and-productivity">Transparency hurts versatility and productivity</h2>
<p>The bigger price of transparency, though, is versatility. It forces everyone to care about low-level details that may not actually matter to the problem at hand<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. To take an example relevant to dyn async traits: most async Rust systems perform allocations left and right. The fact that a particular call to an async function might invoke <code>Box::new</code> is unlikely to be a performance problem. For those users, selecting a <code>Boxing</code> adapter adds to the overall complexity they have to manage for very little gain. If you’re working on a project where you don’t <em>need</em> peak performance, that’s going to make Rust less appealing than other languages. I’m not saying that’s <em>bad</em>, but it’s a fact.</p>
<h2 id="a-zero-sum-situation">A zero-sum situation…</h2>
<p>At this moment in the design of async traits, we are struggling with a core question here of “how versatile can Rust be”. Right now, it feels like a “zero sum situation”. We can add in something like <code>Boxing::new</code> to preserve transparency, but it’s going to cost us some in versatility — hopefully not too much.</p>
<h2 id="for-now">…for now?</h2>
<p>I do wonder, though, if there’s a “third way” waiting somewhere. I hinted at this a bit in the previous post. At the moment, I don’t know what that third way is, and I think that requiring an explicit adapter is the most practical way forward. But it seems to me that it’s not a perfect sweet spot yet, and I am hopeful we’ll be able to subsume it into something more general.</p>
<p>Some ingredients that might lead to a ‘third way’:</p>
<ul>
<li><em>With-clauses or capabilities:</em> I am intrigued by the idea of [with-clauses] and the general idea of scoped capabilities. We might be able to think about the “default adapter” as something that gets specified via a with-clause?</li>
<li><em>Const evaluation:</em> One of the niftier uses for const evaluation is for “meta-programming” that customizes how Rust is compiled. For example, we could potentially let you write a <code>const fn</code> that creates the vtable data structure for a given trait.</li>
<li><em>Profiles and portability:</em> Can we find a better way to identify the kinds of transparency that you want, perhaps via some kind of ‘profiles’? I feel we already have ‘de facto’ profiles right now, but we don’t recognize them. “No std” is a clear example, but another would be the set of operating systems or architectures that you try to support. Recognizing that different users have different needs, and giving people a way to choose which one fits them best, might allow us to be more supportive of all our users. Then again, it might make Rust “modal” and more confusing.</li>
</ul>
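<p>As a rough illustration of the const-evaluation ingredient, here is a sketch of a <code>const fn</code> that builds a vtable-like table at compile time. It uses only what stable Rust offers today. The <code>Named</code> trait, the <code>VTable</code> struct, and <code>vtable_for</code> are hypothetical names invented for this example, not a proposed design.</p>

```rust
use std::mem::size_of;

/// Hypothetical trait standing in for "a trait we want a vtable for".
pub trait Named {
    fn name() -> &'static str;
}

/// Hand-rolled table of per-type data, analogous to a vtable.
pub struct VTable {
    pub name: fn() -> &'static str,
    pub size: usize,
}

/// Builds the table for any `T: Named`; usable in const contexts.
pub const fn vtable_for<T: Named>() -> VTable {
    VTable {
        name: T::name,
        size: size_of::<T>(),
    }
}

pub struct Dog(pub u32);

impl Named for Dog {
    fn name() -> &'static str {
        "Dog"
    }
}

/// Evaluated by the compiler; nothing is constructed at runtime.
pub const DOG_VTABLE: VTable = vtable_for::<Dog>();
```

<p>Calling <code>(DOG_VTABLE.name)()</code> goes through the function pointer stored in the table. A real design would need the compiler to consult such a function when it builds the vtable for a <code>dyn</code> type, which is the part that does not exist today.</p>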
<h3 id="comments">Comments?</h3>
<p>Please leave comments in <a href="https://internals.rust-lang.org/t/blog-series-dyn-async-in-traits-continues/17403">this internals thread</a>. Thanks!</p>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I didn’t write about versatility in my original post: instead I focused on the hit to productivity. But as I think about it now, versatility is really what’s at play here — versatility really meant that Rust was useful for high-level things <em>and</em> low-level things, and I think that requiring an explicit dyn adaptor is unquestionably a hit against being high-level. Interestingly, I put versatility <em>after</em> transparency in the list, meaning that it was lower priority, and that seems to back up the decision to have some kind of explicit adaptor.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>At this point, some folks point out all the myriad subtleties and details that are actually hidden in C code. Hush you.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>I remember a colleague at a past job discovering that somebody had overloaded the <code>-&gt;</code> operator in our codebase. They sent out an angry email, “When does it stop? Must I examine every dot and squiggle in the code?” (NB: Rust supports overloading the deref operator.)&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Put another way, being transparent about one thing can make other things more obscure (“can’t see the forest for the trees”).&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Dyn async traits, part 8: the soul of Rust</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/09/18/dyn-async-traits-part-8-the-soul-of-rust/</id><published>2022-09-18T00:00:00+00:00</published><updated>2022-09-18T13:49:00-04:00</updated><content type="html"><![CDATA[<p>In the last few months, Tyler Mandry and I have been circulating a <a href="https://hackmd.io/@nikomatsakis/SJ2-az7sc">“User’s Guide from the Future”</a> that describes our current proposed design for async functions in traits. In this blog post, I want to deep dive on one aspect of that proposal: how to handle dynamic dispatch. My goal here is to explore the space a bit and also to address one particularly tricky topic: how explicit do we have to be about the possibility of allocation? This is a tricky topic, and one that gets at that core question: what is the soul of Rust?</p>
<h3 id="the-running-example-trait">The running example trait</h3>
<p>Throughout this blog post, I am going to focus exclusively on this example trait, <code>AsyncIterator</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And we’re particularly focused on the scenario where we are invoking <code>next</code> via dynamic dispatch:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">make_dyn</span><span class="o">&lt;</span><span class="no">AI</span>: <span class="nc">AsyncIterator</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ai</span>: <span class="nc">AI</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">use_dyn</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">ai</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;— coercion from `&amp;mut AI` to `&amp;mut dyn AsyncIterator`
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">use_dyn</span><span class="p">(</span><span class="n">di</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">di</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;— this call right here!
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Even though I’m focusing the blog post on this particular snippet of code, everything I’m talking about is applicable to any trait with methods that return <code>impl Trait</code> (async functions themselves being a shorthand for a function that returns <code>impl Future</code>).</p>
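<p>To spell out that shorthand, here is a sketch of the same trait written with an explicit <code>impl Future</code> return type, using features that are available on recent stable Rust. The <code>Counter</code> type and the tiny <code>block_on</code> helper are assumptions added so the example can run without an async runtime.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// The trait written without `async fn`: `next` returns `impl Future`.
pub trait AsyncIterator {
    type Item;
    fn next(&mut self) -> impl Future<Output = Option<Self::Item>>;
}

/// Hypothetical iterator that yields 1, 2, ..., `limit`.
pub struct Counter {
    pub current: u32,
    pub limit: u32,
}

impl AsyncIterator for Counter {
    type Item = u32;

    // An `async fn` in the impl satisfies the `impl Future` signature.
    async fn next(&mut self) -> Option<u32> {
        if self.current < self.limit {
            self.current += 1;
            Some(self.current)
        } else {
            None
        }
    }
}

/// Waker that does nothing; enough for futures that never really suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: polls the future in a loop until it completes.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>Each implementation gets its own anonymous future type here, which is exactly why <code>dyn</code> is hard: the caller of a <code>dyn AsyncIterator</code> cannot name that type or know its size.</p>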
<p>The basic challenge that we have to face is this:</p>
<ul>
<li>The caller function, <code>use_dyn</code>, doesn’t know what impl is behind the <code>dyn</code>, so it needs to allocate a fixed amount of space that works for everybody. It also needs some kind of vtable so it knows what <code>poll</code> method to call.</li>
<li>The callee, <code>AI::next</code>, needs to be able to package up the future for its <code>next</code> function in some way to fit the caller’s expectations.</li>
</ul>
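<p>The size problem is easy to observe directly. In this sketch, <code>small</code>, <code>large</code>, and <code>future_sizes</code> are made-up names. The point is only that two futures with the same output type can differ wildly in size, while a boxed <code>dyn Future</code> is always the same two words.</p>

```rust
use std::future::Future;
use std::mem::{size_of, size_of_val};
use std::pin::Pin;

/// A future with almost no state to keep across suspension points.
pub async fn small() -> u8 {
    1
}

/// A future that must keep a 1 KiB buffer alive across an `.await`.
pub async fn large(seed: u8) -> u8 {
    let buf = [seed; 1024];
    std::future::ready(()).await;
    buf[1023]
}

/// Returns the sizes of: `small`'s future, `large`'s future, a boxed dyn future.
pub fn future_sizes() -> (usize, usize, usize) {
    let boxed = size_of::<Pin<Box<dyn Future<Output = u8>>>>();
    (size_of_val(&small()), size_of_val(&large(7)), boxed)
}
```

<p>A caller that only sees <code>dyn</code> has no way to pick between these sizes at compile time, so it needs either a fixed-size handle like the box or some agreed upper bound.</p>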
<p>The <a href="https://smallcultfollowing.com/babysteps/blog/2021/09/30/dyn-async-traits-part-1/">first blog post in this series</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> explains the problem in more detail.</p>
<h3 id="a-brief-tour-through-the-options">A brief tour through the options</h3>
<p>One of the challenges here is that there are many, many ways to make this work, and none of them is “obviously best”. What follows is, I think, an exhaustive list of the various ways one might handle the situation. If anybody has an idea that doesn’t fit into this list, I’d love to hear it.</p>
<p><strong>Box it.</strong> The most obvious strategy is to have the callee box the future type, effectively returning a <code>Box&lt;dyn Future&gt;</code>, and have the caller invoke the <code>poll</code> method via virtual dispatch. This is what the <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> crate does (although it also boxes for static dispatch, which we don’t have to do).</p>
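<p>Written out by hand, the “box it” strategy looks roughly like the sketch below. The trait name <code>BoxedAsyncIterator</code>, the <code>Counter</code> type, and the <code>block_on</code> helper are assumptions for illustration. This is the general shape of the approach, not the exact code that <code>async-trait</code> generates.</p>

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Dyn-compatible variant: `next` returns a boxed, type-erased future.
pub trait BoxedAsyncIterator {
    type Item;
    fn next<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'a>>;
}

/// Hypothetical iterator that yields 1, 2, ..., `limit`.
pub struct Counter {
    pub current: u32,
    pub limit: u32,
}

impl BoxedAsyncIterator for Counter {
    type Item = u32;

    fn next<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = Option<u32>> + 'a>> {
        // One heap allocation per call: this is the cost under discussion.
        Box::pin(async move {
            if self.current < self.limit {
                self.current += 1;
                Some(self.current)
            } else {
                None
            }
        })
    }
}

/// The caller only sees `dyn`; it polls through the boxed future's vtable.
pub async fn sum_all(di: &mut dyn BoxedAsyncIterator<Item = u32>) -> u32 {
    let mut total = 0;
    while let Some(x) = di.next().await {
        total += x;
    }
    total
}

/// Waker that does nothing; enough for futures that never really suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: polls the future in a loop until it completes.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>Note that <code>sum_all</code> allocates once per item, which is the “repeatedly in a tight loop” case where boxing starts to matter.</p>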
<p><strong>Box it with some custom allocator.</strong> You might want to box the future with a custom allocator.</p>
<p><strong>Box it and cache box in the caller.</strong> For most applications, boxing itself is not a performance problem, unless it occurs repeatedly in a tight loop. Mathias Einwag pointed out that if you have some code that is repeatedly calling <code>next</code> on the same object, you could have that caller cache the box in between calls, and have the callee reuse it. This way you only have to actually allocate once.</p>
<p><strong>Inline it into the iterator.</strong> Another option is to store all the state needed by the function in the <code>AsyncIterator</code> type itself. This is actually what the existing <code>Stream</code> trait does, if you think about it: instead of returning a future, it offers a <code>poll_next</code> method, so that the implementor of <code>Stream</code> effectively <em>is</em> the future, and the caller doesn’t have to store any state. Tyler and I worked out a more general way to do inlining that doesn’t require user intervention, where you basically wrap the <code>AsyncIterator</code> type in another type <code>W</code> that has a field big enough to store the <code>next</code> future. When you call <code>next</code>, this wrapper <code>W</code> stores the future into that field and then returns a pointer to the field, so that the caller only has to poll that pointer. <strong>One problem with inlining things into the iterator is that it only works well for <code>&amp;mut self</code> methods</strong>, since in that case there can be at most one active future at a time. With <code>&amp;self</code> methods, you could have any number of active futures.</p>
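<p>The <code>Stream</code>-style version of this idea can be sketched with a hand-written trait. <code>PollIterator</code>, <code>Counter</code>, and <code>collect_dyn</code> are hypothetical names, and the trait only mirrors the shape of <code>Stream::poll_next</code>. It does not show the more general wrapper <code>W</code>, which needs a self-referential field.</p>

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait shaped like `Stream`: no future is returned at all.
pub trait PollIterator {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Hypothetical iterator that yields 1, 2, ..., `limit`.
pub struct Counter {
    pub current: u32,
    pub limit: u32,
}

impl PollIterator for Counter {
    type Item = u32;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        // All state lives in `self`, so nothing is allocated per call.
        if self.current < self.limit {
            self.current += 1;
            Poll::Ready(Some(self.current))
        } else {
            Poll::Ready(None)
        }
    }
}

/// Waker that does nothing; this sketch never actually suspends.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drives the iterator through dynamic dispatch, collecting every item.
pub fn collect_dyn(it: &mut (dyn PollIterator<Item = u32> + Unpin)) -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut out = Vec::new();
    // Assumes items are always ready; a real caller would yield on `Pending`.
    while let Poll::Ready(Some(x)) = Pin::new(&mut *it).poll_next(&mut cx) {
        out.push(x);
    }
    out
}
```

<p>Because the state sits in the iterator, the <code>&amp;mut</code> borrow is what guarantees there is only one in-flight call at a time, which matches the limitation described above.</p>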
<p><strong>Box it and cache box in the callee.</strong> Instead of inlining the entire future into the <code>AsyncIterator</code> type, you could inline just one pointer-word slot, so that you can cache and reuse the <code>Box</code> that <code>next</code> returns. The upside of this strategy is that the cached box moves with the iterator and can potentially be reused across callers. The downside is that once the caller has finished, the cached box lives on until the object itself is destroyed.</p>
<p><strong>Have caller allocate maximal space.</strong> Another strategy is to have the caller allocate a big chunk of space on the stack, one that should be big enough for every callee. If you know the callees your code will have to handle, and the futures for those callees are close enough in size, this strategy works well. Eric Holk recently released the [stackfuture crate] that can help automate it. <strong>One problem with this strategy is that the caller has to know the size of all its callees.</strong></p>
<p><strong>Have caller allocate some space, and fall back to boxing for large callees.</strong> If you don’t know the sizes of all your callees, or those sizes have a wide distribution, another strategy might be to have the caller allocate some amount of stack space (say, 128 bytes) and then have the callee invoke <code>Box</code> if that space is not enough.</p>
<p><strong>Alloca on the caller side.</strong> You might think you can store the size of the future to be returned in the vtable and then have the caller “alloca” that space — i.e., bump the stack pointer by some dynamic amount. Interestingly, this doesn’t work with Rust’s async model. Async tasks require that the size of the stack frame is known up front.</p>
<p><strong>Side stack.</strong> Similar to the previous suggestion, you could imagine having the async runtimes provide some kind of “dynamic side stack” for each task.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> We could then allocate the right amount of space on this stack. This is probably the most efficient option, but it assumes that the runtime is able to provide a dynamic stack. Runtimes like <a href="https://github.com/embassy-rs/embassy">embassy</a> wouldn’t be able to do this. Moreover, we don’t have any sort of protocol for this sort of thing right now. Introducing a side-stack also starts to “eat away” at some of the appeal of Rust’s async model, which is <a href="https://without.boats/blog/futures-and-segmented-stacks/">designed to allocate the “perfect size stack” up front</a> and avoid the need to allocate a “big stack per task”.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<h3 id="can-async-functions-used-with-dyn-be-normal">Can async functions used with dyn be “normal”?</h3>
<p>One of my initial goals for async functions in traits was that they should feel “as natural as possible”. In particular, I wanted you to be able to use them with dynamic dispatch in just the same way as you would a synchronous function. In other words, I wanted this code to compile, and I would want it to work even if <code>use_dyn</code> were put into another crate (and therefore were compiled with no idea of who is calling it):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">make_dyn</span><span class="o">&lt;</span><span class="no">AI</span>: <span class="nc">AsyncIterator</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ai</span>: <span class="nc">AI</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">use_dyn</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">ai</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">use_dyn</span><span class="p">(</span><span class="n">di</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">di</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>My hope was that we could make this code work <em>just as it is</em> by selecting some kind of default strategy that works most of the time, and then provide ways for you to pick other strategies for those cases where the default strategy is not a good fit. The problem though is that there is no single default strategy that seems “obvious and right almost all of the time”…</p>
<table>
  <thead>
      <tr>
          <th>Strategy</th>
          <th>Downside</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Box it (with default allocator)</td>
          <td>requires allocation, not especially efficient</td>
      </tr>
      <tr>
          <td>Box it with cache on caller side</td>
          <td>requires allocation</td>
      </tr>
      <tr>
          <td>Inline it into the iterator</td>
          <td>adds space to <code>AI</code>, doesn’t work for <code>&amp;self</code></td>
      </tr>
      <tr>
          <td>Box it with cache on callee side</td>
          <td>requires allocation, adds space to <code>AI</code>, doesn’t work for <code>&amp;self</code></td>
      </tr>
      <tr>
          <td>Allocate maximal space</td>
          <td>can’t necessarily use that across crates, requires extensive interprocedural analysis</td>
      </tr>
      <tr>
          <td>Allocate some space, fallback</td>
          <td>uses allocator, requires extensive interprocedural analysis or else random guesswork</td>
      </tr>
      <tr>
          <td>Alloca on the caller side</td>
          <td>incompatible with async Rust</td>
      </tr>
      <tr>
          <td>Side-stack</td>
          <td>requires cooperation from runtime and allocation</td>
      </tr>
  </tbody>
</table>
<h3 id="the-soul-of-rust">The soul of Rust</h3>
<p>This is where we get to the “soul of Rust”. Looking at the above table, the strategy that seems the closest to “obviously correct” is “box it”. It works fine with separate compilation, fits great with Rust’s async model, and it matches what people are doing today in practice. I’ve spoken with a fair number of people who use async Rust in production, and virtually all of them agreed that “box by default, but let me control it” would work great in practice.</p>
<p>And yet, when we floated the idea of using this as the default, Josh Triplett objected strenuously, and I think for good reason. Josh’s core concern was that this would be crossing a line for Rust. Until now, there has been no way to allocate heap memory without some kind of explicit operation (though that operation could be a function call). But if we wanted to make “box it” the default strategy, then you’d be able to write “innocent looking” Rust code that nonetheless <em>is</em> invoking <code>Box::new</code>. In particular, it would be invoking <code>Box::new</code> each time that <code>next</code> is called, to box up the future. But that is very unclear from reading over <code>make_dyn</code> and <code>use_dyn</code>.</p>
<p>As an example of where this might matter, it might be that you are writing some sensitive systems code where allocation is something you always do with great care. It doesn’t mean the code is no-std, it may have access to an allocator, but you still would like to know exactly where you will be doing allocations. Today, you can audit the code by hand, scanning for “obvious” allocation points like <code>Box::new</code> or <code>vec![]</code>. Under this proposal, while it would still be <em>possible</em>, the presence of an allocation in the code is much less obvious. The allocation is “injected” as part of the vtable construction process. To figure out that this will happen, you have to know Rust’s rules quite well, and you also have to know the signature of the callee (because in this case, the vtable is built as part of an implicit coercion). In short, scanning for allocation went from being relatively obvious to requiring a PhD in Rustology. Hmm.</p>
<p>On the other hand, if scanning for allocations is what is important, we could address that in many ways. We could add an “allow by default” lint to flag the points where the “default vtable” is constructed, and you could enable it in your project. This way the compiler would warn you about the possible future allocation. In fact, even today, scanning for allocations is actually much harder than I made it out to be: you can easily see if your function allocates, but you can’t easily see what its callees do. You have to read deeply into all of your dependencies and, if there are function pointers or <code>dyn Trait</code> values, figure out what code is potentially being called. With compiler/language support, we could make that whole process much more first-class and better.</p>
<p>In a way, though, the technical arguments are beside the point. “Rust makes allocations explicit” is widely seen as a key attribute of Rust’s design. In making this change, we would be tweaking that rule to be something like “Rust makes allocations explicit <em>most of the time</em>”. This would be harder for users to understand, and it would introduce doubt as to whether Rust <em>really</em> intends to be the kind of language that can replace C and C++<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>.</p>
<h3 id="looking-to-the-rustacean-design-principles-for-guidance">Looking to the Rustacean design principles for guidance</h3>
<p>Some time back, Josh and I drew up a draft set of design principles for Rust. It’s interesting to look back on them and see what they have to say about this question:</p>
<ul>
<li>⚙️ Reliable: &ldquo;if it compiles, it works&rdquo;</li>
<li>🐎 Performant: &ldquo;idiomatic code runs efficiently&rdquo;</li>
<li>🥰 Supportive: &ldquo;the language, tools, and community are here to help&rdquo;</li>
<li>🧩 Productive: &ldquo;a little effort does a lot of work&rdquo;</li>
<li>🔧 Transparent: &ldquo;you can predict and control low-level details&rdquo;</li>
<li>🤸 Versatile: &ldquo;you can do anything with Rust&rdquo;</li>
</ul>
<p>Boxing by default, to my mind, scores as follows:</p>
<ul>
<li><strong>🐎 Performant: meh.</strong> The real goal with performant is that the cleanest code also runs the <em>fastest</em>. Boxing on every dynamic call doesn’t meet this goal, but something like “boxing with caller-side caching” or “have caller allocate space and fall back to boxing” very well might.</li>
<li><strong>🧩 Productive: yes!</strong> Virtually every production user of async Rust that I’ve talked to has agreed that having code box by default (but giving the option to do something else for tight loops) would be a great sweet spot for Rust.</li>
<li><strong>🔧 Transparent: no.</strong> As I wrote before, understanding when a call may box now requires a PhD in Rustology, so this definitely fails on transparency.</li>
</ul>
<p>(The other principles are not affected in any notable way, I don&rsquo;t think.)</p>
<h3 id="what-the-users-guide-from-the-future-suggests">What the “user’s guide from the future” suggests</h3>
<p>These considerations led Tyler and me to a different design. In the <a href="https://hackmd.io/@nikomatsakis/SJ2-az7sc">“User’s Guide From the Future”</a> document from before, you’ll see that it does not accept the running example just as is. Instead, if you were to compile the example code we’ve been using thus far, you’d get an error:</p>
<pre tabindex="0"><code>error[E0277]: the type `AI` cannot be converted to a
              `dyn AsyncIterator` without an adapter
 --&gt; src/lib.rs:3:23
  |
3 |     use_dyn(&amp;mut ai);
  |                  ^^ adapter required to convert to `dyn AsyncIterator`
  |
  = help: consider introducing the `Boxing` adapter,
    which will box the futures returned by each async fn
3 |     use_dyn(&amp;mut Boxing::new(ai));
                     ++++++++++++  +
</code></pre><p>As the error suggests, in order to get the boxing behavior, you have to opt-in via a type that we called <code>Boxing</code><sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">make_dyn</span><span class="o">&lt;</span><span class="no">AI</span>: <span class="nc">AsyncIterator</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ai</span>: <span class="nc">AI</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">use_dyn</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">Boxing</span>::<span class="n">new</span><span class="p">(</span><span class="n">ai</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//          ^^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">use_dyn</span><span class="p">(</span><span class="n">di</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">di</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under this design, you can only create a <code>&amp;mut dyn AsyncIterator</code> when the caller can verify that the <code>next</code> method returns a type from which a <code>dyn*</code> can be constructed. If that’s not the case, and it’s usually not, you can use the <code>Boxing::new</code> adapter to create a <code>Boxing&lt;AI&gt;</code>. Via some kind of compiler magic that <em>ahem</em> we haven’t fully worked out yet<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>, you could coerce a <code>Boxing&lt;AI&gt;</code> into a <code>dyn AsyncIterator</code>.</p>
<p><strong>The details of the <code>Boxing</code> type need more work<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>, but the basic idea remains the same: require users to make <em>some</em> explicit opt-in to the default vtable strategy, which may indeed perform allocation.</strong></p>
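<p>Since the compiler magic does not exist yet, the closest thing that can be written today is a hand-rolled approximation. In the sketch below, <code>DynAsyncIterator</code> is a hypothetical stand-in for what <code>dyn AsyncIterator</code> would expose, and this <code>Boxing</code> is an ordinary wrapper struct rather than the adapter from the proposal. <code>Counter</code> and <code>block_on</code> exist only to make the example run.</p>

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Statically dispatched trait, as in the running example.
pub trait AsyncIterator {
    type Item;
    fn next(&mut self) -> impl Future<Output = Option<Self::Item>>;
}

/// Hypothetical stand-in for what `dyn AsyncIterator` would expose.
pub trait DynAsyncIterator {
    type Item;
    fn next<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'a>>;
}

/// Hand-written approximation of the `Boxing` adapter.
pub struct Boxing<AI>(AI);

impl<AI> Boxing<AI> {
    pub fn new(ai: AI) -> Self {
        Boxing(ai)
    }
}

impl<AI: AsyncIterator> DynAsyncIterator for Boxing<AI> {
    type Item = AI::Item;

    fn next<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'a>> {
        // The allocation happens here, and was opted into at `Boxing::new`.
        Box::pin(AsyncIterator::next(&mut self.0))
    }
}

/// Hypothetical iterator that yields 1, 2, ..., `limit`.
pub struct Counter {
    pub current: u32,
    pub limit: u32,
}

impl AsyncIterator for Counter {
    type Item = u32;

    async fn next(&mut self) -> Option<u32> {
        if self.current < self.limit {
            self.current += 1;
            Some(self.current)
        } else {
            None
        }
    }
}

pub async fn use_dyn(di: &mut dyn DynAsyncIterator<Item = u32>) -> Option<u32> {
    di.next().await
}

pub fn make_dyn<AI: AsyncIterator<Item = u32>>(ai: AI) -> Option<u32> {
    let mut boxing = Boxing::new(ai); // explicit opt-in to boxing
    block_on(use_dyn(&mut boxing))
}

/// Waker that does nothing; enough for futures that never really suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: polls the future in a loop until it completes.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>The difference from the proposal is that here the caller has to name a second trait, whereas the design described above would let <code>Boxing&lt;AI&gt;</code> coerce to <code>dyn AsyncIterator</code> itself.</p>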
<h3 id="how-does-boxing-rank-on-the-design-principles">How does <code>Boxing</code> rank on the design principles?</h3>
<p>To my mind, adding the <code>Boxing</code> adapter ranks as follows…</p>
<ul>
<li><strong>🐎 Performant: meh.</strong> This is roughly the same as before. We’ll come back to this.</li>
<li><strong>🥰 Supportive: yes!</strong> The error message guides you to exactly what you need to do, and hopefully links to a well-written explanation that can help you learn about why this is required.</li>
<li><strong>🧩 Productive: meh.</strong> Having to add a <code>Boxing::new</code> call each time you create a <code>dyn AsyncIterator</code> is not great, but also on-par with other Rust papercuts.</li>
<li><strong>🔧 Transparent: yes!</strong> It is easy to see that boxing may occur in the future now.</li>
</ul>
<p>This design is now transparent. It’s also less productive than before, but we’ve tried to make up for it with supportiveness. “Rust isn’t always easy, but it’s always helpful.”</p>
<h3 id="improving-performance-with-a-more-complex-abi">Improving performance with a more complex ABI</h3>
<p>One thing that bugs me about the “box by default” strategy is that the performance is only “meh”. I like stories like <code>Iterator</code>, where you write nice code and you get tight loops. It bothers me that writing “nice” async code yields a naive, middling efficiency story.</p>
<p>That said, I think this is something we could fix in the future, and I think we could fix it backwards compatibly. The idea would be to extend our ABI when doing virtual calls so that the caller has the <em>option</em> to provide some “scratch space” for the callee. For example, we could then do things like analyze the binary to get a good guess as to how much stack space is needed (either by doing dataflow or just by looking at all implementations of <code>AsyncIterator</code>). We could then have the caller reserve stack space for the future and pass a pointer into the callee — the callee would still have the <em>option</em> of allocating, if for example, there wasn’t enough stack space, but it could make use of the space in the common case.</p>
<p>Interestingly, I think that if we did this, we would also be putting some pressure on Rust’s “transparency” story again. While Rust leans heavily on optimizations to get performance, we’ve generally restricted ourselves to simple, local ones like inlining; we don’t require interprocedural dataflow in particular, although of course it helps (and LLVM does it). But getting a good estimate of how much stack space to reserve for potential callees would violate that rule (we’d also need some simple escape analysis, as I describe in <a href="#Appendix-A-futures-that-escape-the-stack-frame">Appendix A</a>). All of this adds up to a bit of ‘performance unpredictability’. Still, I don’t see this as a big problem, particularly since the fallback is just to use <code>Box::new</code>, and as we’ve said, for most users that is perfectly adequate.</p>
<h3 id="picking-another-strategy-such-as-inlining">Picking another strategy, such as inlining</h3>
<p>Of course, maybe you don’t want to use <code>Boxing</code>. It would also be possible to construct other kinds of adapters, and they would work in a similar fashion. For example, an inlining adapter might look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">make_dyn</span><span class="o">&lt;</span><span class="no">AI</span>: <span class="nc">AsyncIterator</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ai</span>: <span class="nc">AI</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">use_dyn</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">InlineAsyncIterator</span>::<span class="n">new</span><span class="p">(</span><span class="n">ai</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           ^^^^^^^^^^^^^^^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>InlineAsyncIterator&lt;AI&gt;</code> type would add the extra space to store the future, so that when the <code>next</code> method is called, it writes the future into its own fields and then returns it to the caller. Similarly, a cached box adapter might be <code>&amp;mut CachedAsyncIterator::new(ai)</code>, only it would use a field to cache the resulting <code>Box</code>.</p>
<p>You may have noticed that the inline/cached adapters include the name of the trait. That’s because they aren’t relying on compiler magic like Boxing, but are instead intended to be authored by end-users, and we don’t yet have a way to be generic over any trait definition. (The proposal as we wrote it uses macros to generate an adapter type for any trait you wish to adapt.) This is something I’d love to address in the future. <a href="https://rust-lang.github.io/async-fundamentals-initiative/explainer/inline_async_iter_adapter.html">You can read more about how adapters work here.</a></p>
<h3 id="conclusion">Conclusion</h3>
<p>OK, so let’s put it all together into a coherent design proposal:</p>
<ul>
<li>You cannot coerce from an arbitrary type <code>AI</code> into a <code>dyn AsyncIterator</code>. Instead, you must select an adapter:
<ul>
<li>Typically you want <code>Boxing</code>, which has a decent performance profile and “just works”.</li>
<li>But users can write their own adapters to implement other strategies, such as <code>InlineAsyncIterator</code> or <code>CachingAsyncIterator</code>.</li>
</ul>
</li>
<li>From an implementation perspective:
<ul>
<li>When invoked via dynamic dispatch, async functions return a <code>dyn* Future</code>. The caller can invoke <code>poll</code> via virtual dispatch and invoke the (virtual) drop function when it’s ready to dispose of the future.</li>
<li>The vtable created for <code>Boxing&lt;AI&gt;</code> will allocate a box to store the future <code>AI::next()</code> and use that to create the <code>dyn* Future</code>.</li>
<li>The vtable for other adapters can use whatever strategy they want. <code>InlineAsyncIterator&lt;AI&gt;</code>, for example, stores the <code>AI::next()</code> future into a field in the wrapper, takes a raw pointer to that field, and creates a <code>dyn* Future</code> from this raw pointer.</li>
</ul>
</li>
<li>Possible future extension for better performance:<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>
<ul>
<li>We modify the ABI for async trait functions (or any trait function using return-position impl trait) to allow the caller to optionally provide stack space. The <code>Boxing</code> adapter, if such stack space is available, will use it to avoid boxing when it can. This would have to be coupled with some compiler analysis to figure out how much stack space to pre-allocate.</li>
</ul>
</li>
</ul>
<p>This lets us express virtually any pattern. It’s even <em>possible</em> to express side-stacks, if the runtime provides a suitable adapter (e.g., <code>TokioSideStackAdapter::new(ai)</code>), though if side-stacks become popular I would rather consider a more standard means to expose them.</p>
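<p>To make the implementation bullets above a bit more concrete, here is a minimal sketch of what the <code>Boxing</code> strategy amounts to when written by hand on stable Rust. It is not the proposal’s actual mechanism: <code>DynAsyncIterator</code>, <code>Countdown</code>, <code>sum_all</code>, and <code>block_on</code> are hypothetical names introduced only for illustration, <code>Pin&lt;Box&lt;dyn Future&gt;&gt;</code> stands in for <code>dyn* Future</code>, and in the real design the compiler would generate this plumbing in the vtable rather than requiring a second trait.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Statically dispatched trait (a simplified stand-in for `AsyncIterator`).
trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

/// Hypothetical dyn-safe counterpart: `next` returns a boxed, type-erased future.
trait DynAsyncIterator {
    type Item;
    fn next(&mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + '_>>;
}

/// Adapter that picks the "box every future" strategy.
struct Boxing<AI>(AI);

impl<AI: AsyncIterator> DynAsyncIterator for Boxing<AI> {
    type Item = AI::Item;
    fn next(&mut self) -> Pin<Box<dyn Future<Output = Option<AI::Item>> + '_>> {
        // Allocate a box to hold the future returned by `AI::next()`.
        Box::pin(self.0.next())
    }
}

/// Example iterator: counts down from n-1 to 0, then ends.
struct Countdown(u32);

impl AsyncIterator for Countdown {
    type Item = u32;
    async fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0)
        }
    }
}

/// A caller that only knows about the dyn-safe trait.
async fn sum_all(iter: &mut dyn DynAsyncIterator<Item = u32>) -> u32 {
    let mut total = 0;
    while let Some(v) = iter.next().await {
        total += v;
    }
    total
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = std::pin::pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>With these definitions, <code>block_on(sum_all(&amp;mut Boxing(Countdown(4))))</code> sums 3, 2, 1, and 0, allocating one box per call to <code>next</code>. An inline or caching adapter would implement the same dyn-safe trait but obtain the storage for the future differently.</p>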
<p>The main downsides to this proposal are:</p>
<ul>
<li>Users have to write <code>Boxing::new</code>, which is a productivity and learnability hit, but it avoids a big hit to transparency. Is that the right call? I’m still not entirely sure, though my heart increasingly says yes. It’s also something we could revisit in the future (e.g., by adding a default adapter).</li>
<li>If we opt to modify the ABI, we’re adding some complexity there, but in exchange for potentially quite a lot of performance. I would expect us not to do this initially, but to explore it as an extension in the future once we have more data about how important it is.</li>
</ul>
<p>There is one pattern that we can’t express: “have caller allocate maximal space”. This pattern <em>guarantees</em> that heap allocation is not needed; the best we can do is a heuristic that <em>tries</em> to avoid heap allocation, since we have to consider public functions on crate boundaries and the like. To offer a guarantee, the argument type needs to change from <code>&amp;mut dyn AsyncIterator</code> (which accepts any async iterator) to something narrower. This would also support futures that escape the stack frame (see <a href="#appendix-a-futures-that-escape-the-stack-frame">Appendix A</a> below). It seems likely that these details don’t matter, and that either inline futures or heuristics would suffice, but if not, a crate like <a href="https://github.com/microsoft/stackfuture">stackfuture</a> remains an option.</p>
<h3 id="comments">Comments?</h3>
<p>Please leave comments in <a href="https://internals.rust-lang.org/t/blog-series-dyn-async-in-traits-continues/17403">this internals thread</a>. Thanks!</p>
<h3 id="appendix-a-futures-that-escape-the-stack-frame">Appendix A: futures that escape the stack frame</h3>
<p>In all of this discussion, I’ve been assuming that the async call was followed closely by an await. But what happens if the future is not awaited, but instead is moved into the heap or other locations?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;_</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>For boxing, this kind of code doesn’t pose any problem at all. But if we had allocated space on the stack to store the future, examples like this would be a problem. So long as the scratch space is optional, with a fallback to boxing, this is no problem. We can do an escape analysis and avoid the use of scratch space for examples like this.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Written in Sep 2020, egads!&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I was intrigued to learn that this is what Ada does, and that Ada features like returning dynamically sized types are built on this model. I’m not sure how <a href="https://www.adacore.com/about-spark">SPARK</a> and other Ada subsets that target embedded spaces manage that; I’d like to learn more about it.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Of course, without a side stack, we are left using mechanisms like <code>Box::new</code> to cover cases like dynamic dispatch or recursive functions. This becomes a kind of pessimistically sized segmented stack, where we allocate for each little piece of extra state that we need. A side stack might be an appealing middle ground, but because of cases like <code>embassy</code>, it can’t be the only option.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Ironically, C++ itself inserts implicit heap allocations to help with coroutines!&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Suggestions for a better name very welcome.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Pay no attention to the compiler author behind the curtain. 🪄 🌈 Avert your eyes!&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>e.g., if you look closely at the <a href="https://hackmd.io/@nikomatsakis/SJ2-az7sc">User&rsquo;s Guide from the Future</a>, you&rsquo;ll see that it writes <code>Boxing::new(&amp;mut ai)</code>, and not <code>&amp;mut Boxing::new(ai)</code>. I go back and forth on this one.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>I should clarify that, while Tyler and I have discussed this, I don&rsquo;t know how he feels about it. I wouldn&rsquo;t call it &lsquo;part of the proposal&rsquo; exactly, more like an extension I am interested in.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Come contribute to Salsa 2022!</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/08/18/come-contribute-to-salsa-2022/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/08/18/come-contribute-to-salsa-2022/</id><published>2022-08-18T00:00:00+00:00</published><updated>2022-08-18T19:47:00-04:00</updated><content type="html"><![CDATA[<p>Have you heard of the <a href="https://github.com/salsa-rs/salsa">Salsa</a> project? <strong>Salsa</strong> is a library for incremental computation &ndash; it&rsquo;s used by rust-analyzer, for example, to stay responsive as you type into your IDE (we have also <a href="https://rust-lang.zulipchat.com/#narrow/stream/238009-t-compiler.2Fmeetings/topic/.5Bsteering.20meeting.5D.202022-04-15.20compiler-team.23507/near/279082491">discussed using it in rustc</a>, though more work is needed there). We are in the midst of a big push right now to develop and release <strong>Salsa 2022</strong>, a major new revision to the API that will make Salsa far more natural to use. I&rsquo;m writing this blog post both to advertise that ongoing work and to put out a <strong>call for contribution</strong>. Salsa doesn&rsquo;t yet have a large group of maintainers, and I would like to fix that. If you&rsquo;ve been looking for an open source project to try and get involved in, maybe take a look at our <a href="https://github.com/salsa-rs/salsa/issues/305">Salsa 2022 tracking issue</a> and see if there is an issue you&rsquo;d like to tackle?</p>
<h3 id="so-wait-what-does-salsa-do">So wait, <em>what</em> does Salsa do?</h3>
<p>Salsa is designed to help you build programs that respond to rapidly changing inputs. The prototypical example is a compiler, especially an IDE. You&rsquo;d like to be able to do things like &ldquo;jump to definition&rdquo; and keep those results up-to-date even as the user is actively typing. Salsa can help you build programs that manage that.</p>
<p>The key way that Salsa achieves reuse is through memoization. The idea is that you define a function that does some specific computation, let&rsquo;s say it has the job of parsing the input and creating the Abstract Syntax Tree (AST):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">parse_program</span><span class="p">(</span><span class="n">input</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">AST</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Then later I have other functions that might take parts of that AST and operate on them, such as type-checking:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">type_check</span><span class="p">(</span><span class="n">function</span>: <span class="kp">&amp;</span><span class="nc">AstFunction</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In a setup like this, I would like to have it so that when my base input changes, I do have to re-parse but I don&rsquo;t necessarily have to run the type checker. For example, if the only change to my program was to add a comment, then maybe my AST is not affected, and so I don&rsquo;t need to run the type checker again. Or perhaps the AST contains many functions, and only one of them changed, so while I have to type check that function, I don&rsquo;t want to type check the others. Salsa can help you manage this sort of thing automatically.</p>
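<p>To illustrate the kind of reuse described above, here is a toy sketch that uses only the standard library. It does not use Salsa&rsquo;s API at all: the one-function-per-line <code>name = body</code> input format, the <code>Checker</code> type, and its <code>checks_run</code> counter are assumptions made up for this example. The parser runs on every change, but a function is only &ldquo;type checked&rdquo; again when its body differs from the one that was checked last time.</p>

```rust
use std::collections::HashMap;

/// Toy "AST": function name -> body text. Comment lines are dropped,
/// so adding a comment does not change the parsed result.
fn parse_program(input: &str) -> HashMap<String, String> {
    input
        .lines()
        .filter(|line| !line.trim_start().starts_with("//"))
        .filter_map(|line| line.split_once('='))
        .map(|(name, body)| (name.trim().to_string(), body.trim().to_string()))
        .collect()
}

/// Remembers the body each function had when it was last checked.
#[derive(Default)]
struct Checker {
    last_checked: HashMap<String, String>,
    checks_run: usize,
}

impl Checker {
    /// Re-parses every time, but only re-checks functions whose body changed.
    fn compile(&mut self, input: &str) {
        for (name, body) in parse_program(input) {
            if self.last_checked.get(&name) != Some(&body) {
                // Stand-in for running the real type checker on this function.
                self.checks_run += 1;
                self.last_checked.insert(name, body);
            }
        }
    }
}
```

<p>Compiling <code>"f = 1\ng = 2"</code> checks both functions; compiling the same text with an extra comment line checks nothing; changing only the body of <code>g</code> checks just <code>g</code>. Salsa&rsquo;s job is to give you this behavior without hand-writing the bookkeeping.</p>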
<h2 id="what-is-salsa-2022-and-how-is-it-different">What is Salsa 2022 and how is it different?</h2>
<p>The original salsa system was modeled very closely on the rustc query system. As such, it required you to structure your program entirely in terms of functions and queries that called one another. All data was passed through return values. This is a very powerful and flexible system, but it can also be kind of mind-bending sometimes to figure out how to &ldquo;close the loop&rdquo;, particularly if you wanted to get effective re-use, or do lazy computation.</p>
<p>Just looking at the <code>parse_program</code> function we saw before, it was defined to return a complete AST:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">parse_program</span><span class="p">(</span><span class="n">input</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">AST</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But that AST has, internally, a lot of structure. For example, perhaps an AST looks like a set of functions:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Ast</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">functions</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">AstFunction</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">AstFunction</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">name</span>: <span class="nc">Name</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">body</span>: <span class="nc">AstFunctionBody</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">AstFunctionBody</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under the old Salsa, changes were tracked at a pretty coarse-grained level. So if your input changed, and the content of <em>any</em> function body changed, then your entire AST was considered to have changed. If you were naive about it, this would mean that everything would have to be type-checked again. In order to get good reuse, you had to change the structure of your program pretty dramatically from the &ldquo;natural structure&rdquo; that you started with.</p>
<h2 id="enter-tracked-structs">Enter: tracked structs</h2>
<p>The newer Salsa introduces <strong>tracked structs</strong>, which makes this a lot easier. The idea is that you can label a struct as tracked, and now its fields become managed by the database:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[salsa::tracked]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">AstFunction</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">name</span>: <span class="nc">Name</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">body</span>: <span class="nc">AstFunctionBody</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When a struct is declared as tracked, then we also track accesses to its fields. This means that if the parser produces the same <em>set</em> of functions, then its output is considered not to have changed, even if the function bodies are different. When the type checker reads the function body, we&rsquo;ll track that read independently. So if just one function has changed, only that function will be type checked again.</p>
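<p>One way to picture per-field tracking is with revision stamps. The sketch below is a simplified mental model, not how Salsa is implemented and not its API: <code>Db</code>, <code>Field</code>, <code>changed_at</code>, and <code>cached_check</code> are hypothetical names, and the fact that <code>type_check</code> reads only <code>body</code> is hard-coded here, whereas a real system would record reads dynamically.</p>

```rust
/// A field value stamped with the revision in which it last changed.
struct Field {
    value: String,
    changed_at: u64,
}

/// Toy database holding the fields of a single tracked function.
struct Db {
    revision: u64,
    name: Field,
    body: Field,
    /// (revision at which the cached result was last verified, cached result)
    cached_check: Option<(u64, usize)>,
    checks_run: usize,
}

impl Db {
    fn new(name: &str, body: &str) -> Self {
        Db {
            revision: 0,
            name: Field { value: name.to_string(), changed_at: 0 },
            body: Field { value: body.to_string(), changed_at: 0 },
            cached_check: None,
            checks_run: 0,
        }
    }

    /// Every write starts a new revision; a field is only stamped if it really changed.
    fn set_name(&mut self, name: &str) {
        self.revision += 1;
        if self.name.value != name {
            self.name = Field { value: name.to_string(), changed_at: self.revision };
        }
    }

    fn set_body(&mut self, body: &str) {
        self.revision += 1;
        if self.body.value != body {
            self.body = Field { value: body.to_string(), changed_at: self.revision };
        }
    }

    /// Reads only `body`, so only a change to `body` invalidates the cached result.
    fn type_check(&mut self) -> usize {
        if let Some((verified_at, result)) = self.cached_check {
            if self.body.changed_at <= verified_at {
                // Nothing this query read has changed: reuse and re-verify.
                self.cached_check = Some((self.revision, result));
                return result;
            }
        }
        self.checks_run += 1;
        let result = self.body.value.len(); // stand-in for real type checking
        self.cached_check = Some((self.revision, result));
        result
    }
}
```

<p>Renaming the function via <code>set_name</code> bumps the revision but leaves the cached <code>type_check</code> result valid; only <code>set_body</code> with a genuinely different body forces the check to run again.</p>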
<h2 id="goal-relatively-natural">Goal: relatively natural</h2>
<p>The goal of Salsa 2022 is that you should be able to convert a program to use Salsa without dramatically restructuring it. It should still feel quite similar to the &rsquo;natural structure&rsquo; that you would have used if you didn&rsquo;t care about incremental reuse.</p>
<p>Using techniques like tracked structs, you can keep the pattern of a compiler as a kind of &ldquo;big function&rdquo; that passes the input through many phases, while still getting pretty good re-use:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">typical_compiler</span><span class="p">(</span><span class="n">input</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">ast</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse_ast</span><span class="p">(</span><span class="n">input</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">function</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="n">ast</span><span class="p">.</span><span class="n">functions</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">type_check</span><span class="p">(</span><span class="n">function</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Salsa 2022 also has other nice features, such as <a href="https://salsa-rs.github.io/salsa/overview.html#accumulators">accumulators</a> for managing diagnostics and <a href="https://salsa-rs.github.io/salsa/overview.html#interned-structs">built-in interning</a>.</p>
<p>If you&rsquo;d like to learn more about how Salsa works, check out the <a href="https://salsa-rs.github.io/salsa/overview.html">overview page</a> or read through the (WIP) <a href="https://salsa-rs.github.io/salsa/tutorial.html">tutorial</a>, which covers the design of a complete compiler and interpreter.</p>
<h2 id="how-to-get-involved">How to get involved</h2>
<p>As I mentioned, the purpose of this blog post is to serve as a <strong>call for contribution</strong>. Salsa is a cool project but it doesn&rsquo;t have a lot of active maintainers, and we are actively looking to recruit new people.</p>
<p>The <a href="https://github.com/salsa-rs/salsa/issues/305">Salsa 2022 tracking issue</a> contains a list of possible items to work on. Many of those items have mentoring instructions, just search for things tagged with <a href="https://github.com/salsa-rs/salsa/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22+label%3Asalsa-2022">good first issue</a>. There is also <a href="https://salsa-rs.github.io/salsa/plumbing.html">documentation of salsa&rsquo;s internal structure on the main web page</a> that can help you navigate the code base. Finally, we have a <a href="https://salsa.zulipchat.com/">Zulip instance</a> where we hang out and chat (the <a href="https://salsa.zulipchat.com/#narrow/stream/146365-good-first-issue"><code>#good-first-issue</code> stream</a> is a good place to ask for help!)</p>
]]></content></entry><entry><title type="html">Many modes: a GATs pattern</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/06/27/many-modes-a-gats-pattern/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/06/27/many-modes-a-gats-pattern/</id><published>2022-06-27T00:00:00+00:00</published><updated>2022-06-27T10:00:00-04:00</updated><content type="html"><![CDATA[<p>As some of you may know, on May 4th <a href="https://github.com/jackh726/">Jack Huey</a> opened a <a href="https://github.com/rust-lang/rust/pull/96709">PR to stabilize an initial version of generic associated types</a>. The current version is at best an MVP: the compiler support is limited, resulting in unnecessary errors, and the syntax is limited, making code that uses GATs much more verbose than I&rsquo;d like. Nonetheless, I&rsquo;m super excited, since GATs unlock a lot of interesting use cases, and we can continue to smooth out the rough edges over time. However, folks on the thread have raised some <a href="https://github.com/rust-lang/rust/pull/96709#issuecomment-1129311660">strong concerns about GAT stabilization</a>, including asking whether GATs are worth including in the language at all. The fear is that they make Rust the language too complex, and that it would be better to just use them as an internal building block for other, more accessible features (like async functions and return position impl trait in traits).  In response to this concern, a number of people have posted about how they are using GATs. I recently took some time to deep dive into these comments and to write about some of the patterns that I found there, including a pattern I am calling the &ldquo;many modes&rdquo; pattern, which comes from the <a href="https://github.com/zesterer/chumsky/">chumsky</a> parser combinator library. 
I posted about this pattern <a href="https://github.com/rust-lang/rust/pull/96709#issuecomment-1167220240">on the thread</a>, but I thought I would cross-post my write-up here to the blog as well, because I think it&rsquo;s of general interest.</p>
<h3 id="general-thoughts-from-reading-the-examples">General thoughts from reading the examples</h3>
<p>I&rsquo;ve been going through the (many, many) examples that people have posted where they are relying on GATs and looking at them in a bit more detail. A few interesting things jumped out at me as I read through the examples:</p>
<ul>
<li><strong>Many of the use-cases involve GATs with type parameters.</strong> There has been some discussion of stabilizing &ldquo;lifetime-only&rdquo; GATs, but I don&rsquo;t think that makes sense from any angle. It&rsquo;s more complex for the implementation and, I think, more confusing for the user. But also, given that the &ldquo;workaround&rdquo; for not having GATs tends to be higher-ranked trait bounds (HRTB), and given that those only work for lifetimes, it means we&rsquo;re losing one of the primary benefits of GATs in practice (note that I do expect to get HRTB for types in the near-ish future).</li>
<li><strong>GATs allowed libraries to better <em>hide</em> details from their clients.</strong> This is precisely because they could make a trait hierarchy that more directly captured the &ldquo;spirit&rdquo; of the trait, resulting in bounds like <code>M: Mode</code> instead of higher-ranked trait bounds (in some cases, the HRTB would have to be over types, like <code>for&lt;X&gt; M: Mode&lt;X&gt;</code>, which isn&rsquo;t even legal in Rust&hellip;yet).</li>
</ul>
<p>As I read, I felt this fit a pattern that I&rsquo;ve experienced many times but hadn&rsquo;t given a name to: when traits are being used to describe a situation that they don&rsquo;t quite fit, <em>the result is an explosion of where-clauses on the clients</em>. Sometimes you can hide these via supertraits or something, but those complex bounds are still visible in rustdoc, still leak out in error messages, and don&rsquo;t generally &ldquo;stay hidden&rdquo; as well as you&rsquo;d like. You&rsquo;ll see this come up here when I talk about how you would model this pattern in Rust today, but it&rsquo;s a common theme across all examples. <a href="https://github.com/RustAudio/rust-lv2/issues/95">Issue #95 on the <code>RustAudio</code> crate</a> for example says, &ldquo;The first [solution] would be to make <code>PortType</code> generic over a <code>'a</code> lifetime&hellip;however, this has a cascading effect, which would force all downstream users of port types to specify their lifetimes&rdquo;. <a href="https://github.com/rust-lang/rust/pull/96709#issuecomment-1150127168">Pythonesque made a simpler point here</a>, &ldquo;Without GATs, I ended up having to make an Hkt trait that had to be implemented for every type, define its projections, and then make everything heavily parametric and generic over the various conversions.&rdquo;</p>
<h3 id="the-many-modes-pattern-chumsky">The &ldquo;many modes&rdquo; pattern (chumsky)</h3>
<p>The first example I looked at closely was the <a href="https://github.com/rust-lang/rust/pull/96709#issuecomment-1118409546">chumsky parsing library</a>. This is leveraging a pattern that I would call the &ldquo;many modes&rdquo; pattern. The idea is that you have some &ldquo;core function&rdquo; but you want to execute this function in many different modes. Ideally, you&rsquo;d like to define the modes independently from the function, and you&rsquo;d like to be able to add more modes later without having to change the function at all. (If you&rsquo;re familiar with Haskell, monads are an example of this pattern; the monad specifies the &ldquo;mode&rdquo; in which some simple sequential function is executed.)</p>
<p>chumsky is a parser combinator library, so the &ldquo;core function&rdquo; is a parse function, defined in the <code>Parser</code> trait. Each <code>Parser</code> trait impl contains a function that indicates how to parse some particular construct in the grammar. Normally, this parser function builds up a data structure representing the parsed data. But sometimes you don&rsquo;t need the full results of the parse: sometimes you might just like to know if the parse succeeds or fails, without building the parsed version. Thus, the &ldquo;many modes&rdquo; pattern: we&rsquo;d like to be able to define our parser and then execute it against one of two modes, <em>emit</em> or <em>check</em>. The emit mode will build the data structure, but <em>check</em> will just check if the parse succeeds.</p>
<p>In the past, chumsky only had one mode, so they always built the data structure. This could take significant time and memory. Adding the &ldquo;check&rdquo; mode lets them skip that, which is a significant performance win. Moreover, the modes are encapsulated within the library traits, and aren&rsquo;t visible to end-users. Nice!</p>
<h3 id="how-did-chumsky-model-modes-with-gats">How did chumsky model modes with GATs?</h3>
<p>Chumsky added a <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> trait, encapsulated as part of their <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L67"><code>internals</code></a> module. Instead of directly constructing the results from parsing, the <code>Parser</code> impls invoke methods on <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> with closures. This allows the mode to decide which parts of the parsing to execute and which to skip. So, in check mode, the <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> would decide not to execute the closure that builds the output data structure, for example.</p>
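<p>As a rough illustration of the idea of handing closures to the mode, here is a heavily simplified sketch. It is not chumsky&rsquo;s actual definition: the single <code>bind</code> method, the <code>Output</code> associated type, and the <code>parse_digits</code> function are assumptions chosen to keep the example small. The point is only that the generic associated type lets each mode decide what the &ldquo;output&rdquo; of a parse step is, and whether the closure that builds it runs at all.</p>

```rust
/// A mode decides what a parse step produces and whether to build it.
trait Mode {
    /// What a step that would produce `T` yields in this mode.
    type Output<T>;

    /// Run (or skip) the closure that builds the output.
    fn bind<T>(f: impl FnOnce() -> T) -> Self::Output<T>;
}

/// Builds the full parsed value.
struct Emit;

/// Only reports success or failure; never builds anything.
struct Check;

impl Mode for Emit {
    type Output<T> = T;

    fn bind<T>(f: impl FnOnce() -> T) -> Self::Output<T> {
        f()
    }
}

impl Mode for Check {
    type Output<T> = ();

    fn bind<T>(_f: impl FnOnce() -> T) -> Self::Output<T> {
        // The closure is dropped without being called, so nothing is allocated.
    }
}

/// A "core function" written once and run in any mode.
/// On failure, returns the position of the first non-digit character.
fn parse_digits<M: Mode>(input: &str) -> Result<M::Output<Vec<u32>>, usize> {
    if let Some(pos) = input.chars().position(|c| !c.is_ascii_digit()) {
        return Err(pos);
    }
    // Only the mode knows whether this closure actually runs.
    Ok(M::bind(|| {
        input
            .chars()
            .filter_map(|c| c.to_digit(10))
            .collect::<Vec<u32>>()
    }))
}
```

<p>Here <code>parse_digits::&lt;Emit&gt;("123")</code> returns <code>Ok(vec![1, 2, 3])</code>, while <code>parse_digits::&lt;Check&gt;("123")</code> returns <code>Ok(())</code> without ever building the vector. Callers only ever need the bound <code>M: Mode</code>.</p>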
<p>Using this approach, the <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L137"><code>Parser</code></a> trait does indeed have several &rsquo;entrypoint&rsquo; methods, but they are all defaulted and just invoke a common implementation method called <code>go</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Parser</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="n">I</span>: <span class="nc">Input</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="o">?</span><span class="nb">Sized</span><span class="p">,</span><span class="w"> </span><span class="n">E</span>: <span class="nc">Error</span><span class="o">&lt;</span><span class="n">I</span>::<span class="n">Token</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(),</span><span class="w"> </span><span class="n">S</span>: <span class="na">&#39;a</span> <span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">input</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="nc">I</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Output</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="o">&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">go</span>::<span class="o">&lt;</span><span class="n">Emit</span><span class="o">&gt;</span><span class="p">(</span><span class="o">..</span><span class="p">.)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">check</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">input</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="nc">I</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">E</span><span class="o">&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">go</span>::<span class="o">&lt;</span><span class="n">Check</span><span class="o">&gt;</span><span class="p">(</span><span class="o">..</span><span class="p">.)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cp">#[doc(hidden)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">go</span><span class="o">&lt;</span><span class="n">M</span>: <span class="nc">Mode</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">inp</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">InputRef</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="nb">&#39;_</span><span class="p">,</span><span class="w"> </span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="p">,</span><span class="w"> </span><span class="n">S</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">PResult</span><span class="o">&lt;</span><span class="n">M</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Output</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">Self</span>: <span class="nb">Sized</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Implementations of <code>Parser</code> <em>just</em> specify the <code>go</code> method. Note that the impls are, presumably, either contained within <code>chumsky</code> or generated by <code>chumsky</code> proc-macros, so the <code>go</code> method doesn&rsquo;t need to be documented. However, <em>even if <code>go</code> were documented</em>, the <em>trait bounds</em> certainly look quite reasonable. (The type of <code>inp</code> is a bit&hellip;imposing, admittedly.)</p>
<p>So how is the <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> trait defined? Just to focus on the GAT, the trait looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Mode</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, the <code>T</code> represents the result type of &ldquo;some parser parsed in this mode&rdquo;. GATs thus allow us to define a <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> that is <strong>independent</strong> from any particular <code>Parser</code>. There are two impls of <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> (also internal to chumsky):</p>
<ul>
<li><a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L115-L117"><code>Check</code></a>, defined like <code>struct Check; impl Mode for Check { type Output&lt;T&gt; = (); ... }</code>. In other words, no matter what parser you use, <code>Check</code> just builds a <code>()</code> result (success or failure is propagated independently of the mode).</li>
<li><a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L87-L89"><code>Emit</code></a>, defined like <code>struct Emit; impl Mode for Emit { type Output&lt;T&gt; = T; ... }</code>.  In <code>Emit</code> mode, the output is exactly what the parser generated.</li>
</ul>
<p>Note that you could, in theory, produce other modes. For example, a <code>Count</code> mode that not only computes success/failure but counts the number of nodes parsed, or perhaps a mode that computes hashes of the resulting parsed value. Moreover, you could add these modes (and the defaulted methods in <code>Parser</code>) <strong>without breaking any clients</strong>.</p>
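<p>To make the &ldquo;many modes&rdquo; idea concrete, here is a minimal, self-contained sketch. It is not chumsky&rsquo;s actual code: the single <code>bind</code> method is a simplified stand-in for the closure-taking methods described above, and the <code>Count</code> mode and the <code>parse_digits</code> function are hypothetical names made up for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Simplified sketch of the "many modes" pattern; not chumsky's actual code.
trait Mode {
    /// What a parser producing `T` yields when run in this mode.
    type Output&lt;T&gt;;

    /// Build an output; the mode decides whether to run the closure at all.
    fn bind&lt;T, F: FnOnce() -&gt; T&gt;(f: F) -&gt; Self::Output&lt;T&gt;;
}

/// Emit mode: run the closure and keep its value.
struct Emit;
impl Mode for Emit {
    type Output&lt;T&gt; = T;
    fn bind&lt;T, F: FnOnce() -&gt; T&gt;(f: F) -&gt; Self::Output&lt;T&gt; {
        f()
    }
}

/// Check mode: never run the closure, so no output is built.
struct Check;
impl Mode for Check {
    type Output&lt;T&gt; = ();
    fn bind&lt;T, F: FnOnce() -&gt; T&gt;(_f: F) -&gt; Self::Output&lt;T&gt; {}
}

/// Hypothetical third mode: report one node per successful `bind`.
struct Count;
impl Mode for Count {
    type Output&lt;T&gt; = usize;
    fn bind&lt;T, F: FnOnce() -&gt; T&gt;(_f: F) -&gt; Self::Output&lt;T&gt; {
        1
    }
}

/// Hypothetical toy parser: accepts a non-empty run of ASCII digits.
/// Success or failure (the `Option`) does not depend on the mode.
fn parse_digits&lt;M: Mode&gt;(input: &amp;str) -&gt; Option&lt;M::Output&lt;String&gt;&gt; {
    if input.is_empty() || !input.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Only `Emit` actually allocates the `String`.
    Some(M::bind(|| input.to_owned()))
}

fn main() {
    println!("{:?}", parse_digits::&lt;Emit&gt;("2022"));
    println!("{:?}", parse_digits::&lt;Check&gt;("2022"));
    println!("{:?}", parse_digits::&lt;Count&gt;("2022"));
}
</code></pre></div>
<p>In this sketch, adding <code>Count</code> required no change to <code>parse_digits</code>: the parser is written once against <code>M: Mode</code>, and each mode picks its own result type through the GAT.</p>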
<h3 id="how-could-you-model-this-today">How could you model this today?</h3>
<p>I was trying to think how one might model this problem with traits today. All the options I came up with had significant downsides.</p>
<p><strong>Multiple functions on the trait, or multiple traits.</strong> One obvious option would be to use multiple functions in the parse trait, or multiple traits:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Multiple functions
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Parser</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">();</span><span class="w"> </span><span class="k">fn</span> <span class="nf">check</span><span class="p">();</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Multiple traits
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Parser</span>: <span class="nc">Checker</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">fn</span> <span class="nf">parse</span><span class="p">();</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Checker</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">fn</span> <span class="nf">check</span><span class="p">();</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Both of these approaches mean that defining a new combinator requires writing the same logic twice, once for parse and once for check, but with small variations, which is both annoying and a great opportunity for bugs. It also means that if chumsky ever wanted to define a new mode, they would have to modify every implementation of <code>Parser</code> (a breaking change, to boot).</p>
<p><strong>Mode with a type parameter.</strong> You could try defining the mode trait with a type parameter, like so&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">ModeFor</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>go</code> function would then look like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">go</span><span class="o">&lt;</span><span class="n">M</span>: <span class="nc">ModeFor</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Output</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">inp</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">InputRef</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="nb">&#39;_</span><span class="p">,</span><span class="w"> </span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="p">,</span><span class="w"> </span><span class="n">S</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">PResult</span><span class="o">&lt;</span><span class="n">M</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Output</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">Self</span>: <span class="nb">Sized</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>In practice, though, this doesn&rsquo;t really work, for a number of reasons. One of them is that the <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> trait includes methods like <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L74-L78"><code>combine</code></a>, which take the output of many parsers, not just one, and combine them together. Good luck writing that constraint with <code>ModeFor</code>. But even ignoring that, lacking HRTB, the signature of <code>go</code> itself is incomplete. The problem is that, given some impl of <code>Parser</code> for some parser type <code>MyParser</code>, <code>MyParser</code> only knows that <code>M</code> is a valid mode for its particular output. But maybe <code>MyParser</code> plans to (internally) use some other parser combinators that produce different kinds of results. Will the mode <code>M</code> still apply to those? We don&rsquo;t know. We&rsquo;d have to be able to write a HRTB like <code>for&lt;O&gt; Mode&lt;O&gt;</code>, which Rust doesn&rsquo;t support yet:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">go</span><span class="o">&lt;</span><span class="n">M</span>: <span class="nc">for</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Mode</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">inp</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">InputRef</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="nb">&#39;_</span><span class="p">,</span><span class="w"> </span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="p">,</span><span class="w"> </span><span class="n">S</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">PResult</span><span class="o">&lt;</span><span class="n">M</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Output</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">Self</span>: <span class="nb">Sized</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>But even if Rust <em>did</em> support it, you can see that the <code>Mode&lt;T&gt;</code> trait doesn&rsquo;t capture the user&rsquo;s intent as closely as the <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> trait from Chumsky did. The <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> trait was defined independently from all parsers, which is what we wanted. The <code>Mode&lt;T&gt;</code> trait is defined relative to some specific parser, and then it falls to the <code>go</code> function to say &ldquo;oh, I want this to be a mode for <em>all</em> parsers&rdquo; using a HRTB.</p>
<p>Using just HRTB (which, again, Rust doesn&rsquo;t have), you could define <em>another</em> trait&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Mode</span>: <span class="nc">for</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="n">ModeFor</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">ModeFor</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;which would allow us to write <code>M: Mode</code> on <code>go</code> again, but it&rsquo;s hard to argue this is <em>simpler</em> than the original GAT variety. This extra <code>ModeFor</code> trait has a &ldquo;code smell&rdquo; to it: it&rsquo;s hard to understand why it is there. Whereas before, you implemented the <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> trait in just the way you think about it, with a single impl that applies to all parsers&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Mode</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Check</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>&hellip;you now write an impl of <code>ModeFor</code>, where one &ldquo;instance&rdquo; of the impl applies to only one parser (which has output type <code>O</code>). It feels indirect:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="n">ModeFor</span><span class="o">&lt;</span><span class="n">O</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Check</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="how-could-you-model-this-with-rpitit">How could you model this with RPITIT?</h3>
<p>It&rsquo;s also been proposed that we should keep GATs, but only as an implementation detail for things like return position impl Trait in traits (RPITIT) or async functions. This implies that we could model the &ldquo;many modes&rdquo; pattern with RPITIT. If you look at the <a href="https://github.com/zesterer/chumsky/blob/6a82f90ae4c1a4564e024eb0f63121fc7b7d3c18/src/zero_copy/mod.rs#L70"><code>Mode</code></a> trait, though, you&rsquo;ll see that this simply doesn&rsquo;t work. Consider the <code>combine</code> method, which takes the results from two parsers and combines them to form a new result:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">combine</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">U</span><span class="p">,</span><span class="w"> </span><span class="n">V</span><span class="p">,</span><span class="w"> </span><span class="n">F</span>: <span class="nb">FnOnce</span><span class="p">(</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">U</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">V</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span>: <span class="nc">Self</span>::<span class="n">Output</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span>: <span class="nc">Self</span>::<span class="n">Output</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">f</span>: <span class="nc">F</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="o">&lt;</span><span class="n">V</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>How could we write this in terms of a function that returns <code>impl Trait</code>?</p>
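<p>For contrast, here is a sketch of how short the two impls of <code>combine</code> are once <code>Output</code> is a GAT. The <code>Mode</code> trait below is trimmed down to that one method, and the <code>pair</code> helper is a hypothetical caller added for illustration; neither is copied from chumsky. The point to notice is that <code>Output&lt;T&gt;</code>, <code>Output&lt;U&gt;</code> and <code>Output&lt;V&gt;</code> are three instantiations of the <em>same</em> type constructor, appearing in both argument and return position.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Trimmed-down sketch: a `Mode` trait with only `combine`.
trait Mode {
    type Output&lt;T&gt;;

    fn combine&lt;T, U, V, F: FnOnce(T, U) -&gt; V&gt;(
        x: Self::Output&lt;T&gt;,
        y: Self::Output&lt;U&gt;,
        f: F,
    ) -&gt; Self::Output&lt;V&gt;;
}

struct Emit;
impl Mode for Emit {
    type Output&lt;T&gt; = T;

    // In emit mode the outputs are real values, so apply `f` to them.
    fn combine&lt;T, U, V, F: FnOnce(T, U) -&gt; V&gt;(
        x: Self::Output&lt;T&gt;,
        y: Self::Output&lt;U&gt;,
        f: F,
    ) -&gt; Self::Output&lt;V&gt; {
        f(x, y)
    }
}

struct Check;
impl Mode for Check {
    type Output&lt;T&gt; = ();

    // In check mode there is nothing to combine, so `f` is never called.
    fn combine&lt;T, U, V, F: FnOnce(T, U) -&gt; V&gt;(
        _x: Self::Output&lt;T&gt;,
        _y: Self::Output&lt;U&gt;,
        _f: F,
    ) -&gt; Self::Output&lt;V&gt; {
    }
}

// Hypothetical caller, generic over the mode. The type arguments are
// spelled out explicitly so nothing relies on inference.
fn pair&lt;M: Mode&gt;(a: M::Output&lt;u32&gt;, b: M::Output&lt;char&gt;) -&gt; M::Output&lt;(u32, char)&gt; {
    M::combine::&lt;u32, char, (u32, char), _&gt;(a, b, |a, b| (a, b))
}

fn main() {
    let (n, c) = pair::&lt;Emit&gt;(7, 'x');
    println!("emit: {n} {c}");
    pair::&lt;Check&gt;((), ());
    println!("check: ok");
}
</code></pre></div>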
<h3 id="other-patterns">Other patterns</h3>
<p>In this post, I went through the chumsky pattern in detail. I&rsquo;ve not had time to dive quite as deep into other examples, but I&rsquo;ve been reading through them and trying to extract out patterns. Here are a few patterns I extracted so far:</p>
<ul>
<li>The &ldquo;generic scopes&rdquo; pattern (<a href="https://github.com/rust-lang/rust/pull/96709#issuecomment-1120354039">smithay</a>, <a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2021&amp;gist=a23b6a846aa1a506c199f7792e1abd3e">playground</a>):
<ul>
<li>In the Smithay API, if you have some variable <code>r: R</code> where <code>R: Renderer</code>, you can invoke <code>r.render(|my_frame| ...)</code>. This will invoke your callback with some frame <code>my_frame</code> that you can then modify. The thing is that the type of <code>my_frame</code> depends on the type of renderer that you have; moreover, frames often include thread-local data and so should only be accessible during that callback.</li>
<li>I called this the &ldquo;generic scopes&rdquo; pattern because, at least from a types POV, it is kind of a generic version of APIs like <a href="https://doc.rust-lang.org/std/thread/fn.scope.html"><code>std::thread::scope</code></a>. The <code>scope</code> function also uses a callback to give limited access to a variable (the &ldquo;thread scope&rdquo;), but in the case of <code>std::thread::scope</code>, the type of that scope is hard-coded to be <a href="https://doc.rust-lang.org/std/thread/struct.Scope.html"><code>std::thread::Scope</code></a>, whereas here, we want the specific type to depend on the renderer.</li>
<li>Thanks to GATs, you can express that pretty cleanly, so that the only bound you need is <code>R: Renderer</code>. As with &ldquo;many modes&rdquo;, if you tried to express it using features today, you can get part of the way there, but the bounds will be complex and involve HRTB.</li>
</ul>
</li>
<li>The &ldquo;pointer types&rdquo; pattern:
<ul>
<li>I didn&rsquo;t dig deep enough into Pythonesque&rsquo;s hypotheticals, but <a href="https://github.com/rust-lang/rust/pull/96709#issuecomment-1150127168">this comment</a> seemed to be describing a desire to talk about &ldquo;pointer types&rdquo; in the abstract, which is definitely a common need; looking at <a href="https://github.com/amethyst/specs/blob/master/src/storage/generic.rs#L114-L150">the commits from Veloren</a> that Pythonesque also cited, this might be a kind of &ldquo;pointer types&rdquo; pattern, but I think I might also call it &ldquo;many modes&rdquo;.</li>
</ul>
</li>
<li>The &ldquo;iterable&rdquo; pattern:
<ul>
<li>In this pattern, you would like a way to say <code>where C: Iterable</code>, meaning that <code>C</code> is a collection with an <code>iter</code> method which fits the signature <code>fn iter(&amp;self) -&gt; impl Iterator&lt;Item = &amp;T&gt;</code>. This is distinct from <code>IntoIterator</code> because it takes <code>&amp;self</code> and thus we can iterate over the same collection many times and concurrently.</li>
<li>The most common workaround is to return a <code>Box&lt;dyn&gt;</code> (as in <a href="https://github.com/Emoun/graphene/issues/7">graphene</a>) or a collection (<a href="https://github.com/metamolecular/gamma/issues/8">as in metamolecular</a>). Neither is zero-cost, which <a href="https://github.com/rust-lang/rust/pull/96709#issuecomment-1120175346">can be a problem in tight loops, as commented here</a>. You can also use HRTB (as <a href="https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/graph/trait.WithSuccessors.html">rustc does</a>), which is complex and leaky.</li>
</ul>
</li>
</ul>
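<p>As an illustration of the last of these, the &ldquo;iterable&rdquo; pattern can be sketched with GATs roughly as follows. This is one possible shape, not code taken from any of the linked projects: the trait layout, the <code>Bag</code> collection, and the <code>count_pairs</code> function are all hypothetical names chosen for the example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical `Iterable` trait: a collection that can be iterated by
// reference, as many times as you like.
trait Iterable {
    /// The item type, borrowed from the collection for `'a`.
    type Item&lt;'a&gt;
    where
        Self: 'a;

    /// The concrete iterator type; no `Box&lt;dyn Iterator&gt;` needed.
    type Iter&lt;'a&gt;: Iterator&lt;Item = Self::Item&lt;'a&gt;&gt;
    where
        Self: 'a;

    fn iter&lt;'a&gt;(&amp;'a self) -&gt; Self::Iter&lt;'a&gt;;
}

/// Hypothetical collection used only for this example.
struct Bag&lt;T&gt;(Vec&lt;T&gt;);

impl&lt;T&gt; Iterable for Bag&lt;T&gt; {
    type Item&lt;'a&gt; = &amp;'a T where Self: 'a;
    type Iter&lt;'a&gt; = std::slice::Iter&lt;'a, T&gt; where Self: 'a;

    fn iter&lt;'a&gt;(&amp;'a self) -&gt; Self::Iter&lt;'a&gt; {
        // `self.0` is a `Vec&lt;T&gt;`, so this is the ordinary slice iterator.
        self.0.iter()
    }
}

/// The only bound is `C: Iterable`. Because `iter` takes `&amp;self`, the
/// same collection can be iterated twice at once, here in nested loops.
fn count_pairs&lt;C: Iterable&gt;(c: &amp;C) -&gt; usize {
    let mut n = 0;
    for _a in c.iter() {
        for _b in c.iter() {
            n += 1;
        }
    }
    n
}

fn main() {
    let bag = Bag(vec![1, 2, 3]);
    println!("{}", count_pairs(&amp;bag));
}
</code></pre></div>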
<h3 id="did-i-miss-something">Did I miss something?</h3>
<p>Maybe you see a way to express the &ldquo;many modes&rdquo; pattern (or one of the other patterns I cited) in Rust today that works well? Let me know by commenting on the thread.</p>
<p>(Since posting this, it occurs to me that one could probably use procedural macros to achieve some similar goals, though I think this approach would also have significant downsides.)</p>
]]></content></entry><entry><title type="html">What it feels like when Rust saves your bacon</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/06/15/what-it-feels-like-when-rust-saves-your-bacon/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/06/15/what-it-feels-like-when-rust-saves-your-bacon/</id><published>2022-06-15T00:00:00+00:00</published><updated>2022-06-15T19:34:00-04:00</updated><content type="html"><![CDATA[<p>You&rsquo;ve probably heard that the Rust type checker can be a great &ldquo;co-pilot&rdquo;, helping you to avoid subtle bugs that would have been a royal pain in the !@#!$! to debug. This is truly awesome! But what you may not realize is how it feels <em>in the moment</em> when this happens. The answer typically is: <strong>really, really frustrating!</strong> Usually, you are trying to get some code to compile and you find you just can&rsquo;t do it.</p>
<p>As you come to learn Rust better, and especially to gain a bit of a deeper understanding of what is happening when your code runs, you can start to see when you are getting a type-check error because you have a typo versus because you are trying to do something fundamentally flawed.</p>
<p>A couple of days back, I had a moment where the compiler caught a really subtle bug that would&rsquo;ve been horrible had it been allowed to compile. I thought it would be fun to narrate a bit how it played out, and also take the moment to explain a bit more about temporaries in Rust (a common source of confusion, in my observations).</p>
<h2 id="code-available-in-this-repository">Code available in this repository</h2>
<p>All the code for this blog post is available in a <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/">github repository</a>.</p>
<h2 id="setting-the-scene-lowering-the-ast">Setting the scene: lowering the AST</h2>
<p>In the compiler, we first represent Rust programs using an <a href="https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/index.html">Abstract Syntax Tree (AST)</a>. I&rsquo;ve prepared a <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/a.rs">standalone example</a> that shows roughly how the code looks today (of course the real thing is a lot more complex). The AST in particular is found in the <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/a.rs#L4">ast module</a> containing various data structures that map closely to Rust syntax. So for example we have a <code>Ty</code> type that represents Rust types:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Ty</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">ImplTrait</span><span class="p">(</span><span class="n">TraitRef</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">NamedType</span><span class="p">(</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Ty</span><span class="o">&gt;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Lifetime</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>impl Trait</code> notation references a <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/a.rs#L12-L15"><code>TraitRef</code></a>, which stores the <code>Trait</code> part of things:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">TraitRef</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">trait_name</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">parameters</span>: <span class="nc">Parameters</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Parameters</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">AngleBracket</span><span class="p">(</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Parameter</span><span class="o">&gt;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Parenthesized</span><span class="p">(</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Ty</span><span class="o">&gt;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Parameter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Ty</span><span class="p">(</span><span class="n">Ty</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Lifetime</span><span class="p">(</span><span class="n">Lifetime</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Note that the parameters of the trait come in two varieties, angle-bracket (e.g., <code>impl PartialEq&lt;T&gt;</code> or <code>impl MyTrait&lt;'a, U&gt;</code>) and parenthesized (e.g., <code>impl FnOnce(String, u32)</code>). These two are slightly different &ndash; parenthesized parameters, for example, only accept types, whereas angle-bracket accept types or lifetimes.</p>
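<p>To make the two varieties concrete, here is a minimal sketch that builds one <code>TraitRef</code> of each kind by hand. The types mirror the simplified ones above; the <code>Debug</code>/<code>Clone</code> derives, the <code>name</code> field on <code>Lifetime</code>, and the helpers <code>simple</code>, <code>angle_bracket_example</code>, and <code>parenthesized_example</code> are assumptions added for illustration and are not part of the linked example.</p>

```rust
// Types modeled on the post's simplified `ast` module. The derives are an
// assumption added here so the values can be printed.
#[derive(Debug, Clone)]
pub enum Ty {
    ImplTrait(TraitRef),
    NamedType(String, Vec<Ty>),
}

#[derive(Debug, Clone)]
pub struct Lifetime {
    // Hypothetical field; the post elides the contents of `Lifetime`.
    pub name: String,
}

#[derive(Debug, Clone)]
pub struct TraitRef {
    pub trait_name: String,
    pub parameters: Parameters,
}

#[derive(Debug, Clone)]
pub enum Parameters {
    AngleBracket(Vec<Parameter>),
    Parenthesized(Vec<Ty>),
}

#[derive(Debug, Clone)]
pub enum Parameter {
    Ty(Ty),
    Lifetime(Lifetime),
}

// Hypothetical helper: a named type with no parameters, such as `u32`.
fn simple(name: &str) -> Ty {
    Ty::NamedType(name.to_string(), Vec::new())
}

// `impl MyTrait<'a, U>`: angle brackets may mix lifetimes and types.
fn angle_bracket_example() -> TraitRef {
    TraitRef {
        trait_name: "MyTrait".to_string(),
        parameters: Parameters::AngleBracket(vec![
            Parameter::Lifetime(Lifetime { name: "'a".to_string() }),
            Parameter::Ty(simple("U")),
        ]),
    }
}

// `impl FnOnce(String, u32)`: parentheses hold types only.
fn parenthesized_example() -> TraitRef {
    TraitRef {
        trait_name: "FnOnce".to_string(),
        parameters: Parameters::Parenthesized(vec![simple("String"), simple("u32")]),
    }
}

fn main() {
    // Wrap each trait ref in `Ty::ImplTrait`, as the parser would.
    println!("{:?}", Ty::ImplTrait(angle_bracket_example()));
    println!("{:?}", Ty::ImplTrait(parenthesized_example()));
}
```

<p>Because <code>Parenthesized</code> holds a <code>Vec&lt;Ty&gt;</code> rather than a <code>Vec&lt;Parameter&gt;</code>, the type system itself rules out a lifetime inside parentheses.</p>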
<p>After parsing, this AST gets translated to something called <a href="https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/hir/index.html">High-level Intermediate Representation (HIR)</a> through a process called <em>lowering</em>. The snippet doesn&rsquo;t include the HIR, but it includes a number of methods like <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/8232a6ee30e92faa2117dd23ee28d5c145509d92/examples/a.rs#L98-L116"><code>lower_ty</code></a> that take as input an AST type and produce the HIR type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">lower_ty</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">ty</span>: <span class="kp">&amp;</span><span class="nc">ast</span>::<span class="n">Ty</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">hir</span>::<span class="n">Ty</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="n">ty</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// ... lots of stuff here
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// A type like `impl Trait`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ast</span>::<span class="n">Ty</span>::<span class="n">ImplTrait</span><span class="p">(</span><span class="n">trait_ref</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">do_something_with</span><span class="p">(</span><span class="n">trait_ref</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// A type like `Vec&lt;T&gt;`, where `Vec` is the name and
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// `[T]` are the `parameters`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ast</span>::<span class="n">Ty</span>::<span class="n">NamedType</span><span class="p">(</span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">parameters</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">for</span><span class="w"> </span><span class="n">parameter</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">parameters</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="bp">self</span><span class="p">.</span><span class="n">lower_ty</span><span class="p">(</span><span class="n">parameter</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Each method is defined on this <code>Context</code> type that carries some common state, and the methods tend to call one another. For example, <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/a.rs#L57-L65"><code>lower_signature</code></a> invokes <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/8232a6ee30e92faa2117dd23ee28d5c145509d92/examples/a.rs#L98-L116"><code>lower_ty</code></a> on all of the input (argument) types and on the output (return) type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">lower_signature</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">sig</span>: <span class="kp">&amp;</span><span class="nc">ast</span>::<span class="n">Signature</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">hir</span>::<span class="n">Signature</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="n">input</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="n">sig</span><span class="p">.</span><span class="n">inputs</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">lower_ty</span><span class="p">(</span><span class="n">input</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">lower_ty</span><span class="p">(</span><span class="o">&amp;</span><span class="n">sig</span><span class="p">.</span><span class="n">output</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="our-story-begins">Our story begins</h2>
<p><a href="https://github.com/spastorino/">Santiago Pastorino</a> is working on a refactoring to make it easier to support returning <code>impl Trait</code> values from trait functions. As part of that, he needs to collect all the <code>impl Trait</code> types that appear in the function arguments. The challenge is that these types can appear anywhere, and not just at the top level. In other words, you might have <code>fn foo(x: impl Debug)</code>, but you might also have <code>fn foo(x: Box&lt;(impl Debug, impl Debug)&gt;)</code>. Therefore, we decided it would make sense to add a vector to <code>Context</code> and have <code>lower_ty</code> collect the <code>impl Trait</code> types into it. That way, we can find the complete set.</p>
<p>To do this, we started by adding the vector into this <code>Context</code>. We&rsquo;ll store the <code>TraitRef</code> from each <code>impl Trait</code> type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">saved_impl_trait_types</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;ast</span><span class="w"> </span><span class="n">ast</span>::<span class="n">TraitRef</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>To do this, we had to add a new lifetime parameter, <code>'ast</code>, which is meant to represent the lifetime of the AST structure itself. In other words, <code>saved_impl_trait_types</code> stores references into the AST. Of course, once we did this, the compiler got upset and we had to go modify the <code>impl</code> block that references <code>Context</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we can modify the <code>lower_ty</code> to push the trait ref into the vector:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">lower_ty</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">ty</span>: <span class="kp">&amp;</span><span class="nc">ast</span>::<span class="n">Ty</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="n">ty</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ast</span>::<span class="n">Ty</span>::<span class="n">ImplTrait</span><span class="p">(</span><span class="n">trait_ref</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// 👇 push the trait ref into the vector 👇
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="bp">self</span><span class="p">.</span><span class="n">saved_impl_trait_types</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">trait_ref</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">do_something</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ast</span>::<span class="n">Ty</span>::<span class="n">NamedType</span><span class="p">(</span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">parameters</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// just like before
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>At this point, the compiler gives us an error:</p>
<pre tabindex="0"><code>error[E0621]: explicit lifetime required in the type of `ty`
   --&gt; examples/b.rs:125:42
    |
119 |     fn lower_ty(&amp;mut self, ty: &amp;ast::Ty) -&gt; hir::Ty {
    |                                -------- help: add explicit lifetime `&#39;ast` to the type of `ty`: `&amp;&#39;ast ast::Ty`
...
125 |                 self.impl_trait_tys.push(trait_ref);
    |                                          ^^^^^^^^^ lifetime `&#39;ast` required
</code></pre><p>Pretty nice error, actually! It&rsquo;s pointing out that we are pushing into this vector which needs references into &ldquo;the AST&rdquo;, but we haven&rsquo;t declared in our signature that the <code>ast::Ty</code> must actually come from &ldquo;the AST&rdquo;. OK, let&rsquo;s fix this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">lower_ty</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">ty</span>: <span class="kp">&amp;</span><span class="na">&#39;ast</span> <span class="nc">ast</span>::<span class="n">Ty</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// had to add &#39;ast here 👆, just like the error message said
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="propagating-lifetimes-everywhere">Propagating lifetimes everywhere</h2>
<p>Of course, now we start getting errors in the functions that <em>call</em> <code>lower_ty</code>. For example, <code>lower_signature</code> says:</p>
<pre tabindex="0"><code>error[E0621]: explicit lifetime required in the type of `sig`
  --&gt; examples/b.rs:71:18
   |
65 |     fn lower_signature(&amp;mut self, sig: &amp;ast::Signature) -&gt; hir::Signature {
   |                                        --------------- help: add explicit lifetime `&#39;ast` to the type of `sig`: `&amp;&#39;ast ast::Signature`
...
71 |             self.lower_ty(input);
   |                  ^^^^^^^^ lifetime `&#39;ast` required
</code></pre><p>The fix is the same. We tell the compiler that the <code>ast::Signature</code> is part of &ldquo;the AST&rdquo;, and that implies that the <code>ast::Ty</code> values owned by the <code>ast::Signature</code> are also part of &ldquo;the AST&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">lower_signature</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">sig</span>: <span class="kp">&amp;</span><span class="na">&#39;ast</span> <span class="nc">ast</span>::<span class="n">Signature</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">hir</span>::<span class="n">Signature</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//        had to add &#39;ast here 👆, just like the error message said
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Great. This continues for a bit. But then&hellip; we hit this error:</p>
<pre tabindex="0"><code>error[E0597]: `parameters` does not live long enough
  --&gt; examples/b.rs:92:53
   |
58 | impl&lt;&#39;ast&gt; Context&lt;&#39;ast&gt; {
   |      ---- lifetime `&#39;ast` defined here
...
92 |                 self.lower_angle_bracket_parameters(&amp;parameters);
   |                 ------------------------------------^^^^^^^^^^^-
   |                 |                                   |
   |                 |                                   borrowed value does not live long enough
   |                 argument requires that `parameters` is borrowed for `&#39;ast`
93 |             }
   |             - `parameters` dropped here while still borrowed
</code></pre><p>What&rsquo;s this about?</p>
<h2 id="uh-oh">Uh oh&hellip;</h2>
<p>Jumping to that line, we see this function <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/b.rs#L85-L97"><code>lower_trait_ref</code></a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">lower_trait_ref</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">trait_ref</span>: <span class="kp">&amp;</span><span class="na">&#39;ast</span> <span class="nc">ast</span>::<span class="n">TraitRef</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">hir</span>::<span class="n">TraitRef</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="o">&amp;</span><span class="n">trait_ref</span><span class="p">.</span><span class="n">parameters</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ast</span>::<span class="n">Parameters</span>::<span class="n">AngleBracket</span><span class="p">(</span><span class="n">parameters</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="bp">self</span><span class="p">.</span><span class="n">lower_angle_bracket_parameters</span><span class="p">(</span><span class="o">&amp;</span><span class="n">parameters</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ast</span>::<span class="n">Parameters</span>::<span class="n">Parenthesized</span><span class="p">(</span><span class="n">types</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="kd">let</span><span class="w"> </span><span class="n">parameters</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">types</span><span class="p">.</span><span class="n">iter</span><span class="p">().</span><span class="n">cloned</span><span class="p">().</span><span class="n">map</span><span class="p">(</span><span class="n">ast</span>::<span class="n">Parameter</span>::<span class="n">Ty</span><span class="p">).</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="bp">self</span><span class="p">.</span><span class="n">lower_angle_bracket_parameters</span><span class="p">(</span><span class="o">&amp;</span><span class="n">parameters</span><span class="p">);</span><span class="w"> </span><span class="c1">// 👈 error is on this line
</span></span></span><span class="line"><span class="cl"><span class="w">                
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">hir</span>::<span class="n">TraitRef</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So what&rsquo;s this about? Well, the <em>purpose</em> of this code is a bit clever. As we saw before, Rust has two syntaxes for trait-refs: you can use parentheses like <code>FnOnce(u32)</code>, in which case you only have types, or you can use angle brackets like <code>Foo&lt;'a, u32&gt;</code>, in which case you could have either lifetimes <em>or</em> types. So this code is normalizing to the angle-bracket notation, which is more general, and then using the same lowering helper function.</p>
<h2 id="wait-right-there-that-was-the-moment">Wait! Right there! That was the moment!</h2>
<p>What?</p>
<h2 id="that-was-the-moment-that-rust-saved-you-a-world-of-pain">That was the moment that Rust saved you a world of pain!</h2>
<p>It was? It just kind of seemed like an annoying, and I will say, kind of confusing compilation error. What the heck is going on? The problem here is that <code>parameters</code> is a local variable. It is going to be freed at the end of that match arm, before <code>lower_trait_ref</code> even returns. But it could happen that <code>lower_trait_ref</code> calls <code>lower_ty</code>, which takes a reference to the type and stores it into the <code>saved_impl_trait_types</code> vector. Then, later, some code would try to use that reference, and access freed memory. That would sometimes work, but often not &ndash; and if you forgot to test with parenthesized trait refs, the code would work fine forever, so you&rsquo;d never even notice.</p>
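<p>The same hazard can be reproduced in a few lines using only the standard library. This is a stripped-down sketch, not the code from the linked example: <code>Collector</code>, <code>save</code>, and <code>visit</code> are hypothetical stand-ins for <code>Context</code>, <code>lower_ty</code>, and <code>lower_trait_ref</code>, and plain <code>String</code> values stand in for AST nodes.</p>

```rust
// Hypothetical stand-in for `Context`: it stores references that must stay
// valid for `'ast`.
struct Collector<'ast> {
    saved: Vec<&'ast String>,
}

impl<'ast> Collector<'ast> {
    // Like `lower_ty`, this keeps the reference around after returning.
    fn save(&mut self, item: &'ast String) {
        self.saved.push(item);
    }

    // Like `lower_trait_ref`, this walks data that is known to live for `'ast`.
    fn visit(&mut self, items: &'ast [String]) {
        for item in items {
            self.save(item);
        }

        // A temporary built here plays the role of the `parameters` vector:
        //
        //     let temp = vec![String::from("temporary")];
        //     self.save(&temp[0]);
        //
        // Uncommenting those two lines is rejected by the borrow checker
        // (`temp` does not live long enough), because `temp` is dropped when
        // `visit` returns while `saved` lives on.
    }
}

fn main() {
    // The owner is declared before the collector, so it outlives it.
    let ast = vec![String::from("Debug"), String::from("Display")];
    let mut collector = Collector { saved: Vec::new() };
    collector.visit(&ast);
    println!("{:?}", collector.saved);
}
```

<p>In a language without this check, the commented-out lines would compile and <code>saved</code> would end up holding a pointer to freed memory.</p>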
<h2 id="how-to-fix-it">How to fix it</h2>
<p>Maybe you&rsquo;re wondering: great, Rust saved me a world of pain, but how do I fix it? Do I just have to copy the <code>lower_angle_bracket_parameters</code> and have two copies? &lsquo;Cause that&rsquo;s kind of unfortunate.</p>
<p>Well, there are a variety of ways you <em>might</em> fix it. One of them is to use an <em>arena</em>, like the <a href="https://crates.io/crates/typed-arena"><code>typed-arena</code></a> crate. An arena is a memory pool. Instead of storing the temporary <code>Vec&lt;Parameter&gt;</code> vector on the stack, we&rsquo;ll put it in an arena, and that way it will live for the entire time that we are lowering things. <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/c.rs">Example C</a> in the repo takes this approach. It starts by adding the <code>arena</code> field to the <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/c.rs#L54-L60"><code>Context</code></a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">impl_trait_tys</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;ast</span><span class="w"> </span><span class="n">ast</span>::<span class="n">TraitRef</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Holds temporary AST nodes that we create during lowering;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// this can be dropped once lowering is complete.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">arena</span>: <span class="kp">&amp;</span><span class="na">&#39;ast</span> <span class="nc">typed_arena</span>::<span class="n">Arena</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">ast</span>::<span class="n">Parameter</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This actually makes a subtle change to the meaning of <code>'ast</code>. It used to be that the only things with <code>'ast</code> lifetime were &ldquo;the AST&rdquo; itself, so having that lifetime implied being a part of the AST. But now that same lifetime is being used to tag the arena, too, so if we have <code>&amp;'ast Foo</code> it means the data is owned by <em>either</em> the arena or the AST itself.</p>
<p><strong>Side note:</strong> despite the name lifetimes, which I now rather regret, more and more I tend to think of <em>lifetimes</em> like <code>'ast</code> in terms of &ldquo;who owns the data&rdquo;, which you can see in my description in the previous paragraph. You could instead think of <code>'ast</code> as a span of time (a &ldquo;lifetime&rdquo;), in which case it refers to the time that the <code>Context</code> type is valid, really, which must be a subset of the time that the arena is valid and the time that the AST itself is valid, since <code>Context</code> stores references to data owned by both of those.</p>
<p>Now we can rewrite <code>lower_trait_ref</code> to call <code>self.arena.alloc()</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">lower_trait_ref</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">trait_ref</span>: <span class="kp">&amp;</span><span class="na">&#39;ast</span> <span class="nc">ast</span>::<span class="n">TraitRef</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">hir</span>::<span class="n">TraitRef</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="o">&amp;</span><span class="n">trait_ref</span><span class="p">.</span><span class="n">parameters</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">ast</span>::<span class="n">Parameters</span>::<span class="n">Parenthesized</span><span class="p">(</span><span class="n">types</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="kd">let</span><span class="w"> </span><span class="n">parameters</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">types</span><span class="p">.</span><span class="n">iter</span><span class="p">().</span><span class="n">cloned</span><span class="p">().</span><span class="n">map</span><span class="p">(</span><span class="n">ast</span>::<span class="n">Parameter</span>::<span class="n">Ty</span><span class="p">).</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="kd">let</span><span class="w"> </span><span class="n">parameters</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">arena</span><span class="p">.</span><span class="n">alloc</span><span class="p">(</span><span class="n">parameters</span><span class="p">);</span><span class="w"> </span><span class="c1">// 👈 added this line!
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="bp">self</span><span class="p">.</span><span class="n">lower_angle_bracket_parameters</span><span class="p">(</span><span class="n">parameters</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now the <code>parameters</code> variable is not stored on the stack but allocated in the arena; the arena has <code>'ast</code> lifetime, so that&rsquo;s fine, and everything works!</p>
<h2 id="calling-the-lowering-code-and-creating-the-context">Calling the lowering code and creating the context</h2>
<p>Now that we have added the arena, creating the context will look a bit different. It&rsquo;ll look something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">arena</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">TypedArena</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">context</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Context</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">arena</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">hir_signature</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">context</span><span class="p">.</span><span class="n">lower_signature</span><span class="p">(</span><span class="o">&amp;</span><span class="n">signature</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>The nice thing about this is that, once we are done with lowering, the <code>context</code> and the <code>arena</code> can be dropped, and all those temporary nodes are freed along with the arena.</p>
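<p>To make the borrowing pattern concrete without pulling in a dependency, here is a minimal, self-contained sketch. It rests on several assumptions: <code>SlotArena</code> is a hypothetical fixed-capacity stand-in for <code>typed_arena::Arena</code> written in safe code, and <code>Parameter</code>, <code>saved</code>, <code>lower_parenthesized</code>, and <code>count_lowered</code> are simplified placeholders rather than the code from the repo.</p>

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical fixed-capacity stand-in for `typed_arena::Arena` (safe code only).
struct SlotArena<T> {
    slots: Vec<OnceCell<T>>,
    next: Cell<usize>,
}

impl<T> SlotArena<T> {
    fn with_capacity(capacity: usize) -> Self {
        SlotArena {
            slots: (0..capacity).map(|_| OnceCell::new()).collect(),
            next: Cell::new(0),
        }
    }

    /// Stores `value` in the next free slot and returns a reference to it.
    /// The reference stays valid for as long as the arena is borrowed.
    fn alloc(&self, value: T) -> &T {
        let index = self.next.get();
        let slot = self.slots.get(index).expect("arena capacity exceeded");
        self.next.set(index + 1);
        slot.get_or_init(|| value)
    }
}

#[derive(Clone, Debug, PartialEq)]
enum Parameter {
    Ty(String),
}

struct Context<'ast> {
    // Simplified placeholder for the saved references.
    saved: Vec<&'ast Parameter>,
    // Borrowed, not owned: the caller creates the arena.
    arena: &'ast SlotArena<Vec<Parameter>>,
}

impl<'ast> Context<'ast> {
    fn lower_parenthesized(&mut self, types: &[String]) {
        let parameters: Vec<Parameter> = types.iter().cloned().map(Parameter::Ty).collect();
        // Moving the vector into the arena gives it the `'ast` lifetime.
        let parameters: &'ast Vec<Parameter> = self.arena.alloc(parameters);
        self.lower_angle_bracket_parameters(parameters);
    }

    fn lower_angle_bracket_parameters(&mut self, parameters: &'ast [Parameter]) {
        // Storing the references is fine because the arena outlives `self`.
        self.saved.extend(parameters.iter());
    }
}

/// Example driver: the caller owns the arena, the context only borrows it.
fn count_lowered(types: &[String]) -> usize {
    let arena = SlotArena::with_capacity(1);
    let mut context = Context { saved: Vec::new(), arena: &arena };
    context.lower_parenthesized(types);
    context.saved.len()
}
```

<p>The design choice to notice is that <code>alloc</code> takes <code>&amp;self</code>, so allocating does not conflict with the later <code>&amp;mut self</code> call on the context.</p>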
<h2 id="another-way-to-fix-it">Another way to fix it</h2>
<p>The other obvious option is to avoid lifetimes altogether and just &ldquo;clone all the things&rdquo;. Given that the AST is immutable once constructed, you can just clone them into the vector:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">impl_trait_tys</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">ast</span>::<span class="n">TraitRef</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="c1">// just clone it!
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If that clone is too expensive (possible), then use <code>Rc&lt;ast::TraitRef&gt;</code> or <code>Arc&lt;ast::TraitRef&gt;</code> (this will require deep-ish changes to the AST to put all the things into <code>Rc</code> or <code>Arc</code> that might need to be individually referenced). At this point you&rsquo;ve got something that feels a lot like garbage collection (if less ergonomic).</p>
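<p>As a rough illustration of the <code>Rc</code> variant, here is a small sketch. The <code>TraitRef</code> struct with a single <code>name</code> field and the <code>save</code> method are assumptions made for the example, not the real AST:</p>

```rust
use std::rc::Rc;

// Hypothetical, heavily simplified AST node.
#[derive(Debug)]
struct TraitRef {
    name: String,
}

// No lifetime parameter: the context shares ownership of the nodes it saves.
struct Context {
    impl_trait_tys: Vec<Rc<TraitRef>>,
}

impl Context {
    fn save(&mut self, trait_ref: &Rc<TraitRef>) {
        // Bumps the reference count; the `TraitRef` itself is not copied.
        self.impl_trait_tys.push(Rc::clone(trait_ref));
    }
}
```

<p>Because the context holds its own <code>Rc</code>, a saved node stays alive even after the code that created it has dropped its handle.</p>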
<h2 id="yet-another-way">Yet another way</h2>
<p>The way I tend to write compilers these days is to use the &ldquo;indices as pointers&rdquo; approach. Here, all the data in the AST is stored in vectors, and references between things use indices, kind of like <a href="http://smallcultfollowing.com/babysteps/blog/2015/04/06/modeling-graphs-in-rust-using-vector-indices/">I described here</a>.</p>
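<p>A minimal sketch of that idea might look like the following. All of the names here (<code>TyId</code>, <code>Ty</code>, <code>Ast</code>, and this lifetime-free <code>Context</code>) are made up for illustration:</p>

```rust
// An index into `Ast::tys`, used where a reference would otherwise go.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TyId(usize);

#[derive(Debug)]
enum Ty {
    Named(String),
    ImplTrait(String),
    Ref(TyId), // refers to another type by index, not by pointer
}

#[derive(Default)]
struct Ast {
    tys: Vec<Ty>,
}

impl Ast {
    /// Appends a node and returns its index.
    fn add(&mut self, ty: Ty) -> TyId {
        self.tys.push(ty);
        TyId(self.tys.len() - 1)
    }

    fn get(&self, id: TyId) -> &Ty {
        &self.tys[id.0]
    }
}

// Indices are plain `Copy` data, so no lifetime parameter is needed.
#[derive(Default)]
struct Context {
    impl_trait_tys: Vec<TyId>,
}
```

<p>Temporary nodes created during lowering can be pushed into the same vector, and saving a <code>TyId</code> never borrows the AST. The trade-off is that a stale or wrong index is only caught at runtime, as a panic or a lookup of the wrong node.</p>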
<h2 id="conclusion">Conclusion</h2>
<p>Compilation errors are pretty frustrating, but they may also be a sign that the compiler is protecting us from ourselves. In this case, when we embarked on this refactoring, I was totally sure it was going to work fine, because I didn&rsquo;t realize we ever created &ldquo;temporary AST&rdquo; nodes, so I assumed that all the data was owned by the original AST. In a language like C or C++, it would have been <em>very</em> easy to have a bug here, and it would have been a horrible pain to find. With Rust, that&rsquo;s not a problem.</p>
<p>Of course, not everything is great. For me, doing these kinds of lifetime transformations is old hat. But for many people it&rsquo;s pretty non-obvious how to start when the compiler is giving you error messages. When people come to me for help, the first thing I try to do is to suss out: what are the ownership relationships, and where do we expect these references to be coming from? There are also various heuristics that I use to decide: do we need a new lifetime parameter? Can we re-use an existing one? I&rsquo;ll try to write up more stories like this to clarify that side of things. Honestly, my main point here was that I was just so grateful that Rust prevented us from spending hours and hours debugging a subtle crash!</p>
<p>Looking forward a bit, I see a lot of potential to improve things about our notation and terminology. I think we should be able to make cases like this one much slicker, hopefully without requiring named lifetime parameters and so forth, or as many edits. But I admit I don&rsquo;t yet know how to do it! :) My plan for now is to keep an eye out for the tricks I am using and the kinds of analysis I am doing in my head and write out blog posts like this one to capture those narratives. I encourage those of you who know Rust well (or who don&rsquo;t!) to do the same.</p>
<h2 id="appendix-why-not-have-context-own-the-typedarena">Appendix: why not have <code>Context</code> <em>own</em> the <code>TypedArena</code>?</h2>
<p>You may have noticed that using the arena had a kind of annoying consequence: people who called <code>Context::new</code> now had to create and supply an arena:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">arena</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">TypedArena</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">context</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Context</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">arena</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">hir_signature</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">context</span><span class="p">.</span><span class="n">lower_signature</span><span class="p">(</span><span class="o">&amp;</span><span class="n">signature</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>This is because <code>Context&lt;'ast&gt;</code> stores a <code>&amp;'ast TypedArena&lt;_&gt;</code>, and so the caller must create the arena. If we modified <code>Context</code> to <em>own</em> the arena, then the API could be better. So why didn&rsquo;t I do that? To see why, check out <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/c.rs">example D</a> (which doesn&rsquo;t build). In that example, the <code>Context</code> looks like&hellip;</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">impl_trait_tys</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;ast</span><span class="w"> </span><span class="n">ast</span>::<span class="n">TraitRef</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Holds temporary AST nodes that we create during lowering;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// this can be dropped once lowering is complete.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">arena</span>: <span class="nc">typed_arena</span>::<span class="n">Arena</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">ast</span>::<span class="n">Parameter</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You then have to change the signatures of each function to take an <code>&amp;'ast mut self</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="na">&#39;ast</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">lower_signature</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;ast</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">sig</span>: <span class="kp">&amp;</span><span class="na">&#39;ast</span> <span class="nc">ast</span>::<span class="n">Signature</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">hir</span>::<span class="n">Signature</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is saying: the <code>'ast</code> parameter might refer to data owned by <code>self</code>, or maybe by <code>sig</code>. Seems sensible, but if you try to build <a href="https://github.com/nikomatsakis/2022-06-15-blogpost/blob/f280f91e9be03d37f273acf13502ef7dc1015db8/examples/c.rs">Example D</a>, you get lots of errors. Here is one of the most interesting to me:</p>
<pre tabindex="0"><code>error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable
  --&gt; examples/d.rs:98:17
   |
62 | impl&lt;&#39;ast&gt; Context&lt;&#39;ast&gt; {
   |      ---- lifetime `&#39;ast` defined here
...
97 |                 let parameters = self.arena.alloc(parameters);
   |                                  ----------------------------
   |                                  |
   |                                  immutable borrow occurs here
   |                                  argument requires that `self.arena` is borrowed for `&#39;ast`
98 |                 self.lower_angle_bracket_parameters(parameters);
   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
</code></pre><p>What is this all about? This is actually pretty subtle! This is saying that <code>parameters</code> was allocated from <code>self.arena</code>. That means that <code>parameters</code> will be valid <strong>as long as <code>self.arena</code> is valid</strong>.</p>
<p>But <code>self</code> is an <code>&amp;mut Context</code>, which means it can mutate any of the fields of the <code>Context</code>. When we call <code>self.lower_angle_bracket_parameters()</code>, it&rsquo;s entirely possible that <code>lower_angle_bracket_parameters</code> could mutate the arena:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">lower_angle_bracket_parameters</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;ast</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">parameters</span>: <span class="kp">&amp;</span><span class="na">&#39;ast</span> <span class="p">[</span><span class="n">ast</span>::<span class="n">Parameter</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">arena</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">TypedArena</span>::<span class="n">new</span><span class="p">();</span><span class="w"> </span><span class="c1">// what if we did this?
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Of course, the code doesn&rsquo;t do that now, but what if it did? The answer is that the parameters would be freed, because the arena that owns them is freed, and so we&rsquo;d have a dangling reference. D&rsquo;oh!</p>
<p>All things considered, I&rsquo;d like to make it possible for <code>Context</code> to own the arena, but right now it&rsquo;s pretty challenging. This is a good example of code patterns we could enable, but it&rsquo;ll require language extensions.</p>
]]></content></entry><entry><title type="html">Async cancellation: a case study of pub-sub in mini-redis</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/06/13/async-cancellation-a-case-study-of-pub-sub-in-mini-redis/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/06/13/async-cancellation-a-case-study-of-pub-sub-in-mini-redis/</id><published>2022-06-13T00:00:00+00:00</published><updated>2022-06-13T15:15:00-04:00</updated><content type="html"><![CDATA[<p>Lately I’ve been diving deep into tokio’s <a href="https://github.com/tokio-rs/mini-redis">mini-redis</a> example. The mini-redis example is a great one to look at because it&rsquo;s a realistic piece of quality async Rust code that is both self-contained and very well documented. Digging into mini-redis, I found that it exemplifies the best and worst of async Rust. On the one hand, the code itself is clean, efficient, and high-level. On the <em>other hand</em>, it relies on a number of subtle async conventions that can easily be done wrong &ndash; worse, if you do them wrong, you won&rsquo;t get a compilation error, and your code will &ldquo;mostly work&rdquo;, breaking only in unpredictable timing conditions that are unlikely to occur in unit tests. Just the kind of thing Rust tries to avoid! This isn&rsquo;t the fault of mini-redis &ndash; to my knowledge, there aren&rsquo;t great alternative patterns available in async Rust today (I go through some of the alternatives in this post, and their downsides).</p>
<h2 id="context-evaluating-moro">Context: evaluating <a href="https://github.com/nikomatsakis/moro/">moro</a></h2>
<p>We&rsquo;ve heard from many users that async Rust has a number of pitfalls where things can break in subtle ways. In the Async Vision Doc, for example, the <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_battles_buffered_streams.html">Barbara battles buffered streams</a> and <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/aws_engineer/solving_a_deadlock.html">solving a deadlock</a> stories discuss challenges with <code>FuturesUnordered</code> (wrapped in the <code>buffered</code> combinator); the <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_gets_burned_by_select.html">Barbara gets burned by select</a> and <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/alan_builds_a_cache.html">Alan tries to cache requests, which doesn&rsquo;t always happen</a> stories talk about cancellation hazards and the <code>select!</code> or race combinators.</p>
<p>In response to these stories, I created an experimental project called <a href="https://github.com/nikomatsakis/moro/">moro</a> that explores structured concurrency in Rust. I&rsquo;ve not yet blogged about moro, and that&rsquo;s intentional. I&rsquo;ve been holding off until I gain more confidence in <a href="https://github.com/nikomatsakis/moro/">moro</a>&rsquo;s APIs. In the meantime, various people (including myself) have been porting different bits of code to <a href="https://github.com/nikomatsakis/moro/">moro</a> to get a better sense for what works and what doesn&rsquo;t. <a href="https://github.com/guswynn/">GusWynn</a>, for example, started changing bits of the <a href="https://github.com/MaterializeInc/">materialize.io codebase</a> to use moro and to have a safer alternative to cancellation. I&rsquo;ve been poking at mini-redis, and I&rsquo;ve also been working with some folks within AWS on some internal codebases.</p>
<p><strong>What I&rsquo;ve found so far is that <a href="https://github.com/nikomatsakis/moro/">moro</a> absolutely helps, but it&rsquo;s not enough.</strong> Therefore, instead of the triumphant blog post I had hoped for, I&rsquo;m writing this one, which does a kind of deep-dive into the patterns that <a href="https://github.com/tokio-rs/mini-redis">mini-redis</a> uses: both how they work well when done right, but also how they are tedious and error-prone. I&rsquo;ll be posting some follow-up blog posts that explore some of the ways that moro can help.</p>
<h2 id="what-is-mini-redis">What is mini-redis?</h2>
<p>If you’ve not seen it, <a href="https://github.com/tokio-rs/mini-redis">mini-redis</a> is a really cool bit of example code from the <a href="https://tokio.rs">tokio</a> project. It implements a “miniature” version of the <a href="https://redis.io">redis</a> in-memory data store, focusing on the key-value and pub-sub aspects of redis. Specifically, clients can connect to mini-redis and issue a subset of the redis commands. In this post, I’m going to focus on the “pub-sub” aspect of redis, in which clients can <strong>publish</strong> messages to a topic which are then broadcast to everyone who has <strong>subscribed</strong> to that topic. Whenever a client publishes a message, it receives in response the number of other clients that are currently subscribed to that topic.</p>
<p>Here is an example workflow involving two clients. Client 1 is subscribing to things, and Client 2 is publishing messages.</p>
<pre class="mermaid">sequenceDiagram
    Client1 ->> Server: subscribe `A`
    Client2 ->> Server: publish `foo` to `A`
    Server -->> Client2: 1 client is subscribed to `A`
    Server -->> Client1: `foo` was published to `A`
    Client1 ->> Server: subscribe `B`
    Client2 ->> Server: publish `bar` to `B`
    Server -->> Client2: 1 client is subscribed to `B`
    Server -->> Client1: `bar` was published to `B`
    Client1 ->> Server: unsubscribe `A`
    Client2 ->> Server: publish `baz` to `A`
    Server -->> Client2: 0 clients are subscribed to `A`
  </pre>
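<p>The semantics in that diagram can be mimicked with a tiny standard-library model. To be clear, this is not how mini-redis is written: <code>Topics</code>, <code>subscribe</code>, and this <code>publish</code> are hypothetical names, plain <code>std::sync::mpsc</code> channels stand in for the real machinery, and dropping the receiver stands in for unsubscribing.</p>

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for the server's topic table (not mini-redis code).
#[derive(Default)]
struct Topics {
    subscribers: HashMap<String, Vec<Sender<String>>>,
}

impl Topics {
    /// Registers a new subscriber and hands back its receiving end.
    fn subscribe(&mut self, topic: &str) -> Receiver<String> {
        let (tx, rx) = channel();
        self.subscribers.entry(topic.to_string()).or_default().push(tx);
        rx
    }

    /// Delivers `message` to every live subscriber of `topic` and
    /// returns how many subscribers received it.
    fn publish(&mut self, topic: &str, message: &str) -> usize {
        let Some(senders) = self.subscribers.get_mut(topic) else {
            return 0; // nobody ever subscribed to this topic
        };
        // `send` fails once the receiver has been dropped, so this also
        // prunes subscribers that have gone away.
        senders.retain(|tx| tx.send(message.to_string()).is_ok());
        senders.len()
    }
}
```

<p>Publishing to a topic with one subscriber returns 1 and the subscriber can then receive the message; after the subscriber drops its receiver, publishing to the same topic returns 0.</p>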
<h2 id="core-data-structures">Core data structures</h2>
<p>To implement this, the redis server maintains a struct <code>State</code> that is shared across all active clients. Since it is shared across all clients, it is maintained in a <code>Mutex</code> (<a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/db.rs#L52">source</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Shared</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="sd">/// The shared state is guarded by a mutex. […]
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">state</span>: <span class="nc">Mutex</span><span class="o">&lt;</span><span class="n">State</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Within this <code>State</code> struct, there is a <code>pub_sub</code> field (<a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/db.rs#L66-L68">source</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">pub_sub</span>: <span class="nc">HashMap</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">broadcast</span>::<span class="n">Sender</span><span class="o">&lt;</span><span class="n">Bytes</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span></code></pre></div><p>The <code>pub_sub</code> field stores a big hashmap. The key is the <em>topic</em> and the <em>value</em> is the <a href="https://docs.rs/tokio/latest/tokio/sync/broadcast/struct.Sender.html"><code>broadcast::Sender</code></a>, which is the “sender half” of a tokio broadcast channel. Whenever a client issues a <code>publish</code> command, it ultimately calls <a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/db.rs#L265-L278"><code>Db::publish</code></a>, which winds up invoking <code>send</code> on this broadcast channel:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="p">(</span><span class="k">crate</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span> <span class="nf">publish</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">key</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">Bytes</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">state</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">shared</span><span class="p">.</span><span class="n">state</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">state</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">pub_sub</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">key</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// On a successful message send on the broadcast channel, the number
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// of subscribers is returned. An error indicates there are no
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// receivers, in which case, `0` should be returned.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">tx</span><span class="o">|</span><span class="w"> </span><span class="n">tx</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">value</span><span class="p">).</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// If there is no entry for the channel key, then there are no
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// subscribers. In this case, return `0`.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">.</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="the-subscriber-loop">The subscriber loop</h2>
<p>We just saw how, when clients publish data to a channel, that winds up invoking <code>send</code> on a broadcast channel. But how do the clients who are subscribed to that channel receive those messages? The answer lies in the <a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/cmd/subscribe.rs"><code>Subscribe</code></a> command.</p>
<p>The idea is that the server has a set <code>subscriptions</code> of subscribed channels for the client (<a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/cmd/subscribe.rs#L117">source</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">subscriptions</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">StreamMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>This is implemented using a tokio <a href="https://docs.rs/tokio-stream/latest/tokio_stream/struct.StreamMap.html"><code>StreamMap</code></a>, which is a neato data structure that takes multiple streams which each yield up values of type <code>V</code>, gives each of them a key <code>K</code>, and combines them into one stream that yields up <code>(K, V)</code> pairs. In this case, the streams are the “receiver half” of those broadcast channels, and the keys are the channel names.</p>
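<p>To make the idea concrete, here is a rough, synchronous analogy using only the standard library. It is not how tokio implements <code>StreamMap</code>; the <code>KeyedMerge</code> type and its <code>try_next</code> method are names I made up for illustration, and it checks plain <code>std::sync::mpsc</code> receivers instead of async streams:</p>

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::Receiver;

/// Toy keyed merge (hypothetical): owns one receiver per key and
/// hands back `(key, value)` pairs, tagging each value with its source.
struct KeyedMerge<K, V> {
    sources: BTreeMap<K, Receiver<V>>,
}

impl<K: Ord + Clone, V> KeyedMerge<K, V> {
    fn new() -> Self {
        Self { sources: BTreeMap::new() }
    }

    /// Start listening to `rx` under the name `key`.
    fn insert(&mut self, key: K, rx: Receiver<V>) {
        self.sources.insert(key, rx);
    }

    /// Stop listening to the source registered under `key`.
    fn remove(&mut self, key: &K) {
        self.sources.remove(key);
    }

    /// Return the first value that is ready right now, if any,
    /// scanning the sources in key order.
    fn try_next(&mut self) -> Option<(K, V)> {
        self.sources
            .iter()
            .find_map(|(key, rx)| rx.try_recv().ok().map(|v| (key.clone(), v)))
    }
}
```

<p>One thing the toy version glosses over is fairness: it always scans in key order, so a busy early key could starve the later ones.</p>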
<p>When it receives a subscribe command, then, the server wants to do the following:</p>
<ul>
<li>Add the receivers for each subscribed channel into <code>subscriptions</code>.</li>
<li>Loop:
<ul>
<li>If a message is published to <code>subscriptions</code>, then send it to the client.</li>
<li>If the client subscribes to new channels, add those to <code>subscriptions</code> and send an acknowledgement to the client.</li>
<li>If the client unsubscribes from some channels, remove them from <code>subscriptions</code> and send an acknowledgement to the client.</li>
<li>If the client terminates, end the loop and close the connection.</li>
</ul>
</li>
</ul>
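<p>Setting aside the async plumbing for a moment, the bookkeeping in that loop can be sketched as a plain function over events. Everything here (the <code>Event</code> and <code>Step</code> types, the <code>handle</code> function, the reply strings) is hypothetical and only meant to mirror the bullets above; the actual mini-redis code is structured differently:</p>

```rust
use std::collections::BTreeSet;

/// Hypothetical events the per-client loop reacts to.
enum Event {
    Published { channel: String, msg: String },
    Subscribe(Vec<String>),
    Unsubscribe(Vec<String>),
    Terminated,
}

/// What the loop does next: send some replies and keep going, or stop.
#[derive(Debug, PartialEq)]
enum Step {
    Continue(Vec<String>),
    Close,
}

fn handle(subscriptions: &mut BTreeSet<String>, event: Event) -> Step {
    match event {
        // A message arrived on a subscribed channel: forward it.
        Event::Published { channel, msg } => {
            Step::Continue(vec![format!("message {} {}", channel, msg)])
        }
        // Add each channel, acknowledging with the new subscription count.
        Event::Subscribe(channels) => {
            let mut acks = Vec::new();
            for name in channels {
                subscriptions.insert(name.clone());
                acks.push(format!("subscribe {} {}", name, subscriptions.len()));
            }
            Step::Continue(acks)
        }
        // Remove each channel, acknowledging with the remaining count.
        Event::Unsubscribe(channels) => {
            let mut acks = Vec::new();
            for name in channels {
                subscriptions.remove(&name);
                acks.push(format!("unsubscribe {} {}", name, subscriptions.len()));
            }
            Step::Continue(acks)
        }
        // Client went away or the server is shutting down.
        Event::Terminated => Step::Close,
    }
}
```

<p>Written this way there is a single owner of <code>subscriptions</code>, and every event is handled one at a time. The hard part, as we’ll see, is producing that single stream of events from several concurrent sources.</p>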
<h2 id="show-me-the-state">“Show me the state”</h2>
<p>Learning to write Rust code is basically an exercise in asking “show me the state” — i.e., the key to making Rust code work is knowing what data is going to be modified and when<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. In this case, there are a few key pieces of state…</p>
<ul>
<li>The set <code>subscriptions</code> of “broadcast receivers” from each subscribed stream
<ul>
<li>There is also a set <code>self.channels</code> of “pending channel names” that ought to be subscribed to, though this is kind of an implementation detail and not essential.</li>
</ul>
</li>
<li>The connection <code>connection</code> used to communicate with the client (a TCP socket)</li>
</ul>
<p>And there are three concurrent tasks going on, each of which accesses that same state…</p>
<ul>
<li>Looking for published messages from <code>subscriptions</code> and forwarding to <code>connection</code> (reads <code>subscriptions</code>, writes to <code>connection</code>)</li>
<li>Reading client commands from <code>connection</code> and then either…
<ul>
<li>subscribing to new channels (writes to <code>subscriptions</code>) and sending a confirmation (writes to <code>connection</code>);</li>
<li>or unsubscribing from channels (writes to <code>subscriptions</code>) and sending a confirmation (writes to <code>connection</code>).</li>
</ul>
</li>
<li>Watching for termination and then cancelling everything (drops the broadcast handles in <code>subscriptions</code>).</li>
</ul>
<p>You can start to see that this is going to be a challenge. There are three conceptual tasks, but each of them needs mutable access to the same data:</p>
<pre class="mermaid">flowchart LR
    forward["Forward published messages to client"]
    client["Process subscribe/unsubscribe messages from client"]
    terminate["Watch for termination"]
    
    subscriptions[("subscriptions:\nHandles from\nsubscribed channels")]
    connection[("connection:\nTCP stream\nto/from\nclient")]
    
    forward -- reads --> subscriptions
    forward -- writes --> connection
    
    client -- reads --> connection
    client -- writes --> subscriptions
    
    terminate -- drops --> subscriptions
    
    style forward fill:oldlace
    style client fill:oldlace
    style terminate fill:oldlace
    
    style subscriptions fill:pink
    style connection fill:pink
    
  </pre>
<p>If you tried to do this with normal threads, it just plain wouldn’t work…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">subscriptions</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w"> </span><span class="c1">// close enough to a StreamMap for now
</span></span></span><span class="line"><span class="cl"><span class="n">std</span>::<span class="n">thread</span>::<span class="n">scope</span><span class="p">(</span><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="n">s</span><span class="p">.</span><span class="n">spawn</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">subscriptions</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="s">&#34;key1&#34;</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="n">s</span><span class="p">.</span><span class="n">spawn</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">subscriptions</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="s">&#34;key2&#34;</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>If you <a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2021&amp;gist=9737c0cab49437ae45dbef27b80a9619">try this on the playground</a>, you’ll see it gets an error because both closures are trying to access the same mutable state. No good. So how does it work in mini-redis?</p>
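<p>For contrast, the conventional way to get the threaded version past the compiler is to put the shared state behind a lock. This is just a minimal sketch of that workaround, not what mini-redis does; the function name <code>push_from_two_threads</code> is made up, and the final sort is only there because the order in which the threads run is not deterministic:</p>

```rust
use std::sync::Mutex;
use std::thread;

/// Both closures only need `&Mutex<_>`, so they can share it;
/// the lock serializes the actual mutation.
fn push_from_two_threads() -> Vec<&'static str> {
    let subscriptions = Mutex::new(Vec::new());
    thread::scope(|s| {
        s.spawn(|| subscriptions.lock().unwrap().push("key1"));
        s.spawn(|| subscriptions.lock().unwrap().push("key2"));
    });
    // All scoped threads have finished, so we can take the data back out.
    let mut keys = subscriptions.into_inner().unwrap();
    keys.sort();
    keys
}
```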
<h2 id="enter-select-our-dark-knight">Enter <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a>, our dark knight</h2>
<p>Mini-redis is able to juggle these three tasks through careful use of the <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> macro. This is pretty cool, but also pretty error-prone: as we’ll see, there are a number of subtle points in the way that <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> is being used here, and it’s easy to write the code wrong and have surprising bugs. At the same time, it’s pretty neat that we can use <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> in this way, and it raises the question of whether we can find safer patterns to achieve the same thing. I think right now you can find safer ones, but they are less efficient, which isn’t really living up to Rust’s promise (though it might be a good idea). I’ll cover that in a follow-up post; for now I just want to focus on explaining what mini-redis is doing and the pros and cons of this approach.</p>
<p>The main loop looks like this (<a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/cmd/subscribe.rs#L119-L155">source</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">subscriptions</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">StreamMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">select!</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Some</span><span class="p">((</span><span class="n">channel_name</span><span class="p">,</span><span class="w"> </span><span class="n">msg</span><span class="p">))</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">subscriptions</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//                          -------------------- future 1
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">res</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">dst</span><span class="p">.</span><span class="n">read_frame</span><span class="p">()</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//    ---------------- future 2
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">_</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">shutdown</span><span class="p">.</span><span class="n">recv</span><span class="p">()</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//  --------------- future 3
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> is kind of like a match statement. It takes multiple futures (underlined in the code above) and continues executing them until one of them completes. Since the <code>select!</code> is in a loop, and in this case each of the futures is producing a series of events, this setup effectively runs the three futures concurrently, processing events as they arrive:</p>
<ul>
<li><code>subscriptions.next()</code> &ndash; the future waiting for the next message to arrive at the <code>StreamMap</code></li>
<li><code>dst.read_frame()</code> &ndash; the async method <code>read_frame</code> is defined on the connection, <code>dst</code>. It reads data from the client, parses it into a complete command, and returns that command. We&rsquo;ll dive into this function in a bit &ndash; it turns out that it is written in a very careful way to account for cancellation.</li>
<li><code>shutdown.recv()</code> &ndash; the mini-redis server signals a global shutdown by threading a tokio channel to every connection; when a message is sent to that channel, all the loops clean up and stop.</li>
</ul>
<h2 id="how-select-works">How <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> works</h2>
<p>So, <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> runs multiple futures concurrently until one of them completes. In practice, this means that it iterates down the futures, one after the other. Each future gets awoken and runs until it either <em>yields</em> (meaning, awaits on something that isn&rsquo;t ready yet) or <em>completes</em>. If the future yields, then <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> goes to the next future and tries that one.</p>
<p>Once a future <em>completes</em>, though, the <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> gets ready to complete. It begins by dropping all the other futures that were selected. This means that they immediately stop executing at whatever <code>await</code> point they reached, running any destructors for things on the stack. <a href="https://smallcultfollowing.com/babysteps/blog/2022/01/27/panics-vs-cancellation-part-1/">As I described in a previous blog post</a>, in practice this feels a lot like a <code>panic!</code> that is injected at the <code>await</code> point. And, just like any other case of recovering from an exception, it requires that code is written carefully to avoid introducing bugs &ndash; <a href="https://tomaka.medium.com/a-look-back-at-asynchronous-rust-d54d63934a1c">tomaka describes one such example</a> in his blog post. These bugs are what give async cancellation in Rust a reputation for being difficult.</p>
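<p>Here is a toy model of that behavior, written against <code>std</code> only. It is a sketch of the idea and not the real macro: <code>toy_select</code>, <code>Countdown</code>, and <code>NoopWake</code> are hypothetical names, and the loop simply re-polls in a spin instead of sleeping until a waker fires. The <code>Drop</code> impl records which futures get dropped, so you can see the loser being cancelled while it is still pending:</p>

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; good enough for a loop that re-polls anyway.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical future: yields `remaining` times, then completes with `name`.
struct Countdown {
    name: &'static str,
    remaining: u32,
    dropped: Rc<RefCell<Vec<&'static str>>>,
}

impl Future for Countdown {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready(self.name)
        } else {
            self.remaining -= 1; // "yield": not ready yet
            Poll::Pending
        }
    }
}

impl Drop for Countdown {
    fn drop(&mut self) {
        self.dropped.borrow_mut().push(self.name);
    }
}

/// Poll each future in turn until one completes. Returning drops the
/// whole `Vec`, which cancels every future that was still pending.
fn toy_select<F: Future + Unpin>(mut futures: Vec<F>) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        for fut in futures.iter_mut() {
            if let Poll::Ready(out) = Pin::new(fut).poll(&mut cx) {
                return out;
            }
        }
    }
}
```

<p>If you select over a <code>Countdown</code> that needs five more polls and one that needs a single poll, the second one wins and the first is dropped partway through, with whatever progress it had made.</p>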
<h2 id="cancellation-and-mini-redis">Cancellation and mini-redis</h2>
<p>Let&rsquo;s talk through what cancellation means for mini-redis. As we saw, the <code>select!</code> here is effectively running two distinct tasks (as well as waiting for shutdown):</p>
<ul>
<li>Waiting on <code>subscriptions.next()</code> for a message to arrive from subscribed channels, so it can be forwarded to the client.</li>
<li>Waiting on <code>dst.read_frame()</code> for the next command from the client, so that we can modify the set of subscribed channels.</li>
</ul>
<p>We&rsquo;ll see that mini-redis is coded carefully so that, whichever of these events occurs first, everything keeps working correctly. We&rsquo;ll also see that this setup is fragile &ndash; it would be easy to introduce subtle bugs, and the compiler would not help you find them.</p>
<p>Take a look back at the sample subscription workflow at the start of this post. After <code>Client1</code> has subscribed to <code>A</code>, the server is effectively waiting for <code>Client1</code> to send further messages, or for other clients to publish.</p>
<p>The code that checks for further messages from <code>Client1</code> is an async function called <a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/connection.rs#L56"><code>read_frame</code></a>. It has to read the raw bytes sent by the client and assemble them into a &ldquo;frame&rdquo; (a single command). The <a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/connection.rs#L56"><code>read_frame</code></a> in mini-redis is written in a particular way:</p>
<ul>
<li>It loops and, for each iteration&hellip;
<ul>
<li>tries to parse a complete frame from <code>self.buffer</code>,</li>
<li>if <code>self.buffer</code> doesn&rsquo;t contain a complete frame, then it reads more data from the stream into the buffer.</li>
</ul>
</li>
</ul>
<p>In pseudocode, it looks like (<a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/connection.rs#L56-L81">source</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Connection</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">read_frame</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Frame</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">f</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse_frame</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">buffer</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">return</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="nb">Some</span><span class="p">(</span><span class="n">f</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">read_more_data_into_buffer</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">buffer</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The key idea is that the function buffers up data until it can read an entire frame (i.e., successfully complete) and then it removes that entire frame at once. It never removes <em>part</em> of a frame from the buffer. This ensures that if the <code>read_frame</code> function is canceled while awaiting more data, nothing gets lost.</p>
<h2 id="ways-to-write-a-broken-read_frame">Ways to write a broken <a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/connection.rs#L56"><code>read_frame</code></a></h2>
<p>There are many ways to write a version of <a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/connection.rs#L56"><code>read_frame</code></a> that is NOT cancel-safe. For example, instead of storing the buffer in <code>self</code>, one could put the buffer on the stack:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Connection</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">read_frame</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Frame</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">buffer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">f</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse_frame</span><span class="p">(</span><span class="o">&amp;</span><span class="n">buffer</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">return</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="nb">Some</span><span class="p">(</span><span class="n">f</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">read_more_data_into_buffer</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">buffer</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//                                      -----
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//                If future is canceled here,
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//                buffer is lost.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This setup is broken because, if the future is canceled when awaiting more data, the buffered data is lost.</p>
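<p>To see the difference play out, here is a self-contained simulation using only <code>std</code>. Nearly everything in it is an assumption made for the demo: frames are newline-terminated strings (not the real Redis protocol), the <code>incoming</code> queue stands in for the socket, <code>YieldOnce</code> stands in for waiting on I/O, and <code>poll_n</code> plays the role of a <code>select!</code> that gives up on the future after a few polls:</p>

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Stand-in for "wait for the socket": pending on first poll, ready after.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Hypothetical connection; `incoming` simulates chunks sent by the client.
struct Connection {
    incoming: VecDeque<&'static str>,
    buffer: String,
}

impl Connection {
    /// Cancel-safe: partial data lives in `self.buffer`, which outlives the future.
    async fn read_frame_safe(&mut self) -> String {
        loop {
            if let Some(end) = self.buffer.find('\n') {
                // Remove the whole frame at once, never a part of it.
                let frame: String = self.buffer.drain(..=end).collect();
                return frame.trim_end().to_string();
            }
            YieldOnce(false).await;
            if let Some(chunk) = self.incoming.pop_front() {
                self.buffer.push_str(chunk);
            }
        }
    }

    /// NOT cancel-safe: partial data lives in a local that dies with the future.
    async fn read_frame_broken(&mut self) -> String {
        let mut buffer = String::new();
        loop {
            if let Some(end) = buffer.find('\n') {
                return buffer[..end].to_string();
            }
            YieldOnce(false).await;
            if let Some(chunk) = self.incoming.pop_front() {
                buffer.push_str(chunk);
            }
        }
    }
}

/// Poll `fut` up to `polls` times. Returning drops it, which cancels it
/// if it was still pending.
fn poll_n<F: Future>(fut: F, polls: usize) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    for _ in 0..polls {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return Some(out);
        }
    }
    None
}

/// A client whose command arrives split across two chunks.
fn client() -> Connection {
    Connection {
        incoming: VecDeque::from(["SUBSCRIBE ", "a\n"]),
        buffer: String::new(),
    }
}
```

<p>With a command split across two chunks, cancelling after the first chunk and then calling again gives the full <code>SUBSCRIBE a</code> frame from <code>read_frame_safe</code>, but only the tail <code>a</code> from <code>read_frame_broken</code>: the first chunk was consumed from the &ldquo;socket&rdquo; and then thrown away along with the future. Note that both versions compile without complaint.</p>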
<p>Alternatively, <a href="https://github.com/tokio-rs/mini-redis/blob/cf1e4e465eceaaddd9497353e809fe6b814d7b19/src/connection.rs#L56"><code>read_frame</code></a> could intersperse reading from the stream and parsing the frame itself:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Connection</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">read_frame</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Frame</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">buffer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">command_name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">read_command_name</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="n">command_name</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="s">&#34;subscribe&#34;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">parse_subscribe_command</span><span class="p">().</span><span class="k">await</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="s">&#34;unsubscribe&#34;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">parse_unsubscribe_command</span><span class="p">().</span><span class="k">await</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="s">&#34;publish&#34;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">parse_publish_command</span><span class="p">().</span><span class="k">await</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem here is similar: if we are canceled while awaiting one of the <code>parse_foo_command</code> futures, then we will forget the fact that we read the <code>command_name</code> already.</p>
<h2 id="comparison-with-javascript">Comparison with JavaScript</h2>
<p>It is interesting to compare Rust&rsquo;s <code>Future</code> model with JavaScript&rsquo;s <code>Promise</code> model. In JavaScript, when an async function is called, it implicitly creates a new task. This task has &ldquo;independent life&rdquo;, and it keeps executing even if nobody ever awaits it. In Rust, invoking an <code>async fn</code> returns a <code>Future</code>, but that is inert. A <code>Future</code> only executes when some task <em>awaits</em> it. (You can create a task by invoking a suitable <code>spawn</code> method on your runtime, and then it will execute on its own.)</p>
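<p>The &ldquo;inert&rdquo; part is easy to check with a few lines of <code>std</code>-only Rust. In this sketch, <code>set_flag</code> and <code>poll_once</code> are made-up helpers; <code>poll_once</code> does by hand what a runtime would do when a task awaits the future:</p>

```rust
use std::cell::Cell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Calling this only builds a future; the body runs when it is polled.
async fn set_flag(flag: &Cell<bool>) {
    flag.set(true);
}

/// Poll a future exactly once, standing in for an executor.
fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    fut.as_mut().poll(&mut cx)
}
```

<p>After <code>let fut = set_flag(&amp;flag)</code> the flag is still <code>false</code>; it only flips once <code>poll_once(fut)</code> drives the future. If <code>fut</code> were dropped instead, the body would never run at all.</p>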
<p>There are really good reasons for Rust&rsquo;s model: in particular, it is a zero-cost abstraction (or very close to it). In JavaScript, if you have one async function, and you factor out a helper function, you just went from one task to two tasks, meaning twice as much load on the scheduler. In Rust, if you have an async fn and you factor out a helper, you still have one task; you also still allocate basically the same amount of stack space. This is a good example of the <a href="https://rustacean-principles.netlify.app/how_rust_empowers/performant.html">&ldquo;performant&rdquo;</a> (&ldquo;idiomatic code runs efficiently&rdquo;) Rust design principle in action.</p>
<p><strong>However,</strong> at least as we&rsquo;ve currently set things up, the Rust model does have some sharp edges. We&rsquo;ve seen three ways to write <code>read_frame</code>, and only one of them works. <strong>Interestingly, all three of them would work in JavaScript</strong>, because in the JS model, an async function always starts a task and hence maintains its context.</p>
<p>I would argue that this represents a serious problem for Rust, because it is a failure to maintain the <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable.html">&ldquo;reliability&rdquo;</a> principle (&ldquo;if it compiles, it works&rdquo;), which ought to come first and foremost for us. The result is that async Rust feels a bit more like C or C++, where performant and versatile take top rank, and one has to have a lot of experience to know how to avoid sharp edges.</p>
<p>Now, I am not arguing Rust should adopt the &ldquo;Promises&rdquo; model &ndash; I think the Future model is better. But I think we need to tweak <em>something</em> to recover that reliability.</p>
<h2 id="comparison-with-threads">Comparison with threads</h2>
<p>It&rsquo;s interesting to compare mini-redis written with async Rust to a mini-redis implemented with threads. It turns out that the threaded version would also be challenging, but in different ways. To start, let&rsquo;s write up some pseudocode for what we are trying to do:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">subscriptions</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">StreamMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">((</span><span class="n">channel_name</span><span class="p">,</span><span class="w"> </span><span class="n">msg</span><span class="p">))</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">subscriptions</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">connection</span><span class="p">.</span><span class="n">send_message</span><span class="p">(</span><span class="n">channel_name</span><span class="p">,</span><span class="w"> </span><span class="n">msg</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">frame</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">connection</span><span class="p">.</span><span class="n">read_frame</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="n">frame</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">Subscribe</span><span class="p">(</span><span class="n">new_channel</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">subscribe</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">connection</span><span class="p">,</span><span class="w"> </span><span class="n">new_channel</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">Unsubscribe</span><span class="p">(</span><span class="n">channel</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">unsubscribe</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">connection</span><span class="p">,</span><span class="w"> </span><span class="n">channel</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">_</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>Here we have spawned two threads, one of which is waiting for new messages from the <code>subscriptions</code>, and one of which is processing incoming client messages (which may involve adding channels to the <code>subscriptions</code> map).</p>
<p>There are two problems here. First, you may have noticed I didn&rsquo;t handle server shutdown! That turns out to be kind of a pain in this setup, because tearing down those spawned tasks is harder than you might think. For simplicity, I&rsquo;m going to skip that for the rest of the post &ndash; it turns out that <a href="https://github.com/nikomatsakis/moro/">moro</a>&rsquo;s APIs solve this problem in a really nice way by allowing shutdown to be imposed externally without any deep changes.</p>
<p>Second, those two threads are both accessing <code>subscriptions</code> and <code>connection</code> in a mutable way, which the Rust compiler will not accept. <strong>This is a key problem.</strong> Rust&rsquo;s type system works really well when you can break down your data such that every task accesses distinct data (i.e., &ldquo;spatially disjoint&rdquo;), either because each task owns the data or because they have <code>&amp;mut</code> references to different parts of it. We have a much harder time dealing with multiple tasks accessing the <em>same data</em> but at <em>different points in time</em> (i.e., &ldquo;temporally disjoint&rdquo;).</p>
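<p>To make the &ldquo;spatially disjoint&rdquo; case concrete, here is a minimal sketch using only the standard library. Scoped threads stand in for tasks, and <code>bump_halves</code> is a made-up helper, not anything from mini-redis. Each thread gets a <code>&amp;mut</code> to a different half of the same slice, which the compiler accepts:</p>

```rust
use std::thread;

/// Adds 1 to every element, using two threads that each hold a
/// `&mut` borrow of a *different* half of the slice.
/// (Hypothetical helper, for illustration only.)
fn bump_halves(data: &mut [u32]) {
    let mid = data.len() / 2;
    // `split_at_mut` hands back two non-overlapping `&mut` slices.
    let (left, right) = data.split_at_mut(mid);
    thread::scope(|s| {
        s.spawn(move || left.iter_mut().for_each(|x| *x += 1));
        s.spawn(move || right.iter_mut().for_each(|x| *x += 1));
    });
    // Both threads are joined when the scope ends, so `data` is usable again.
}

fn main() {
    let mut data = vec![1, 2, 3, 4];
    bump_halves(&mut data);
    println!("{:?}", data);
}
```

<p>If both closures needed mutable access to the <em>whole</em> slice, just at different moments, there would be no way to split it up like this. That is the situation the two mini-redis tasks are in.</p>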
<h2 id="use-an-arc-mutex">Use an arc-mutex?</h2>
<p>The main way to manage multiple tasks sharing access to the same data is with some kind of interior mutability, typically an <code>Arc&lt;Mutex&lt;T&gt;&gt;</code>. One problem with this is that it fails Rust&rsquo;s <a href="https://rustacean-principles.netlify.app/how_rust_empowers/performant.html"><em>performant</em></a> design principle (&ldquo;idiomatic code runs efficiently&rdquo;), because there is runtime overhead (even if it is minimal in practice, it doesn&rsquo;t feel good). Another problem with <code>Arc&lt;Mutex&lt;T&gt;&gt;</code> is that it hits on a lot of Rust&rsquo;s ergonomic weak points, failing our <a href="https://rustacean-principles.netlify.app/how_rust_empowers/supportive.html">&ldquo;supportive&rdquo;</a> principle (&ldquo;the language, tools, and community are here to help&rdquo;):</p>
<ul>
<li>You have to allocate the arcs and clone references explicitly, which is annoying;</li>
<li>You have to invoke methods like <code>lock</code>, get back lock guards, and understand how destructors and lock guards interact;</li>
<li>In async code in particular, thanks to <a href="https://github.com/rust-lang/rust/issues/57478">#57478</a>, the compiler doesn&rsquo;t understand very well when a lock guard has been dropped, resulting in annoying compiler errors &ndash; though <a href="https://github.com/eholk/">Eric Holk</a> is close to landing a fix for this one! &#x1f389;</li>
</ul>
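<p>Here is a small sketch of the first two points, using threads and <code>std::sync::Mutex</code> to keep it self-contained. The <code>record</code> helper and the channel names are invented for illustration:</p>

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Appends `name` to the shared list and returns the new length.
/// (Hypothetical helper, for illustration only.)
fn record(shared: &Arc<Mutex<Vec<String>>>, name: &str) -> usize {
    // `lock` returns a guard; the lock is held for as long as the guard lives.
    let mut guard = shared.lock().unwrap();
    guard.push(name.to_string());
    guard.len()
    // The guard's destructor runs at the end of the function and releases the lock.
}

fn main() {
    // Explicit allocation of the arc...
    let shared = Arc::new(Mutex::new(Vec::new()));
    // ...and an explicit clone for every thread that needs access.
    let for_thread = Arc::clone(&shared);
    let handle = thread::spawn(move || record(&for_thread, "news"));
    handle.join().unwrap();
    record(&shared, "sports");
    println!("{:?}", shared.lock().unwrap());
}
```

<p>None of this is difficult, but every step (the allocation, the clone, the <code>lock</code> call, knowing when the guard is dropped) is something you have to write and reason about yourself.</p>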
<p>Of course, people who remember the &ldquo;bad old days&rdquo; of async Rust before async-await are very familiar with this dynamic. In fact, one of the big selling points of adding async await sugar into Rust was <a href="http://aturon.github.io/tech/2018/04/24/async-borrowing/">getting rid of the need to use arc-mutex</a>.</p>
<h2 id="deeper-problems">Deeper problems</h2>
<p>But the ergonomic pitfalls of <code>Arc&lt;Mutex&gt;</code> are only the beginning. It&rsquo;s also just really hard to get <code>Arc&lt;Mutex&gt;</code> to actually work for this setup. To see what I mean, let&rsquo;s dive a bit deeper into the state for mini-redis. There are two main bits of state we have to think about:</p>
<ul>
<li>the tcp-stream to the client</li>
<li>the <code>StreamMap</code> of active connections</li>
</ul>
<p>Managing access to the tcp-stream for the client is actually relatively easy. For one thing, tokio streams support a <a href="https://docs.rs/tokio/latest/tokio/io/fn.split.html"><code>split</code></a> operation, so it is possible to take the stream and split out the &ldquo;sending half&rdquo; (for sending messages to the client) and the &ldquo;receiving half&rdquo; (for receiving messages from the client). All the active threads can send data to the client, so they all need the sending half, and presumably it&rsquo;ll have to be wrapped in an (async aware) mutex. But only one active thread needs the receiving half, so it can own that, and avoid any locks.</p>
<p>Managing access to the <code>StreamMap</code> of active connections, though, is quite a bit more difficult. Imagine we were to put that <code>StreamMap</code> itself into an <code>Arc&lt;Mutex&gt;</code>, so that both tasks can access it. Now one of the tasks is going to be waiting for new messages to arrive. It&rsquo;s going to look something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">subscriptions</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">Mutex</span>::<span class="n">new</span><span class="p">(</span><span class="n">StreamMap</span>::<span class="n">new</span><span class="p">()));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">((</span><span class="n">channel_name</span><span class="p">,</span><span class="w"> </span><span class="n">msg</span><span class="p">))</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">subscriptions</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">unwrap</span><span class="p">().</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">connection</span><span class="p">.</span><span class="n">send_message</span><span class="p">(</span><span class="n">channel_name</span><span class="p">,</span><span class="w"> </span><span class="n">msg</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>However, this code won&rsquo;t compile (thankfully!). The problem is that we are acquiring a lock but we are trying to hold onto that lock while we <code>await</code>, which means we might switch to other tasks with the lock being held. This can easily lead to deadlock if those other tasks try to acquire the lock, since the tokio scheduler and the O/S scheduler are not cooperating with one another.</p>
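<p>For contrast, here is a sketch of the shape the compiler <em>does</em> accept: the guard is confined to an inner block, so it is gone before the <code>await</code>. The names <code>bump_then_wait</code> and <code>pause</code> are invented, and no runtime is involved; the <code>+ Send</code> in the return type stands in for the bound that <code>tokio::spawn</code> places on the futures it accepts.</p>

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};

/// Placeholder for some real asynchronous work (hypothetical).
async fn pause() {}

/// Increments the counter, then awaits, without holding the lock across
/// the await. The `+ Send` bound plays the role of the bound on `tokio::spawn`.
fn bump_then_wait(counter: Arc<Mutex<u32>>) -> impl Future<Output = ()> + Send {
    async move {
        {
            let mut guard = counter.lock().unwrap();
            *guard += 1;
        } // guard dropped here, so it is not held across the await below
        pause().await;
    }
}

fn main() {
    let counter = Arc::new(Mutex::new(0));
    // Futures are lazy: constructing one runs nothing, but it does
    // force the compiler to check the `Send` bound.
    let _fut = bump_then_wait(Arc::clone(&counter));
    println!("counter before polling: {}", counter.lock().unwrap());
}
```

<p>Note that this trick only helps when the lock is needed briefly. It does nothing for the loop above, where the whole point is to keep hold of the <code>StreamMap</code> <em>while</em> waiting for the next message.</p>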
<p>An alternative would be to use an async-aware mutex like <a href="https://docs.rs/tokio/latest/tokio/sync/struct.Mutex.html">tokio::sync::Mutex</a>, but that is also not great: we can still wind up with a deadlock, but for another reason. The server is now prevented from adding a new subscription to the list until the lock is released, which means that if Client1 is trying to subscribe to a new channel, it has to wait for some other client to send a message to an existing channel to do so (because that is when the lock is released). Not great.</p>
<p>Actually, this whole saga is covered under another async vision doc &ldquo;status quo&rdquo; story, <a href="https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/alan_thinks_he_needs_async_locks.html">Alan thinks he needs async locks</a>.</p>
<h2 id="a-third-alternative-actors">A third alternative: actors</h2>
<p>Recognizing the problems with locks, Alice Ryhl some time ago wrote a nice blog post, <a href="https://ryhl.io/blog/actors-with-tokio/">&ldquo;Actors with Tokio&rdquo;</a>, that explains how to set up actors. This pattern actually helps to address both our problems around mutable state. The idea is to move the connections array so that it belongs solely to one actor. Instead of directly modifying <code>connections</code>, the other tasks will communicate with this actor by exchanging messages.</p>
<p>So basically there could be two actors, or even three:</p>
<ul>
<li>Actor A, which owns the <code>connections</code> (list of subscribed streams). It receives messages that are either publishing new messages to the streams or messages that say &ldquo;add this stream&rdquo; to the list.</li>
<li>Actor B, which owns the &ldquo;read half&rdquo; of the client&rsquo;s TCP stream. It reads bytes and parses new frames, then sends out requests to the other actors in response. For example, when a subscribe message comes in, it can send a message to Actor A saying &ldquo;subscribe the client to this channel&rdquo;.</li>
<li>Actor C, which owns the &ldquo;write half&rdquo; of the client&rsquo;s TCP stream. Both actors A and B will send messages to it when there are things to be sent to client.</li>
</ul>
<p>To see how this would be implemented, take a look at <a href="https://ryhl.io/blog/actors-with-tokio/">Alice&rsquo;s post</a>. The TL;DR is that you would model connections between actors as tokio channels. Each actor is either spawned or otherwise set up to run independently. You still wind up using <code>select!</code>, but you only use it to receive messages from multiple channels at once. This doesn&rsquo;t present any cancellation hazards because the channel code is carefully written to avoid them.</p>
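<p>To give a rough feel for the shape of &ldquo;Actor A&rdquo;, here is a sketch that swaps in a plain thread and <code>std::sync::mpsc</code> channels so that it runs without any runtime; Alice&rsquo;s post uses tokio tasks and tokio channels instead. The <code>Command</code> enum and <code>spawn_subscriptions_actor</code> are names I made up for this illustration.</p>

```rust
use std::collections::BTreeSet;
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Messages understood by the actor that owns the subscription set.
/// (Hypothetical message type, for illustration only.)
enum Command {
    Subscribe(String),
    Unsubscribe(String),
    /// Asks the actor to report its current channels on the given sender.
    List(Sender<Vec<String>>),
}

/// Spawns the actor and returns the handle used to talk to it.
fn spawn_subscriptions_actor() -> Sender<Command> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Only this thread ever touches `channels`: no lock required.
        let mut channels = BTreeSet::new();
        // The loop ends once every `Sender<Command>` has been dropped.
        for command in rx {
            match command {
                Command::Subscribe(name) => {
                    channels.insert(name);
                }
                Command::Unsubscribe(name) => {
                    channels.remove(&name);
                }
                Command::List(reply) => {
                    // Ignore the error if the requester has gone away.
                    let _ = reply.send(channels.iter().cloned().collect());
                }
            }
        }
    });
    tx
}

fn main() {
    let actor = spawn_subscriptions_actor();
    actor.send(Command::Subscribe("news".into())).unwrap();
    actor.send(Command::Subscribe("sports".into())).unwrap();
    actor.send(Command::Unsubscribe("news".into())).unwrap();

    let (reply_tx, reply_rx) = mpsc::channel();
    actor.send(Command::List(reply_tx)).unwrap();
    println!("{:?}", reply_rx.recv().unwrap());
}
```

<p>The mutable state never leaves the actor, so the borrow checker has nothing to complain about; the price is that every interaction becomes a message plus, when you need an answer, a reply channel.</p>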
<p>This setup works fine, and is even elegant in its own way, but it&rsquo;s also not living up to Rust&rsquo;s concept of <a href="https://rustacean-principles.netlify.app/how_rust_empowers/performant.html">performant</a> or the goal of &ldquo;zero-cost abstractions&rdquo; (ZCA). In particular, the idea with ZCA is that it is supposed to give you a model that says &ldquo;if you wrote this by hand, you couldn&rsquo;t do any better&rdquo;. But if you wrote a mini-redis server in C, by hand, you probably wouldn&rsquo;t adopt actors. In some sense, this is just adopting something much closer to the <code>Promise</code> model. (Plus, the most obvious way to implement actors in tokio is largely to use <code>tokio::spawn</code>, which definitely adds overhead, or to use <code>FuturesUnordered</code>, which can be a bit subtle as well &ndash; <a href="https://github.com/nikomatsakis/moro/">moro</a> does address these problems by adding a nice API here.)</p>
<p>(The other challenge with actors implemented this way is coordinating shutdown, though it can certainly be done: you just have to remember to thread the shutdown handler around everywhere.)</p>
<h2 id="cancellation-as-the-dark-knight-looking-again-at-select">Cancellation as the &ldquo;dark knight&rdquo;: looking again at <code>select!</code></h2>
<p>Taking a step back, we&rsquo;ve now seen that trying to use distinct tasks introduces this interesting problem that we have shared data being accessed by all the tasks. That either pushes us to locks (broken) or actors (works), but either way, it raises the question: <strong>why wasn&rsquo;t this a problem with <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a>?</strong> After all, <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> is still combining various logical tasks, and those tasks are still touching the same variables, so why is the compiler ok with it?</p>
<p>The answer is closely tied to cancellation: the <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a> setup works because</p>
<ul>
<li>the things running concurrently are not touching overlapping state:
<ul>
<li>one of them is looking at <code>subscriptions</code> (waiting for a message);</li>
<li>another is looking at <code>connection</code>;</li>
<li>and the last one is receiving the termination message.</li>
</ul>
</li>
<li>and once we decide which one of these paths to take, <strong>we cancel all the others</strong>.</li>
</ul>
<p>This last part is key: if we receive an incoming message from the client, for example, we drop the future that was looking at <code>subscriptions</code>, canceling it. That means <code>subscriptions</code> is no longer in use, so we can push new subscriptions into it, or remove things from it.</p>
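<p>The borrow-releasing effect of cancellation can be seen in isolation with a small sketch. It polls a future by hand using a do-nothing waker rather than going through <code>select!</code>, and <code>next_message</code> and <code>cancel_then_subscribe</code> are invented stand-ins, not mini-redis functions.</p>

```rust
use std::future::{self, Future};
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// A waker that does nothing; enough to poll a future by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Pretends to wait for a message on one of the subscribed channels.
/// The returned future holds the `&mut` borrow for as long as it is alive.
/// (Hypothetical stand-in: it never actually completes.)
async fn next_message(subscriptions: &mut Vec<String>) -> String {
    let watched = subscriptions.len();
    future::pending::<()>().await;
    format!("message from one of {} channels", watched)
}

/// Polls `next_message` once, cancels it, then mutates `subscriptions`.
/// Returns whether the future was still pending when it was cancelled.
fn cancel_then_subscribe(subscriptions: &mut Vec<String>, channel: &str) -> bool {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut waiting = Box::pin(next_message(subscriptions));
    let was_pending = waiting.as_mut().poll(&mut cx).is_pending();

    // Cancellation is just dropping the future, which ends its borrow...
    drop(waiting);
    // ...so we are free to mutate `subscriptions` again.
    subscriptions.push(channel.to_string());
    was_pending
}

fn main() {
    let mut subscriptions = vec!["news".to_string()];
    let was_pending = cancel_then_subscribe(&mut subscriptions, "sports");
    println!("{} {:?}", was_pending, subscriptions);
}
```

<p>Remove the <code>drop(waiting)</code> line and the <code>push</code> no longer compiles, because the pending future is still borrowing <code>subscriptions</code>.</p>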
<p>So, cancellation is both what enables the mini-redis example to be performant and a zero-cost abstraction, but it is <strong>also</strong> the cause of our reliability hazards. That&rsquo;s a pickle!</p>
<h2 id="conclusions">Conclusions</h2>
<p>We&rsquo;ve seen a lot of information, so let me try to sum it all up for you:</p>
<ul>
<li>Fine-grained cancellation in <code>select!</code> is what enables async Rust to be a zero-cost abstraction and to avoid the need to create either locks or actors all over the place.</li>
<li>Fine-grained cancellation in <code>select!</code> is the root cause for a LOT of reliability problems.</li>
</ul>
<p>You&rsquo;ll note that I wrote <em>fine-grained</em> cancellation. What I mean by that is specifically things like how <code>select!</code> will cancel the other futures. This is very different from <em>coarse-grained</em> cancellation like having the entire server shut down, for which I think structured concurrency solves the problem very well.</p>
<p>So what can we do about fine-grained cancellation? Well, the answer depends.</p>
<p>In the short term, I value reliability above all, so I think adopting an actor-like pattern is a good idea. This setup can be a nice architecture for a lot of reasons<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, and while I&rsquo;ve described it as &ldquo;not performant&rdquo;, that assumes you are running a really high-scale server that has to handle a ton of load. For most applications, it will perform very well indeed.</p>
<p>I think it makes sense to be very judicious in what you <a href="https://docs.rs/tokio/latest/tokio/macro.select.html"><code>select!</code></a>! In the context of Materialize, <a href="https://github.com/guswynn/">GusWynn</a> was <a href="https://github.com/MaterializeInc/materialize/pull/12796/">experimenting with a <code>Selectable</code> trait</a> for precisely this reason; that trait just permits selecting from a few sources, like channels. It&rsquo;d be nice to support some convenient way of declaring that an <code>async fn</code> is cancel-safe, e.g. only allowing it to be used in <code>select!</code> if it is tagged with <code>#[cancel_safe]</code>. (This might be something one could author as a proc macro.)</p>
<p>But in the longer term, I&rsquo;m interested if we can come up with a mechanism that will allow the compiler to <em>get smarter</em>. For example, I think it&rsquo;d be cool if we could share one <code>&amp;mut</code> across two <code>async fn</code> that are running concurrently, so long as that <code>&amp;mut</code> is not borrowed across an <code>await</code> point. I have thoughts on that but&hellip;not for this post.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>My experience is that being forced to get a clear picture on this is part of what makes Rust code reliable in practice.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>It&rsquo;d be fun to take a look at <a href="https://www.manning.com/books/reactive-design-patterns">Reactive Design Patterns</a> and examine how many of them apply to Rust. I enjoyed that book a lot.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Coherence and crate-level where-clauses</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/04/17/coherence-and-crate-level-where-clauses/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/04/17/coherence-and-crate-level-where-clauses/</id><published>2022-04-17T00:00:00+00:00</published><updated>2022-04-17T12:31:00-04:00</updated><content type="html"><![CDATA[<p>Rust has been wrestling with coherence more-or-less since we added methods; our current rule, the “orphan rule”, is safe but overly strict. Roughly speaking, the rule says that one can only implement foreign traits (that is, traits defined by one of your dependencies) for local types (that is, types that you define). The goal of this rule was to help foster the crates.io ecosystem — we wanted to ensure that you could grab any two crates and use them together, without worrying that they might define incompatible impls that can’t be combined. The rule has served us well in that respect, but over time we’ve seen that it can also have a kind of chilling effect, unintentionally working <strong>against</strong> successful composition of crates in the ecosystem. For this reason, I’ve come to believe that we will have to weaken the orphan rule. The purpose of this post is to write out some preliminary exploration of ways that we might do that.</p>
<h2 id="so-wait-how-does-the-orphan-rule-protect-composition">So wait, how does the orphan rule protect composition?</h2>
<p>You might be wondering how the orphan rule ensures you can compose crates from crates.io. Well, imagine that there is a crate <code>widget</code> that defines a struct <code>Widget</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// crate widget
</span></span></span><span class="line"><span class="cl"><span class="cp">#[derive(PartialEq, Eq)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Widget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">name</span>: <span class="nb">String</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">code</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As you can see, the crate has derived <code>Eq</code>, but neglected to derive <code>Hash</code>. Now, I am writing another crate, <code>widget-factory</code> that depends on <code>widget</code>. I’d like to store widgets in a hashset, but I can’t,  because they don’t implement <code>Hash</code>! Today, if you want <code>Widget</code> to implement <code>Hash</code>, the only way is to open a PR against <code>widget</code> and wait for a new release.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> But if we didn’t have the orphan rule, we could just define <code>Hash</code> ourselves:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Crate widget-factory
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Hash</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">hash</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// PSA: Don’t really define your hash functions like this omg.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">name</span><span class="p">.</span><span class="n">hash</span><span class="p">()</span><span class="w"> </span><span class="o">^</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">code</span><span class="p">.</span><span class="n">hash</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we can define our <code>WidgetFactory</code> using <code>HashSet&lt;Widget&gt;</code>…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">WidgetFactory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">produced</span>: <span class="nc">HashSet</span><span class="o">&lt;</span><span class="n">Widget</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WidgetFactory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">take_produced</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">HashSet</span><span class="o">&lt;</span><span class="n">Widget</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
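</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">take</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">produced</span><span class="p">)</span><span class="w">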
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>OK, so far so good, but what happens if somebody else defines a <code>widget-delivery</code> crate and they too wish to use a <code>HashSet&lt;Widget&gt;</code>? Well, they will also define <code>Hash</code> for <code>Widget</code>, but of course they might do it differently — maybe even very badly:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Crate widget-delivery
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Hash</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">hash</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// PSA: You REALLY shouldn’t define your hash functions this way omg
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="mi">0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now the problem comes when I try to develop my <code>widget-app</code> crate that depends on <code>widget-delivery</code> <em>and</em> <code>widget-factory</code>. I now have two different impls of <code>Hash</code> for <code>Widget</code>, so which should the compiler use?</p>
<p>There are a bunch of answers we might give here, but most of them are bad:</p>
<ul>
<li>We could have each crate use its own impl, in theory: but that wouldn’t work so well if the user tried to take a <code>HashSet&lt;Widget&gt;</code> from one crate and pass it to another crate.</li>
<li>The compiler could pick one of the two impls arbitrarily, but how do we know which one to use? In this case, one of them would give very bad performance, but it’s also possible that some code is designed to expect the exact hash algorithm it specified.
<ul>
<li>This is even harder with associated types.</li>
</ul>
</li>
<li>Users could tell us which impl they want, which is maybe better, but it also means that the <code>widget-delivery</code> crates have to be prepared that any impl they are using might be switched to another one by some other crate later on. This makes it impossible for us to inline the hash function or do other optimizations except at the very last second.</li>
</ul>
<p>Faced with these options, we decided to just rule out orphan impls altogether. Too much hassle!</p>
<h2 id="but-the-orphan-rules-make-it-hard-to-establish-a-standard">But the orphan rules make it hard to establish a standard</h2>
<p>The orphan rules work well at ensuring that we can link two crates together, but ironically they can also work to make <em>actual interop</em> much harder. Consider the async runtime situation. Right now, there are a number of async runtimes, but no convenient way to write code that works with <em>any</em> runtime. As a result, people writing async libraries often wind up writing directly against one <em>specific</em> runtime. The end result is that we cannot combine libraries that were written against different runtimes, or at least that doing so can result in surprising failures.</p>
<p>It would be nice if we could implement some traits that allowed for greater interop. But we don’t quite know what those traits should look like (we also lack support for async fn in traits, but that’s coming!), so it would be nice if we could introduce those traits in the crates.io ecosystem and iterate a bit there — this was indeed the original vision for the futures crate! But if we do that, in practice, then the same crate that defines the trait must <em>also</em> define an implementation for every runtime. The problem is that the runtimes won’t want to depend on the futures crate, as it is still unstable; and the futures crate doesn’t want to have to depend on every runtime. So we’re kind of stuck. And of course if the <code>futures</code> crate were to take a dependency on some specific runtime, then that runtime couldn’t later add <code>futures</code> as a dependency, since that would result in a cycle.</p>
<h2 id="distinguishing-i-need-an-impl-from-i-prove-an-impl">Distinguishing “I need an impl” from “I provide an impl”</h2>
<p>At the end of the day, I think we’re going to have to lift the orphan rule, and just accept that it may be possible to create crates that cannot be linked together because they contain overlapping impls. However, we can still give people the tools to ensure that composition works smoothly.</p>
<p>I would like to see us distinguish (at least) two cases:</p>
<ul>
<li>I need this type to implement this trait (which maybe it doesn’t, yet).</li>
<li>I am supplying an impl of a trait for a given type.</li>
</ul>
<p>The idea would be that most crates can just declare <em>that they need an impl</em> without actually supplying a specific one. Any number of such crates can be combined together without a problem (assuming that they don’t put inconsistent conditions on associated types).</p>
<p>Then, separately, one can have a crate that actually <em>supplies</em> an impl of a foreign trait for a foreign type. These impls can be isolated as much as possible. The hope is that only the final binary would be responsible for actually supplying the impl itself.</p>
<h2 id="where-clauses-are-how-we-express-i-need-an-impl-today">Where clauses are how we express “I need an impl” today</h2>
<p>If you think about it, expressing “I need an impl” is something that we do all the time, but we typically do it with generic types. For example, when I write a function like so…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">clone_list</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I am saying “I need a type <code>T</code> and I need it to implement <code>Clone</code>”, but I’m not being specific about what those types are.</p>
<p>In fact, it’s also possible to use where-clauses to specify things about non-generic types…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">example</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kt">u32</span>: <span class="nb">Copy</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…but the compiler today is a bit inconsistent about how it treats those. The plan is to move to a model where we “trust” what the user wrote — e.g., if the user wrote <code>where String: Copy</code>, then the function would treat the <code>String</code> type as if it were <code>Copy</code>, even if we can’t find any <code>Copy</code> impl. It so happens that such a function could never be <em>called</em>, but that’s no reason you can’t <em>define</em> it<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>.</p>
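<p>As a minimal sketch of the case that already works: the bound <code>u32: Copy</code> is trivially true, so stable Rust accepts it on a non-generic function. The function name <code>double</code> is made up for illustration. A false bound such as <code>String: Copy</code> is, as far as I know, still rejected on stable today, which is the inconsistency described above.</p>

```rust
// A where-clause that constrains a concrete type rather than a generic parameter.
// `u32: Copy` holds, so the compiler accepts the function as written.
fn double() -> u32
where
    u32: Copy,
{
    let x: u32 = 21;
    let y = x; // `x` is copied, not moved, so it stays usable below
    x + y
}
```

<p>Calling <code>double()</code> behaves like any other function; the where-clause only documents (and checks) an assumption about the environment.</p>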
<h2 id="where-clauses-at-the-crate-scope">Where clauses at the crate scope</h2>
<p>What if we could put where clauses at the crate scope? We could use that to express impls that we need to exist without actually providing those impls. For example, the <code>widget-factory</code> crate from our earlier example might add a line like this into its lib.rs:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Crate widget-factory
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w"> </span><span class="n">Widget</span>: <span class="nc">Hash</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>As a result, people would not be able to use that crate unless they either (a) supplied an impl of <code>Hash</code> for <code>Widget</code> or (b) repeated the where clause themselves, propagating the request up to the crates that depend on them. (Same as with any other where-clause.)</p>
<p>The intent would be to do the latter, propagating the dependencies up to the root crate, which could then either supply the impl itself or link in some other crate that does.</p>
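<p>Crate-level where clauses do not exist today, but the propagation pattern can be approximated with ordinary generics. In this sketch each function stands in for a crate; the names <code>count_distinct</code>, <code>count_after_restock</code>, and <code>root_count</code> are hypothetical, and the <code>Widget</code> struct here is a stand-in for the one in the earlier example.</p>

```rust
use std::collections::HashSet;
use std::hash::Hash;

// Stands in for `widget-factory`: it needs `W: Hash + Eq` but supplies no impl.
fn count_distinct<W: Hash + Eq>(widgets: Vec<W>) -> usize {
    widgets.into_iter().collect::<HashSet<W>>().len()
}

// Stands in for an intermediate crate: it repeats the bound,
// passing the requirement up to whoever calls it.
fn count_after_restock<W: Hash + Eq>(mut widgets: Vec<W>, extra: W) -> usize {
    widgets.push(extra);
    count_distinct(widgets)
}

// Stands in for the root crate: it picks the concrete type and supplies the impl.
#[derive(Hash, PartialEq, Eq)]
struct Widget {
    id: u32,
}

fn root_count() -> usize {
    count_after_restock(vec![Widget { id: 1 }, Widget { id: 2 }], Widget { id: 1 })
}
```

<p>The difference with the proposal is that here the requirement is attached to a type parameter, whereas a crate-level where clause would attach it to one specific foreign type.</p>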
<h2 id="allow-crates-to-implement-foreign-traits-for-foreign-impls">Allow crates to implement foreign traits for foreign types</h2>
<p>The next part of the idea would be to allow crates to implement foreign traits for foreign types. I think I would convert the orphan check into a “deny by default” lint. The lint text would explain that these impls are not permitted because they may cause linker errors, but a crate could mark the impl with <code>#[allow(orphan_impls)]</code> to silence the lint. Best practice would be to put orphan impls into their own crate that others can use.</p>
<h2 id="another-idea-permit-duplicate-impls-especially-those-generated-via-derive">Another idea: permit duplicate impls (especially those generated via derive)</h2>
<p>Josh Triplett floated another interesting idea, which is that we could permit duplicate impls. One common example might be if the impl is defined via a derive (though we’d have to extend derive to permit one to derive on a struct definition that is not local somehow).</p>
<h2 id="conflicting-where-clauses">Conflicting where clauses</h2>
<p>Even if you don’t supply an actual impl, it’s possible to create two crates that can’t be linked together if they contain contradictory where-clauses. For example, perhaps <code>widget-factory</code> defines <code>Widget</code> as an iterator over strings…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Widget-factory
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w"> </span><span class="n">Widget</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>…whilst <code>widget-lib</code> wants <code>Widget</code> to be an iterator over UUIDs:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Widget-lib
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w"> </span><span class="n">Widget</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="no">UUID</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>At the end of the day, at most one of these where-clauses can be satisfied, not both, so the two crates would not interoperate. That seems inevitable and ok.</p>
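<p>The same conflict can be seen today with generic bounds, since a type can implement <code>Iterator</code> only once and therefore has exactly one <code>Item</code>. This is a sketch under assumptions: <code>Widget</code>, <code>Uuid</code>, <code>factory_names</code>, and <code>lib_ids</code> are hypothetical stand-ins, not items from any real crate.</p>

```rust
// A hypothetical `Widget` that iterates over strings.
struct Widget {
    remaining: u32,
}

impl Iterator for Widget {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some(format!("widget-{}", self.remaining))
    }
}

// Stands in for `widget-factory`: requires an iterator over strings.
fn factory_names<W: Iterator<Item = String>>(widgets: W) -> Vec<String> {
    widgets.collect()
}

// Stands in for `widget-lib`: requires an iterator over UUIDs.
#[allow(dead_code)]
struct Uuid(u128);

#[allow(dead_code)]
fn lib_ids<W: Iterator<Item = Uuid>>(widgets: W) -> Vec<Uuid> {
    widgets.collect()
}

// `factory_names(Widget { remaining: 2 })` compiles.
// `lib_ids(Widget { remaining: 2 })` would not: `Widget::Item` is `String`, not `Uuid`.
```

<p>No single impl can make both calls type-check, which mirrors why the two crates could not be linked together.</p>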
<h2 id="expressing-target-dependencies-via-where-clauses">Expressing target dependencies via where-clauses</h2>
<p>Another idea that has been kicking around is the idea of expressing portability across target-architectures via traits and some kind of <code>Platform</code> type. As an example, one could imagine having code that says <code>where Platform: NativeSimd</code> to mean “this code requires native SIMD support”, or perhaps <code>where Platform: Windows</code> to mean “this must support various Windows APIs”. This is just a “kernel” of an idea; I have no idea what the real trait hierarchy would look like, but it’s quite appealing and seems to fit well with the idea of crate-level where-clauses. Essentially the idea is to allow crates to “constrain the environment that they are used in” in an explicit way.</p>
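<p>To make the shape of this concrete, here is a rough sketch that compiles on stable Rust because the impl is written out by hand. Everything in it is an assumption: there is no real <code>Platform</code> type or <code>NativeSimd</code> trait, and in the imagined design the impl would come from the compiler or target rather than from user code. The function <code>sum_lanes</code> is likewise invented.</p>

```rust
// Hypothetical marker type describing the compilation target.
struct Platform;

// Hypothetical capability trait.
trait NativeSimd {}

// In the imagined design this impl would be supplied by the toolchain for
// targets that have SIMD; here it is written by hand so the sketch compiles.
impl NativeSimd for Platform {}

// The where-clause states the function's requirement on its environment.
// The body uses plain scalar code; only the bound is the point of the example.
fn sum_lanes(values: [f32; 4]) -> f32
where
    Platform: NativeSimd,
{
    values.iter().sum()
}
```

<p>Removing the hand-written impl makes the bound false, and the function is then rejected, which is roughly the "this code cannot be used on this platform" signal the idea is after.</p>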
<h2 id="module-level-generics">Module-level generics</h2>
<p>In truth, the idea of crate-level where clauses is kind of a special case of having module-level generics, which I would very much like. The idea would be to allow modules (like types, functions, etc) to declare generic parameters and where-clauses.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> These would be nameable and usable from all code within the module, and when you referenced an item from <em>outside</em> the module, you would have to specify their value. This is very much like how a trait-level generic gets “inherited” by the methods in the trait.</p>
<p>I have wanted this for a long time because I often have modules where all the code is parameterized over some sort of “context parameter”. In the compiler, that is the lifetime <code>'tcx</code>, but very often it’s some kind of generic type (e.g., <code>Interner</code> in salsa).</p>
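<p>The trait analogy can be shown with code that works today. In this sketch the parameter <code>K</code> is declared once on the trait and is then visible in every method, much as a module-level generic would be visible to every item in the module. The names <code>Table</code>, <code>VecTable</code>, and <code>demo</code> are made up and are unrelated to salsa's <code>Interner</code>.</p>

```rust
// `K` is declared once at the trait level and "inherited" by each method.
trait Table<K> {
    fn intern(&mut self, key: K) -> usize;
    fn lookup(&self, id: usize) -> Option<&K>;
}

struct VecTable<K> {
    items: Vec<K>,
}

impl<K: PartialEq> Table<K> for VecTable<K> {
    // Returns the index of `key`, inserting it if it is not present yet.
    fn intern(&mut self, key: K) -> usize {
        if let Some(pos) = self.items.iter().position(|k| *k == key) {
            return pos;
        }
        self.items.push(key);
        self.items.len() - 1
    }

    fn lookup(&self, id: usize) -> Option<&K> {
        self.items.get(id)
    }
}

// Code outside the trait has to say which `K` it means, here `&'static str`.
fn demo() -> (usize, usize, usize, Option<&'static str>) {
    let mut table: VecTable<&'static str> = VecTable { items: Vec::new() };
    let a = table.intern("a");
    let b = table.intern("b");
    let a_again = table.intern("a");
    (a, b, a_again, table.lookup(1).copied())
}
```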
<h2 id="conclusion">Conclusion</h2>
<p>I discussed a few things in this post:</p>
<ul>
<li>How coherence helps composability by ensuring that crates can be linked together, but harms composability by making it much harder to establish and use interoperability traits.</li>
<li>How crate-level where-clauses can allow us to express “I need someone to implement this trait” without actually providing an impl, providing for the ability to link things together.</li>
<li>A sketch of how crate-level where-clauses might be generalized to capture other kinds of constraints on the environment, such as conditions on the target platform, or to module-level generics, which could potentially be an ergonomic win.</li>
</ul>
<p>Overall, I feel pretty excited about this direction. I feel like more and more things are becoming possible if we think about generalizing the trait system and making it more uniform. All of this, in my mind, builds on the work we’ve been doing to create a more precise definition of the trait system in <a href="https://github.com/nikomatsakis/a-mir-formality">a-mir-formality</a> and to build up a team with expertise in how it works (see the <a href="https://github.com/rust-lang/rfcs/pull/3254">types team RFC</a>). I’ll write more about those in upcoming posts though! =)</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>You could also create a newtype and make your hashmap key off the newtype, but that’s more of a workaround, and doesn’t always work out.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>It might be nice of us to give a warning.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Fans of ML will recognize this as “applicative functors”.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Implied bounds and perfect derive</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/04/12/implied-bounds-and-perfect-derive/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/04/12/implied-bounds-and-perfect-derive/</id><published>2022-04-12T00:00:00+00:00</published><updated>2022-04-12T17:48:00-04:00</updated><content type="html"><![CDATA[<p>There are two ergonomic features that have been discussed for quite some time in Rust land: <em>perfect derive</em> and <em>expanded implied bounds</em>. Until recently, we were a bit stuck on the best way to implement them. Recently though I’ve been working on a new formulation of the Rust trait checker that gives us a bunch of new capabilities; among them, it resolved a soundness problem that would have prevented these two features from being combined. I’m not going to describe my fix in detail in this post, though; instead, I want to ask a different question. Now that we <em>can</em> implement these features, should we?</p>
<p>Both of these features fit nicely into the <em>less rigamarole</em> part of the <a href="https://blog.rust-lang.org/inside-rust/2022/04/04/lang-roadmap-2024.html">lang team Rust 2024 roadmap</a>. That is, they allow the compiler to be smarter and require less annotation from you to figure out what code should be legal. Interestingly, as a direct result of that, they both <em>also</em> carry the same downside: semver hazards.</p>
<h2 id="what-is-a-semver-hazard">What is a semver hazard?</h2>
<p>A <strong>semver hazard</strong> occurs when you have a change which <em>feels</em> innocuous but which, in fact, can break clients of your library. Whenever you try to automatically figure out some part of a crate’s public interface, you risk some kind of semver hazard. This doesn’t necessarily mean that you shouldn’t do the auto-detection: the convenience may be worth it. But it’s usually worth asking yourself if there is some way to lessen the semver hazard while still getting similar or the same benefits.</p>
<p>Rust has a number of semver hazards today.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> The most common example is around thread-safety. In Rust, a struct <code>MyStruct</code> is automatically deemed to implement the trait <code>Send</code> so long as all the fields of <code>MyStruct</code> are <code>Send</code> (this is why we call <code>Send</code> an <a href="https://doc.rust-lang.org/reference/special-types-and-traits.html#auto-traits">auto trait</a>: it is <em>automatically</em> implemented). This is very convenient, but an implication of it is that adding a private field to your struct whose type is not thread-safe (e.g., a <code>Rc&lt;T&gt;</code>) is potentially a breaking change: if someone was using your library and sending <code>MyStruct</code> to run in another thread, they would no longer be able to do so.</p>
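<p>Here is a small sketch of that hazard. The helper <code>is_send</code> and the two struct versions are invented for illustration; the trick of using a generic function with a <code>Send</code> bound as a compile-time check is a common idiom.</p>

```rust
use std::rc::Rc;

// Compiles only when `T: Send`, so calling it acts as a compile-time check.
fn is_send<T: Send>() -> bool {
    true
}

// Version 1 of a library type: every field is `Send`, so the struct is too.
#[allow(dead_code)]
struct MyStruct {
    count: u32,
}

// A later version adds a private `Rc` field, which is not `Send`.
#[allow(dead_code)]
struct MyStructV2 {
    count: u32,
    cache: Rc<String>,
}

fn my_struct_is_send() -> bool {
    is_send::<MyStruct>()
}

// `is_send::<MyStructV2>()` would fail to compile: `Rc<String>` is not `Send`,
// so clients that moved the struct across threads would break.
```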
<h2 id="what-is-perfect-derive">What is “perfect derive”?</h2>
<p>So what is the <em>perfect derive</em> feature? Currently, when you derive a trait (e.g., <code>Clone</code>) on a generic type, the derive just assumes that <em>all</em> the generic parameters must be <code>Clone</code>. This is sometimes necessary, but not always; the idea of <em>perfect derive</em> is to change how derive works so that it instead figures out <em>exactly</em> the bounds that are needed.</p>
<p>Let’s see an example. Consider this <code>List&lt;T&gt;</code> type, which creates a linked list of <code>T</code> elements. Suppose that <code>List&lt;T&gt;</code> can be deref’d to yield its <code>&amp;T</code> value. However, lists are immutable once created, and we also want them to be cheaply cloneable, so we use <code>Rc&lt;T&gt;</code> to store the data itself:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[derive(Clone)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Deref</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Target</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">deref</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Currently, derive is going to generate an impl that requires <code>T: Clone</code>, like this…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nb">Clone</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">List</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">data</span>: <span class="nc">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">next</span>: <span class="nc">self</span><span class="p">.</span><span class="n">next</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If you look closely at this impl, though, you will see that the <code>T: Clone</code> requirement is not actually necessary. This is because the only <code>T</code> in this struct is inside of an <code>Rc</code>, and hence is reference counted. Cloning the <code>Rc</code> only increments the reference count, it doesn’t actually create a new <code>T</code>.</p>
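<p>To check that claim, here is a hand-written impl with no <code>T: Clone</code> bound at all, used with an element type that deliberately does not implement <code>Clone</code>. The names <code>NotClone</code> and <code>clone_shares_data</code> are made up for this sketch.</p>

```rust
use std::rc::Rc;

struct List<T> {
    data: Rc<T>,
    next: Option<Rc<List<T>>>,
}

// No `T: Clone` bound: cloning an `Rc` only bumps its reference count.
impl<T> Clone for List<T> {
    fn clone(&self) -> Self {
        List {
            data: Rc::clone(&self.data),
            next: self.next.clone(),
        }
    }
}

// Deliberately has no `Clone` impl.
struct NotClone(u32);

// Clones a list of `NotClone` and reports the reference count on the shared
// data together with the value seen through the copy.
fn clone_shares_data() -> (usize, u32) {
    let list = List { data: Rc::new(NotClone(7)), next: None };
    let copy = list.clone();
    (Rc::strong_count(&list.data), copy.data.0)
}
```

<p>The count of two shows that both lists point at the same <code>NotClone</code> value rather than at separate copies.</p>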
<p>With <em>perfect derive</em>, we would change the derive to generate an impl with one where clause per field, instead. The idea is that what we <em>really</em> need to know is that every field is cloneable (which may in turn require that <code>T</code> be cloneable):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nb">Clone</span><span class="p">,</span><span class="w"> </span><span class="c1">// type of the `data` field
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;&gt;</span>: <span class="nb">Clone</span><span class="p">,</span><span class="w"> </span><span class="c1">// type of the `next` field
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* as before */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="making-perfect-derive-sound-was-tricky-but-we-can-do-it-now">Making perfect derive sound was tricky, but we can do it now</h2>
<p>This idea is quite old, but there were a few problems that have blocked us from doing it. First, it requires changing all trait matching to permit cycles (currently, cycles are only permitted for auto traits like <code>Send</code>). This is because checking whether <code>List&lt;T&gt;</code> is <code>Clone</code> would now require checking whether <code>Option&lt;Rc&lt;List&lt;T&gt;&gt;&gt;</code> is <code>Clone</code>. If you work that through, you’ll find that a cycle arises. I’m not going to talk much about this in this post, but it is not a trivial thing to do: if we are not careful, it would make Rust quite unsound indeed. For now, though, let’s just assume we can do it soundly.</p>
<h2 id="the-semver-hazard-with-perfect-derive">The semver hazard with perfect derive</h2>
<p>The other problem is that it introduces a new semver hazard: just as Rust currently commits you to being <code>Send</code> so long as you don’t have any non-<code>Send</code> types, <code>derive</code> would now commit <code>List&lt;T&gt;</code> to being cloneable even when <code>T: Clone</code> does not hold.</p>
<p>For example, perhaps we decide that storing a <code>Rc&lt;T&gt;</code> for each list wasn’t really necessary. Therefore, we might refactor <code>List&lt;T&gt;</code> to store <code>T</code> directly, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[derive(Clone)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="nc">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We might expect that, since we are only changing the type of a private field, this change could not cause any clients of the library to stop compiling. <strong>With perfect derive, we would be wrong.</strong><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> This change means that we now own a <code>T</code> directly, and so <code>List&lt;T&gt;: Clone</code> is only true if <code>T: Clone</code>.</p>
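<p>A short sketch of the refactored struct shows the requirement. With <code>data: T</code> stored directly, cloning the list has to clone a <code>T</code>, so only element types that are themselves <code>Clone</code> will do. The function <code>clone_owned</code> is a made-up helper.</p>

```rust
use std::rc::Rc;

#[derive(Clone)]
struct List<T> {
    data: T,
    next: Option<Rc<List<T>>>,
}

// Works because `String: Clone`; the clone gets its own copy of the string.
fn clone_owned() -> (String, bool) {
    let list = List { data: String::from("head"), next: None };
    let copy = list.clone();
    (copy.data, copy.next.is_none())
}

// With an element type that has no `Clone` impl, `list.clone()` would not
// compile, so a client that cloned such a list before the refactor would break.
```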
<h2 id="expanded-implied-bounds">Expanded implied bounds</h2>
<p>An <em>implied bound</em> is a where clause that you don’t have to write explicitly. For example, if you have a struct that declares <code>T: Ord</code>, like this one…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Ord</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Ord</span><span class="o">&gt;</span><span class="w"> </span><span class="n">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">insert</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>…it would be nice if functions that worked with a red-black tree didn’t have to redeclare those same bounds:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">insert_smaller</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">red_black_tree</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">item1</span>: <span class="nc">T</span><span class="p">,</span><span class="w"> </span><span class="n">item2</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Today, this function would require `where T: Ord`:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">item1</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">item2</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">red_black_tree</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">item1</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">red_black_tree</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">item2</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">   
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I am saying <em>expanded</em> implied bounds because Rust already has two notions of implied bounds: expanding supertraits (<code>T: Ord</code> implies <code>T: PartialOrd</code>, for example, which is why the fn above can contain <code>item1 &lt; item2</code>) and outlives relations (an argument of type <code>&amp;'a T</code>, for example, implies that <code>T: 'a</code>). The maximal version of this proposal would expand those implied bounds from supertraits and lifetimes to <strong>any where-clause at all</strong>.</p>
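<p>Both existing kinds of implied bound can be seen in a few lines of stable Rust. The function names <code>smaller</code> and <code>first</code> are invented for this sketch.</p>

```rust
// Supertrait implied bound: `T: Ord` implies `T: PartialOrd`,
// so `<` is available without writing `T: PartialOrd`.
fn smaller<T: Ord>(a: T, b: T) -> T {
    if a < b { a } else { b }
}

// Outlives implied bound: the argument type `&'a [T]` implies `T: 'a`,
// so no explicit `where T: 'a` is needed.
fn first<'a, T>(items: &'a [T]) -> Option<&'a T> {
    items.first()
}
```

<p>Neither function spells out the bound it relies on; the compiler derives it from what is already written in the signature.</p>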
<h2 id="implied-bounds-and-semver">Implied bounds and semver</h2>
<p>Expanding the set of implied bounds will also introduce a new semver hazard, or perhaps it would be better to say that it expands an existing semver hazard. It’s already the case that removing a supertrait from a trait is a breaking change: if the stdlib were to change <code>trait Ord</code> so that it no longer extended <code>Eq</code>, then Rust programs that just wrote <code>T: Ord</code> would no longer be able to assume that <code>T: Eq</code>, for example.</p>
<p>Similarly, at least with a maximal version of expanded implied bounds, removing the <code>T: Ord</code> from <code>RedBlackTree&lt;T&gt;</code> would potentially stop client code from compiling. Making changes like that is not that uncommon. For example, we might want to introduce new methods on <code>RedBlackTree</code> that work even without ordering. To do that, we would remove the <code>T: Ord</code> bound from the struct and just keep it on the impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">len</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w"> </span><span class="cm">/* doesn’t need to compare `T` values, so no bound */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Ord</span><span class="o">&gt;</span><span class="w"> </span><span class="n">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">insert</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But, if we had a maximal expansion of implied bounds, this could cause crates that depend on your library to stop compiling, because they would no longer be able to assume that <code>RedBlackTree&lt;X&gt;</code> being valid implies <code>X: Ord</code>. As a general rule, I think we want it to be clear what parts of your interface you are committing to and which you are not.</p>
<h2 id="psa-removing-bounds-not-always-semver-compliant">PSA: Removing bounds not always semver compliant</h2>
<p>Interestingly, while it is true that you can remove bounds from a struct (today, at least) and remain semver compliant<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, this is not the case for impls. For example, if I have</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and I change it to <code>impl&lt;T&gt; MyTrait for Vec&lt;T&gt;</code>, this is effectively introducing a new blanket impl, and that is not a semver compliant change (see <a href="https://rust-lang.github.io/rfcs/2451-re-rebalancing-coherence.html">RFC 2451</a> for more details).</p>
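<p>A single-crate sketch of why the bound matters for coherence. This is only an analogue of the cross-crate situation, and <code>describe</code> and <code>NotCopy</code> are hypothetical names. The second impl is accepted today because <code>NotCopy</code> is known not to be <code>Copy</code>; dropping the <code>T: Copy</code> bound from the first impl would make the two impls overlap:</p>

```rust
pub trait MyTrait {
    fn describe(&self) -> &'static str;
}

// Applies only to vectors whose element type is `Copy`.
impl<T: Copy> MyTrait for Vec<T> {
    fn describe(&self) -> &'static str {
        "vector of Copy elements"
    }
}

// A local type that deliberately does not implement `Copy`.
pub struct NotCopy;

// Allowed only because the impl above excludes non-`Copy` element types.
// If that impl became `impl<T> MyTrait for Vec<T>`, this one would be
// rejected as a conflicting implementation.
impl MyTrait for Vec<NotCopy> {
    fn describe(&self) -> &'static str {
        "vector of NotCopy"
    }
}
```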
<h2 id="summarize">Summary</h2>
<p>So, to summarize:</p>
<ul>
<li>Perfect derive is great, but it reveals details about your fields: sure, you can clone your <code>List&lt;T&gt;</code> for any type <code>T</code> now, but maybe you want the right to require <code>T: Clone</code> in the future?</li>
<li>Expanded implied bounds are great, but they prevent you from “relaxing” your requirements in the future: sure, you only ever have a <code>RedBlackTree&lt;T&gt;</code> for <code>T: Ord</code> now, but maybe you want to support more types in the future?</li>
<li>But also: the rules around semver compliance are rather subtle and quick to anger.</li>
</ul>
<h2 id="how-can-we-fix-these-features">How can we fix these features?</h2>
<p>I see a few options. The most obvious of course is to just accept the semver hazards. It’s not clear to me whether they will be a problem in practice, and Rust already has a number of similar hazards (e.g., adding a <code>Box&lt;dyn Write&gt;</code> makes your type no longer <code>Send</code>).</p>
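<p>To make the existing <code>Send</code> hazard concrete, here is a small sketch. The <code>Logger</code> types and the <code>is_send</code> helper are hypothetical; the point is only that adding a private field can silently remove an auto trait:</p>

```rust
use std::io::Write;

// Compiles only when `T: Send`; used purely as a compile-time check.
pub fn is_send<T: Send>() -> bool {
    true
}

// Hypothetical version 1 of a type: every field is `Send`.
#[allow(dead_code)]
pub struct LoggerV1 {
    buffer: Vec<u8>,
}

// Hypothetical version 2: a private field was added. `dyn Write` carries no
// `Send` bound, so `is_send::<LoggerV2>()` would be rejected by the compiler.
#[allow(dead_code)]
pub struct LoggerV2 {
    buffer: Vec<u8>,
    sink: Box<dyn Write>,
}

// Spelling out `+ Send` on the trait object keeps the auto trait.
#[allow(dead_code)]
pub struct LoggerV3 {
    buffer: Vec<u8>,
    sink: Box<dyn Write + Send>,
}
```

<p>Calling <code>is_send::&lt;LoggerV1&gt;()</code> and <code>is_send::&lt;LoggerV3&gt;()</code> compiles, while the same call for <code>LoggerV2</code> does not, even though no public signature changed.</p>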
<h2 id="another-extreme-alternative-crate-local-implied-bounds">Another extreme alternative: crate-local implied bounds</h2>
<p>Another option for implied bounds would be to expand implied bounds, but only on a <em>crate-local</em> basis. Imagine that the <code>RedBlackTree</code> type is declared in some crate <code>rbtree</code>, like so…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// The crate rbtree
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Ord</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">insert</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This impl, because it lives in the same crate as <code>RedBlackTree</code>, would be able to benefit from expanded implied bounds. Therefore, code inside the impl could assume that <code>T: Ord</code>. That’s nice. If I later remove the <code>T: Ord</code> bound from <code>RedBlackTree</code>, I can move it to the impl, and that’s fine.</p>
<p>But if I’m in some downstream crate, then I don’t benefit from implied bounds. If I were going to, say, implement some trait for <code>RedBlackTree</code>, I’d have to repeat <code>T: Ord</code>…</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">rbtrait</span>::<span class="n">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nb">Ord</span><span class="p">,</span><span class="w"> </span><span class="c1">// required
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="a-middle-ground-declaring-how-public-your-bounds-are">A middle ground: declaring “how public” your bounds are</h2>
<p>Another variation would be to add a <em>visibility</em> to your bounds. The default would be that where clauses on structs are “private”, i.e., implied only within your module. But you could declare where clauses as “public”, in which case you would be committing to them as part of your semver guarantee:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">RedBlackTree</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">pub</span><span class="w"> </span><span class="nb">Ord</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In principle, we could also support <code>pub(crate)</code> and other visibility modifiers.</p>
<h2 id="explicit-perfect-derive">Explicit perfect derive</h2>
<p>I’ve been focused on implied bounds, but the same questions apply to perfect derive. In that case, I think the question is somewhat simpler: we likely want some way to extend the derive syntax to “opt in” to the perfect version (or “opt out” from it).</p>
<p>There have been some proposals that would allow you to be explicit about which parameters require which bounds. I’ve been a fan of those, but now that I’ve realized we can do perfect derive, I’m less sure. Maybe we just want some way to say “add the bounds all the time” (the default today) or “use perfect derive” (the new option), and that’s good enough. We could even add a new attribute, e.g. <code>#[perfect_derive(…)]</code> or <code>#[semver_derive]</code>. Not sure.</p>
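<p>For comparison, here is roughly what the “perfect” version amounts to if written by hand today. This is a sketch: the field layout of <code>List&lt;T&gt;</code> and the <code>Opaque</code> type are assumptions made for illustration. Today’s <code>#[derive(Clone)]</code> would add a <code>T: Clone</code> bound, whereas the impl below only needs the field types to be <code>Clone</code>, and <code>Rc&lt;T&gt;</code> is <code>Clone</code> for every <code>T</code>:</p>

```rust
use std::rc::Rc;

// Assumed layout: a shared, immutable singly linked list.
pub struct List<T> {
    pub data: Rc<T>,
    pub next: Option<Rc<List<T>>>,
}

// Hand-written equivalent of a "perfect" derive: no `T: Clone` bound,
// because cloning only bumps the reference counts of the `Rc` fields.
impl<T> Clone for List<T> {
    fn clone(&self) -> Self {
        List {
            data: Rc::clone(&self.data),
            next: self.next.clone(),
        }
    }
}

// A hypothetical element type that is deliberately not `Clone`.
pub struct Opaque(pub u32);
```

<p>A <code>List&lt;Opaque&gt;</code> can be cloned with this impl. That is exactly the detail about the fields that the impl exposes, and that the author could no longer take back without a breaking change.</p>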
<h2 id="conclusion">Conclusion</h2>
<p>In the past, we were blocked for technical reasons from expanding implied bounds and supporting perfect derive, but I believe we have resolved those issues. So now we have to think a bit about semver and decide how explicit we want to be.</p>
<p>Side note: no matter what we pick, I think it would be great to have easy tooling to help authors determine if something is a semver breaking change. This is a bit tricky because it requires reasoning about two versions of your code. I know there is <a href="https://github.com/rust-lang/rust-semverver">rust-semverver</a>, but I’m not sure how well maintained it is. It’d be great to have a simple GitHub action one could deploy that would warn you when reviewing PRs.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Rules regarding semver are documented <a href="https://doc.rust-lang.org/cargo/reference/semver.html">here</a>, by the way.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Actually, you were wrong before: changing the types of private fields in Rust can already be a breaking change, as we discussed earlier (e.g., by introducing an <code>Rc</code>, which makes the type no longer implement <code>Send</code>).&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Uh, no promises — there may be some edge cases, particularly involving regions, where this is not true today. I should experiment.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">dyn*: can we make dyn sized?</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/03/29/dyn-can-we-make-dyn-sized/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/03/29/dyn-can-we-make-dyn-sized/</id><published>2022-03-29T00:00:00+00:00</published><updated>2022-03-29T05:32:00-04:00</updated><content type="html"><![CDATA[<p>Last Friday, tmandry, cramertj, and I had an exciting conversation. We were talking about the design for combining async functions in traits with <code>dyn Trait</code> that tmandry and I had presented to the lang team on Friday. cramertj had an insightful twist to offer on that design, and I want to talk about it here. Keep in mind that this is a piece of &ldquo;hot off the presses&rdquo;, in-progress design and hence may easily go nowhere &ndash; but at the same time, I&rsquo;m pretty excited about it. If it works out, it could go a long way towards making <code>dyn Trait</code> user-friendly and accessible in Rust, which I think would be a big deal.</p>
<h3 id="background-the-core-problem-with-dyn">Background: The core problem with dyn</h3>
<p><code>dyn Trait</code> is one of Rust’s most frustrating features. On the one hand, <code>dyn Trait</code> values are absolutely necessary. You need to be able to build up collections of heterogeneous types that all implement some common interface in order to implement core parts of the system. But working with heterogeneous types is just fundamentally <em>hard</em> because you don’t know how big they are. This implies that you have to manipulate them by pointer, and <em>that</em> brings up questions of how to manage the memory that these pointers point at. This is where the problems begin.</p>
<h3 id="problem-no-memory-allocator-in-core">Problem: no memory allocator in core</h3>
<p>One challenge has to do with how we factor our allocation. The core crate that is required for <em>all</em> Rust programs, <code>libcore</code>, doesn’t have a concept of a memory allocator. It relies purely on stack allocation. For the most part, this works fine: you can pass ownership of objects around by copying them from one stack frame to another. But it doesn’t work if you don’t know how much stack space they occupy!<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<h3 id="problem-dyn-traits-cant-really-be-substituted-for-impl-trait">Problem: Dyn traits can’t really be substituted for impl Trait</h3>
<p>In Rust today, the type <code>dyn Trait</code> is guaranteed to implement the trait <code>Trait</code>, so long as <code>Trait</code> is dyn safe. That seems pretty cool, but in practice it’s not all that useful. Consider a simple function that operates on any kind of <code>Debug</code> type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_me</span><span class="p">(</span><span class="n">x</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">Debug</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="err">“</span><span class="p">{</span><span class="n">x</span>:<span class="o">?</span><span class="p">}</span><span class="err">”</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Even though the <code>Debug</code> trait is dyn-safe, you can’t just change the <code>impl</code> above into a <code>dyn</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_me</span><span class="p">(</span><span class="n">x</span>: <span class="nc">dyn</span><span class="w"> </span><span class="n">Debug</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem here is that stack-allocated parameters need to have a known size, and we don’t know how big <code>dyn</code> is. The common solution is to introduce some kind of pointer, e.g. a reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_me</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">Debug</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>That works ok for this function, but it has a few downsides. First, we have to change existing callers of <code>print_me</code> — maybe we had <code>print_me(22)</code> before, but now they have to write <code>print_me(&amp;22)</code>. That’s an ergonomic hit. Second, we’ve now hardcoded that we are <em>borrowing</em> the <code>dyn Debug</code>. There are other functions where this isn’t necessarily what we wanted to do. Maybe we wanted to store that <code>dyn Debug</code> into a datastructure and return it — for example, this function <code>print_me_later</code> returns a closure that will print <code>x</code> when called:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_me_later</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">Debug</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">‘</span><span class="n">_</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="err">“</span><span class="p">{</span><span class="n">x</span>:<span class="o">?</span><span class="p">}</span><span class="err">”</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Imagine that we wanted to spawn a thread that will invoke <code>print_me_later</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">spawn_thread</span><span class="p">(</span><span class="n">value</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="kd">let</span><span class="w"> </span><span class="n">closure</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">print_me_later</span><span class="p">(</span><span class="o">&amp;</span><span class="n">value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="n">std</span>::<span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="n">closure</span><span class="p">());</span><span class="w"> </span><span class="c1">// &lt;— Error, ‘static bound not satisfied
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This code will not compile because <code>closure</code> references <code>value</code> on the stack. But if we had written <code>print_me_later</code> with an <code>impl Debug</code> parameter, it could take ownership of its argument and everything would work fine.</p>
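<p>A sketch of the owning variant described above. To keep it easy to check, this hypothetical <code>format_me_later</code> returns the formatted string instead of printing it, and the <code>Send + 'static</code> bounds are spelled out because <code>std::thread::spawn</code> requires them:</p>

```rust
use std::fmt::Debug;
use std::thread;

// Takes ownership of `x`, so the returned closure borrows nothing
// from the caller's stack frame.
pub fn format_me_later(
    x: impl Debug + Send + 'static,
) -> impl FnOnce() -> String + Send + 'static {
    move || format!("{x:?}")
}

pub fn spawn_thread(value: usize) -> String {
    // `value` is moved into the closure, which satisfies the `'static`
    // bound on `thread::spawn`.
    let closure = format_me_later(value);
    thread::spawn(closure).join().expect("thread panicked")
}
```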
<p>Of course, we could solve <em>this</em> by writing <code>print_me_later</code> to use <code>Box</code> but that’s hardcoding memory allocation. <strong>This is problematic if we want <code>print_me_later</code> to appear in a context, like libcore, that might not even have access to a memory allocator.</strong></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_me_later</span><span class="p">(</span><span class="n">x</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Debug</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">‘</span><span class="n">_</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="err">“</span><span class="p">{</span><span class="n">x</span>:<span class="o">?</span><span class="p">}</span><span class="err">”</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>In this specific example, the <code>Box</code> is also kind of inefficient.</strong> After all, the value <code>x</code> is just a <code>usize</code>, and a <code>Box</code> is the same size as a <code>usize</code>, so in theory we could just copy the integer around (the <code>usize</code> methods expect an <code>&amp;usize</code>, after all). This is sort of a special case, but it does come up more than you would think at the lower levels of the system, where it may be worth the trouble to try and pack things into a <code>usize</code>; there are a number of futures, for example, that don’t really require much state.</p>
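<p>The size relationship can be checked directly. In this sketch the helper names are hypothetical, and the two-word layout of <code>Box&lt;dyn Debug&gt;</code> reflects how current compilers represent it rather than a documented guarantee:</p>

```rust
use std::fmt::Debug;
use std::mem::size_of;

// A box of a sized type is a single thin pointer, the same size as `usize`.
pub fn thin_box_is_pointer_sized() -> bool {
    size_of::<Box<usize>>() == size_of::<usize>()
}

// A boxed trait object is a data pointer plus a vtable pointer.
pub fn boxed_dyn_is_two_words() -> bool {
    size_of::<Box<dyn Debug>>() == 2 * size_of::<usize>()
}
```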
<h3 id="the-idea-what-if-the-dyn-were-the-pointer">The idea: What if the dyn were the pointer?</h3>
<p>In the proposal for “async fns in traits” that tmandry and I put forward, we had introduced the idea of <code>dynx Trait</code> types. <code>dynx Trait</code> types were not an actual syntax that users would ever type; rather, they were an implementation detail. Effectively a <code>dynx Future</code> refers to a <em>pointer to a type that implements <code>Future</code></em>. They don’t hardcode that this pointer is a <code>Box</code>; instead, the vtable includes a “drop” function that knows how to release the pointer’s referent (for a <code>Box</code>, that would free the memory).</p>
<h3 id="better-idea-what-if-the-dyn-were-something-of-known-size">Better idea: What if the dyn were “something of known size”?</h3>
<p>After the lang team meeting, tmandry and I met with cramertj, who proceeded to point out to us something very insightful.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> The truth is that <code>dynx Trait</code> values don’t have to be a <em>pointer</em> to something that implements <code>Trait</code>; they just have to be something <em>pointer-sized</em>. tmandry and I actually knew <em>that</em>, but what we didn’t see was how critically important this was:</p>
<ul>
<li>First, a number of futures, in practice, consist of very little state and can be pointer-sized. For example, reading from a file descriptor only needs to store the file descriptor, which is a 32-bit integer, since the kernel stores the other state. Similarly the future for a timer or other builtin runtime primitive often just needs to store an index.</li>
<li>Second, a <code>dynx Trait</code> lets you write code that manipulates values which may be boxed <em>without directly talking about the box</em>. This is critical for code that wants to appear in libcore or be reusable across any possible context.
<ul>
<li>As an example of something that would be much easier this way, the <code>Waker</code> struct, which lives in libcore, is effectively a <em>hand-written</em> <code>dynx Waker</code> struct.</li>
</ul>
</li>
<li>Finally, and we’ll get to this in a bit, a lot of low-level systems code employs clever tricks where they know <em>something</em> about the layout of a value. For example, you might have a vector that contains values of various types, but (a) all those types have the same size and (b) they all share a common prefix. In that case, you can manipulate fields in that prefix without knowing what kind of data is contained within, and use a vtable or discriminant to do the rest.
<ul>
<li>In Rust, this pattern is painful to encode, though you can sometimes do it with a <code>Vec&lt;S&gt;</code> where <code>S</code> is some struct that contains the prefix fields and an enum. Enums work ok but if you have a more open-ended set of types, you might prefer to have trait objects.</li>
</ul>
</li>
</ul>
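<p>To illustrate the <code>Waker</code> point, here is a sketch that builds a <code>Waker</code> by hand from a pointer-sized payload and a vtable, using the standard <code>RawWaker</code> API. The payload is not a real pointer at all, just an index smuggled through the data word. The names <code>index_waker</code> and <code>LAST_WOKEN</code> are hypothetical:</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::task::{RawWaker, RawWakerVTable, Waker};

// Records the index carried by the most recently woken waker.
pub static LAST_WOKEN: AtomicUsize = AtomicUsize::new(0);

unsafe fn clone_raw(data: *const ()) -> RawWaker {
    // The payload is a plain integer, so cloning just copies the word.
    RawWaker::new(data, &VTABLE)
}

unsafe fn wake_raw(data: *const ()) {
    LAST_WOKEN.store(data as usize, Ordering::SeqCst);
}

unsafe fn wake_by_ref_raw(data: *const ()) {
    LAST_WOKEN.store(data as usize, Ordering::SeqCst);
}

unsafe fn drop_raw(_data: *const ()) {
    // Nothing to free: the data word does not own any memory.
}

static VTABLE: RawWakerVTable =
    RawWakerVTable::new(clone_raw, wake_raw, wake_by_ref_raw, drop_raw);

// Packs `index` into the data word instead of pointing at an allocation.
pub fn index_waker(index: usize) -> Waker {
    let raw = RawWaker::new(index as *const (), &VTABLE);
    // SAFETY: the vtable functions never dereference the data word and
    // only touch an atomic, so the `RawWaker` contract is upheld.
    unsafe { Waker::from_raw(raw) }
}
```

<p>The drop entry in the vtable is what lets the same type cover both cases: a boxed payload would free its allocation there, while this one does nothing.</p>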
<h3 id="a-sketch-the-dyn-star-type">A sketch: The dyn-star type</h3>
<p>To give you a sense for how cool “fixed-size dyn types” could be, I’m going to start with a very simple design sketch. Imagine that we introduced a new type <code>dyn* Trait</code>, which represents the pair of:</p>
<ul>
<li>a pointer-sized value of some type <code>T</code> that implements Trait (the <code>*</code> is meant to convey “pointer-sized”<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>)</li>
<li>a vtable for <code>T: Trait</code>; the drop method in the vtable drops the <code>T</code> value.</li>
</ul>
<p>For now, don’t get too hung up on the specific syntax. There’s plenty of time to bikeshed, and I’ll talk a bit about how we might truly phase in something like <code>dyn*</code>. For now let’s just talk about what it would be like to use it.</p>
<h3 id="creating-a-dyn">Creating a dyn*</h3>
<p>To coerce a value of type <code>T</code> into a <code>dyn* Trait</code>, two constraints must be met:</p>
<ul>
<li>The type <code>T</code> must be pointer-sized or smaller.</li>
<li>The type <code>T</code> must implement <code>Trait</code></li>
</ul>
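<p>To get a feel for the representation, here is a hand-rolled approximation of <code>dyn* Debug</code> in today’s Rust. It is only a sketch under several assumptions: the names <code>DynStarDebug</code> and <code>VTable</code> are hypothetical, the size constraint is checked at runtime rather than by the compiler, and values must be <code>'static</code> to keep the example short:</p>

```rust
use std::fmt::{self, Debug};
use std::marker::PhantomData;
use std::mem::{self, MaybeUninit};
use std::ptr;

// One table of type-specific functions per concrete `T`.
struct VTable {
    debug: unsafe fn(*const usize, &mut fmt::Formatter<'_>) -> fmt::Result,
    drop: unsafe fn(*mut usize),
}

unsafe fn debug_impl<T: Debug>(data: *const usize, f: &mut fmt::Formatter<'_>) -> fmt::Result {
    // SAFETY: `data` holds a valid `T`, written by `DynStarDebug::new`.
    unsafe { Debug::fmt(&*data.cast::<T>(), f) }
}

unsafe fn drop_impl<T>(data: *mut usize) {
    // SAFETY: `data` holds a valid `T` that has not been dropped yet.
    unsafe { ptr::drop_in_place(data.cast::<T>()) }
}

// A pointer-sized value of some hidden type plus its vtable.
pub struct DynStarDebug {
    data: MaybeUninit<usize>,
    vtable: &'static VTable,
    // The hidden type may not be `Send`/`Sync`, so opt out of both.
    _not_send: PhantomData<*const ()>,
}

impl DynStarDebug {
    pub fn new<T: Debug + 'static>(value: T) -> Self {
        // Constraint 1: `T` must be pointer-sized or smaller.
        assert!(mem::size_of::<T>() <= mem::size_of::<usize>());
        assert!(mem::align_of::<T>() <= mem::align_of::<usize>());
        let mut data = MaybeUninit::<usize>::uninit();
        // SAFETY: the storage is large and aligned enough for `T`.
        unsafe { data.as_mut_ptr().cast::<T>().write(value) };
        // Constraint 2: `T: Debug`, which lets us build its vtable.
        let vtable = &VTable { debug: debug_impl::<T>, drop: drop_impl::<T> };
        DynStarDebug { data, vtable, _not_send: PhantomData }
    }
}

impl Debug for DynStarDebug {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // SAFETY: the vtable was created for the type stored in `data`.
        unsafe { (self.vtable.debug)(self.data.as_ptr(), f) }
    }
}

impl Drop for DynStarDebug {
    fn drop(&mut self) {
        // SAFETY: called exactly once, while `data` still holds the value.
        unsafe { (self.vtable.drop)(self.data.as_mut_ptr()) }
    }
}
```

<p>With this, <code>DynStarDebug::new(22usize)</code> stores the integer inline, while <code>DynStarDebug::new(Box::new(String::from("hi")))</code> stores the box pointer and frees the allocation through the vtable on drop.</p>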
<h3 id="converting-an-impl-to-a-dyn">Converting an <code>impl</code> to a <code>dyn*</code></h3>
<p>Using <code>dyn*</code>, we can convert <code>impl Trait</code> directly to <code>dyn* Trait</code>. This works fine, because <code>dyn* Trait</code> is <code>Sized</code>. To be truly equivalent to <code>impl Trait</code>, you do actually want a lifetime bound, so that the <code>dyn*</code> can represent references too:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// fn print_me(x: impl Debug) {…} becomes
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_me</span><span class="p">(</span><span class="n">x</span>: <span class="nc">dyn</span><span class="o">*</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">‘</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="err">“</span><span class="p">{</span><span class="n">x</span>:<span class="o">?</span><span class="p">}</span><span class="err">”</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_me_later</span><span class="p">(</span><span class="n">x</span>: <span class="nc">dyn</span><span class="o">*</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">‘</span><span class="n">_</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="err">‘</span><span class="n">_</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="fm">println!</span><span class="p">(</span><span class="err">“</span><span class="p">{</span><span class="n">x</span>:<span class="o">?</span><span class="p">}</span><span class="err">”</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>These two functions can be directly invoked on a <code>usize</code> (e.g., <code>print_me_later(22)</code> compiles). What’s more, they work on references (e.g., <code>print_me_later(&amp;some_type)</code>) or boxed values (e.g., <code>print_me_later(Box::new(some_type))</code>).</p>
<p>They are also suitable for inclusion in a no-std project, as they don’t directly reference an allocator. Instead, when the <code>dyn*</code> is dropped, we will invoke its destructor from the vtable, which might wind up deallocating memory (but doesn’t have to).</p>
<h3 id="more-things-are-dyn-safe-than-dyn-safe">More things are dyn* safe than dyn safe</h3>
<p>Many things that were hard for <code>dyn Trait</code> values are trivial for <code>dyn* Trait</code> values:</p>
<ul>
<li>By-value <code>self</code> methods work fine: a <code>dyn* Trait</code> value is sized, so you can move ownership of it just by copying its bytes.</li>
<li>Returning <code>Self</code>, as in the <code>Clone</code> trait, works fine.
<ul>
<li>Similarly, the fact that <code>trait Clone: Sized</code> doesn’t mean that <code>dyn* Clone</code> can’t implement <code>Clone</code>, although it does imply that <code>dyn Clone: Clone</code> cannot hold.</li>
</ul>
</li>
<li>Function arguments of type <code>impl ArgTrait</code> can be converted to <code>dyn* ArgTrait</code>, so long as <code>ArgTrait</code> is dyn*-safe.</li>
<li>Methods that return <code>impl ArgTrait</code> can return a <code>dyn* ArgTrait</code>.</li>
</ul>
<p>In short, a large number of the barriers that make traits “not dyn-safe” don’t apply to <code>dyn*</code>. Not all, of course. Traits that take parameters of type <code>Self</code> won’t work (we don’t know that two <code>dyn* Trait</code> types have the same underlying type) and we also can’t support generic methods in many cases (we wouldn’t know how to monomorphize)<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>.</p>
<h3 id="a-catch-dyn-foo-requires-boximpl-foo-foo-and-friends">A catch: <code>dyn* Foo</code> requires <code>Box&lt;impl Foo&gt;: Foo</code> and friends</h3>
<p>There is one catch in this whole setup, but I like to think of it as an opportunity. In order to create a <code>dyn* Trait</code> from a pointer type like <code>Box&lt;Widget&gt;</code>, you need to know that <code>Box&lt;Widget&gt;: Trait</code>, whereas creating a <code>Box&lt;dyn Trait&gt;</code> just requires knowing that <code>Widget: Trait</code> (this follows directly from the fact that the <code>Box</code> is now part of the hidden type).</p>
<p>At the moment, annoyingly, when you define a trait you don’t automatically get any sort of impls for “pointers to types that implement the trait”. Instead, people often define such impls manually — for example, the <code>Iterator</code> trait has impls like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">I</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Iterator</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Iterator</span><span class="w">
</span></span></span></code></pre></div><p>Many people forget to define such impls, however, which can be annoying in practice (and not just when using dyn).</p>
<p>I’m not totally sure of the best way to fix this, but I view it as an opportunity because if we <em>can</em> supply such impls, that would make Rust more ergonomic overall.</p>
<p>One interesting thing: the impls for <code>Iterator</code> that you see above include <code>I: ?Sized</code>, which makes them applicable to <code>Box&lt;dyn Iterator&gt;</code>. But with <code>dyn* Iterator</code>, we are starting from a <code>Box&lt;impl Iterator&gt;</code> type — in other words, the <code>?Sized</code> bound is not <em>necessary</em>, because we are creating our “dyn” abstraction around the pointer, which is sized. (The <code>?Sized</code> is not harmful, either, of course, and if we auto-generate such impls, we should include it so that they apply to old-style <code>dyn</code> as well as slice types like <code>[u8]</code>.)</p>
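<p>As a small illustration of what such "bridging" impls look like when written by hand today, here is a sketch on stable Rust. The trait <code>Describe</code>, the type <code>Widget</code>, and the function <code>show</code> are hypothetical names invented for this example; only the shape of the two generic impls mirrors what the standard library does for <code>Iterator</code>.</p>

```rust
/// Hypothetical trait used only for this illustration.
pub trait Describe {
    fn describe(&self) -> String;
}

pub struct Widget {
    pub name: String,
}

impl Describe for Widget {
    fn describe(&self) -> String {
        format!("widget {}", self.name)
    }
}

// Bridging impl: a shared reference to a `Describe` type is itself `Describe`.
impl<T: ?Sized + Describe> Describe for &T {
    fn describe(&self) -> String {
        (**self).describe()
    }
}

// Bridging impl: `?Sized` lets this cover `Box<dyn Describe>` as well as
// `Box<Widget>`.
impl<T: ?Sized + Describe> Describe for Box<T> {
    fn describe(&self) -> String {
        (**self).describe()
    }
}

/// Accepts anything that implements `Describe`, by value.
pub fn show(value: impl Describe) -> String {
    value.describe()
}
```

<p>Without the two generic impls, <code>show(&amp;widget)</code> and <code>show(Box::new(widget))</code> would both be rejected, even though <code>Widget: Describe</code> holds. That is the gap an auto-generated impl would close.</p>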
<h3 id="another-catch-shared-subsets-of-traits">Another catch: “shared subsets” of traits</h3>
<p>One of the cool things about Rust’s <code>Trait</code> design is that it allows you to combine “read-only” and “modifier” methods into one trait, as in this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">WidgetContainer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">num_components</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add_component</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">c</span>: <span class="nc">WidgetComponent</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I can write a function that takes a <code>&amp;mut dyn WidgetContainer</code> and it will be able to invoke both methods. If that function takes <code>&amp;dyn WidgetContainer</code> instead, it can only invoke <code>num_components</code>.</p>
<p>If we don’t do anything else, this flexibility is going to be lost with <code>dyn*</code>. Imagine that we wish to create a <code>dyn* WidgetContainer</code> from some <code>&amp;impl WidgetContainer</code> type. To do that, we would need an impl of <code>WidgetContainer</code> for <code>&amp;T</code>, but we can’t write that code, at least not without panicking:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">W</span><span class="o">&gt;</span><span class="w"> </span><span class="n">WidgetContainer</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">W</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">W</span>: <span class="nc">WidgetContainer</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">num_components</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">W</span>::<span class="n">num_components</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add_component</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">c</span>: <span class="nc">WidgetComponent</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">W</span>::<span class="n">add_component</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">c</span><span class="p">)</span><span class="w"> </span><span class="c1">// Error!
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This problem is not specific to <code>dyn</code> — imagine I have some code that just invokes <code>num_components</code> but which can be called with a <code>&amp;W</code> or with an <code>Rc&lt;W&gt;</code> or with other such types. It’s kind of awkward for me to write a function like that now: the easiest way is to hardcode that it takes <code>&amp;W</code> and then lean on deref-coercions in the caller.</p>
<p>One idea that tmandry and I have been kicking around is the idea of having “views” on traits. The idea would be that you could write something like <code>T: &amp;WidgetContainer</code> to mean “the <code>&amp;self</code> methods of <code>WidgetContainer</code>”. If you had this feature, then you could certainly have</p>
<pre tabindex="0"><code>impl&lt;W&gt; &amp;WidgetContainer for &amp;W
where
    W: WidgetContainer
</code></pre><p>because you would only need to define <code>num_components</code> (though I would hope you don’t have to write such an impl by hand).</p>
<p>Now, instead of taking a <code>&amp;dyn WidgetContainer</code>, you would take a <code>dyn &amp;WidgetContainer</code>. Similarly, instead of taking an <code>&amp;impl WidgetContainer</code>, you would probably be better off taking an <code>impl &amp;WidgetContainer</code> (this has some other benefits too, as it happens).</p>
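<p>There is no view syntax in Rust today, but the effect can be approximated by splitting the trait by hand, which may help show what a <code>&amp;WidgetContainer</code> bound would buy you. In this sketch the names <code>WidgetContainerRead</code>, <code>Panel</code>, and <code>count</code> are hypothetical, and I have assumed that <code>num_components</code> returns a <code>usize</code> so there is something to observe.</p>

```rust
use std::rc::Rc;

pub struct WidgetComponent;

/// Hand-written stand-in for the "view": only the `&self` methods.
pub trait WidgetContainerRead {
    fn num_components(&self) -> usize;
}

/// The full trait adds the `&mut self` methods on top of the view.
pub trait WidgetContainer: WidgetContainerRead {
    fn add_component(&mut self, c: WidgetComponent);
}

#[derive(Default)]
pub struct Panel {
    components: Vec<WidgetComponent>,
}

impl WidgetContainerRead for Panel {
    fn num_components(&self) -> usize {
        self.components.len()
    }
}

impl WidgetContainer for Panel {
    fn add_component(&mut self, c: WidgetComponent) {
        self.components.push(c);
    }
}

// The read-only view can be implemented for shared pointers without any
// panicking method, because it never needs `&mut` access.
impl<W: ?Sized + WidgetContainerRead> WidgetContainerRead for &W {
    fn num_components(&self) -> usize {
        (**self).num_components()
    }
}

impl<W: ?Sized + WidgetContainerRead> WidgetContainerRead for Rc<W> {
    fn num_components(&self) -> usize {
        (**self).num_components()
    }
}

/// Works with `&Panel`, `Rc<Panel>`, or an owned `Panel` alike.
pub fn count(c: impl WidgetContainerRead) -> usize {
    c.num_components()
}
```

<p>The downside of doing this by hand is that the trait author has to anticipate the split and name the pieces. A built-in view would derive the read-only subset from the method signatures instead.</p>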
<h3 id="a-third-catch-dyn-safety-sometimes-puts-constraints-on-impls-not-just-the-trait-itself">A third catch: dyn safety sometimes puts constraints on impls, not just the trait itself</h3>
<p>Rust’s current design assumes that you have a single trait definition and we can determine from that trait definition whether or not the trait ought to be dyn safe. But sometimes there are constraints around dyn safety that actually don’t affect the <em>trait</em> but only the <em>impls</em> of the trait. That kind of situation doesn’t work well with “implicit dyn safety”: if you determine that the trait is dyn-safe, you have to impose those limitations on its impls, but maybe the trait wasn’t <em>meant</em> to be dyn-safe.</p>
<p>I think overall it would be better if traits explicitly declared their intent to be dyn-safe or not. The most obvious way to do that would be with a declaration like <code>dyn trait</code>:</p>
<pre tabindex="0"><code>dyn trait Foo { }
</code></pre><p>As a nice side benefit, a declaration like this could also auto-generate impls like <code>impl Foo for Box&lt;impl Foo + ?Sized&gt;</code> and so forth. It would also mean that dyn-safety becomes a semver guarantee.</p>
<p>My main <em>concern</em> here is that I suspect <em>most</em> traits could and should be dyn-safe. I think I’d prefer if one had to <em>opt out</em> from dyn safety instead of <em>opting in</em>. I don’t know what the syntax for that would be, of course, and we’d have to deal with backwards compatibility.</p>
<h3 id="phasing-things-in-over-an-edition">Phasing things in over an edition</h3>
<p>If we could start over again, I think I would approach <code>dyn</code> like this:</p>
<ul>
<li>The syntax <code>dyn Trait</code> means a pointer-sized value that implements <code>Trait</code>. Typically a <code>Box</code> or <code>&amp;</code> but sometimes other things.</li>
<li>The syntax <code>dyn[T] Trait</code> means “a value that is layout-compatible with T that implements <code>Trait</code>”; <code>dyn Trait</code> is thus sugar for <code>dyn[*const ()] Trait</code>, which we might write more compactly as <code>dyn* Trait</code>.</li>
<li>The syntax <code>dyn[T..] Trait</code> means “a value that starts with a prefix of <code>T</code> but has unknown size and implements <code>Trait</code>”.</li>
<li>The syntax <code>dyn[..] Trait</code> means “some unknown value of a type that implements <code>Trait</code>”.</li>
</ul>
<p>Meanwhile, we would extend the grammar of a trait bound with some new capabilities:</p>
<ul>
<li>A bound like <code>&amp;Trait&lt;P…&gt;</code> refers to “only the <code>&amp;self</code> methods from <code>Trait</code>”;</li>
<li>A bound like <code>&amp;mut Trait&lt;P…&gt;</code> refers to “only the <code>&amp;self</code> and <code>&amp;mut self</code> methods from <code>Trait</code>”;
<ul>
<li>Probably this wants to include <code>Pin&lt;&amp;mut Self&gt;</code> too? I’ve not thought about that.</li>
</ul>
</li>
<li>We probably want a way to write a bound like <code>Rc&lt;Trait&lt;P…&gt;&gt;</code> to mean <code>self: Rc&lt;Self&gt;</code> and friends, but I don’t know what that looks like yet. Those kinds of traits are quite unusual.</li>
</ul>
<p>I would expect that most people would just learn <code>dyn Trait</code>. The use cases for the <code>dyn[]</code> notation are far more specialized and would come later.</p>
<p>Interestingly, we could phase in this syntax in Rust 2024 if we wanted. The idea would be that we move existing uses of <code>dyn</code> to the explicit form in prep for the new edition:</p>
<ul>
<li><code>&amp;dyn Trait</code>, for example, would become <code>dyn* Trait + '_</code></li>
<li><code>Box&lt;dyn Trait&gt;</code> would become <code>dyn* Trait</code> (note that a <code>'static</code> bound is implied today; this might be worth reconsidering, but that’s a separate question).</li>
<li>other uses of <code>dyn Trait</code> would become <code>dyn[..] Trait</code></li>
</ul>
<p>Then, in Rust 2024, we would rewrite <code>dyn* Trait</code> to just <code>dyn Trait</code> with an “edition idiom lint”.</p>
<h3 id="conclusion">Conclusion</h3>
<p>Whew! This was a long post. Let me summarize what we covered:</p>
<ul>
<li>If <code>dyn Trait</code> encapsulated <em>some value of pointer size that implements <code>Trait</code></em> and not <em>some value of unknown size</em>:
<ul>
<li>We could expand the set of things that are dyn safe by quite a lot without needing clever hacks:
<ul>
<li>methods that take by-value self: <code>fn into_foo(self, …)</code></li>
<li>methods with parameters of impl Trait type (as long as <code>Trait</code> is dyn safe): <code>fn foo(…, impl Trait, …)</code></li>
<li>methods that return impl Trait values: <code>fn iter(&amp;self) -&gt; impl Iterator</code></li>
<li>methods that return <code>Self</code> types: <code>fn clone(&amp;self) -&gt; Self</code></li>
</ul>
</li>
</ul>
</li>
<li>That would raise some problems we have to deal with, but all of them are things that would be useful anyway:
<ul>
<li>You’d need <code>dyn &amp;Trait</code> and things to “select” sets of methods.</li>
<li>You’d need a more ergonomic way to ensure that <code>Box&lt;impl Trait&gt;: Trait</code> and so forth.</li>
</ul>
</li>
<li>We could plausibly transition to this model for Rust 2024 by introducing two syntaxes, <code>dyn*</code> (pointer-sized) and <code>dyn[..]</code> (unknown size) and then changing what <code>dyn</code> means.</li>
</ul>
<p>There are a number of details to work out, but among the most prominent are:</p>
<ul>
<li>Should we declare dyn-safe traits explicitly? (I think yes)
<ul>
<li>What “bridging” impls should we create when we do so? (e.g., to cover <code>Box&lt;impl Trait&gt;: Trait</code> etc)</li>
</ul>
</li>
<li>How exactly do <code>&amp;Trait</code> bounds work — do you get impls automatically? Do you have to write them?</li>
</ul>
<h3 id="appendix-a-going-even-more-crazy-dynt-for-arbitrary-prefixes">Appendix A: Going even more crazy: <code>dyn[T]</code> for arbitrary prefixes</h3>
<p><code>dyn*</code> is pretty useful. But we could actually generalize it. You could imagine writing <code>dyn[T]</code> to mean “a value whose layout can be read as <code>T</code>”. What we’ve called <code>dyn* Trait</code> would thus be equivalent to <code>dyn[*const ()] Trait</code>. This more general version allows us to package up larger values — for example, you could write <code>dyn[[usize; 2]] Trait</code> to mean a “two-word value”.</p>
<p>You could even imagine writing <code>dyn[T]</code> where the <code>T</code> meant that you can safely access the underlying value as a <code>T</code> instance. This would give access to common fields that the implementing type must expose or other such things. Systems programming hacks often lean on clever things like this. This would be a bit tricky to reconcile with cases where the <code>T</code> is a type like <code>usize</code> that is just indicating how many bytes of data there are, since if you are going to allow the <code>dyn[T]</code> to be treated like a <code>&amp;mut T</code> the user could go crazy overwriting values in ways that are definitely not valid. So we’d have to think hard about this to make it work, which is why I left it for an Appendix.</p>
<h3 id="appendix-b-the-other-big-problems-with-dyn">Appendix B: The &ldquo;other&rdquo; big problems with dyn</h3>
<p>I think that the designs in this post address a number of the big problems with dyn:</p>
<ul>
<li>You can&rsquo;t use it like impl</li>
<li>Lots of useful trait features are not dyn-safe</li>
<li>You have to write <code>?Sized</code> on impls to make them work</li>
</ul>
<p>But it leaves a few problems unresolved. One of the biggest to my mind is the interaction with auto traits (and lifetimes, actually). With generic parameters like <code>T: Debug</code>, I don&rsquo;t have to talk explicitly about whether <code>T</code> is <code>Send</code> or not or whether <code>T</code> contains lifetimes. I can just write a generic type like <code>struct MyWriter&lt;W&gt; where W: Write { w: W, ... }</code>. Users of <code>MyWriter</code> know what <code>W</code> is, so they can determine whether or not <code>MyWriter&lt;Foo&gt;: Send</code> based on whether <code>Foo: Send</code>, and they also can understand that <code>MyWriter&lt;&amp;'a Foo&gt;</code> includes references with the lifetime <code>'a</code>. In contrast, if we did <code>struct MyWriter { w: dyn* Write, ... }</code>, that <code>dyn* Write</code> type is <em>hiding</em> the underlying data. As Rust currently stands, it implies that <code>MyWriter</code> is <em>not</em> <code>Send</code> and that it does <em>not</em> contain references. We don&rsquo;t have a good way for <code>MyWriter</code> to declare that it is &ldquo;send if the writer you gave me is send&rdquo; <em>and</em> use <code>dyn*</code>. That&rsquo;s an interesting problem! But orthogonal, I think, to the problems addressed in this blog post.</p>
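<p>The same tension already shows up with today&rsquo;s <code>Box&lt;dyn Write&gt;</code>, so here is a small sketch on stable Rust using that as a stand-in for <code>dyn* Write</code>. The names <code>GenericWriter</code>, <code>ErasedWriter</code>, and <code>assert_send</code> are hypothetical and exist only for this example.</p>

```rust
use std::io::Write;

/// Generic version: whether this is `Send` is inferred from `W`.
pub struct GenericWriter<W: Write> {
    pub w: W,
}

/// Type-erased version: the auto trait has to be spelled out in the `dyn`
/// type, so the choice is fixed once, for every writer ever stored here.
pub struct ErasedWriter {
    pub w: Box<dyn Write + Send>,
}

/// Compile-time check: this call only type-checks when `T: Send`.
pub fn assert_send<T: Send>() {}

pub fn checks() {
    // `Vec<u8>: Send`, so the generic wrapper is `Send` with no annotation.
    assert_send::<GenericWriter<Vec<u8>>>();
    // Accepted only because `+ Send` was written in the field type.
    assert_send::<ErasedWriter>();
    // By contrast, this line would be rejected by the compiler, since a
    // plain `dyn Write` promises nothing about `Send`:
    // assert_send::<Box<dyn Write>>();
}
```

<p>Writing <code>+ Send</code> in the field type rules out non-<code>Send</code> writers entirely, while leaving it off makes the whole struct non-<code>Send</code>. Neither option expresses &ldquo;send if the writer you gave me is send&rdquo;.</p>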
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>But, you are thinking, what about alloca? The answer is that alloca isn’t really a good option. For one thing, it doesn’t work on all targets, but in particular it doesn’t work for async functions, which require a fixed size stack frame. It also doesn&rsquo;t let you return things back <em>up</em> the stack, at least not easily.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Also, cramertj apparently had this idea a long time back but we didn’t really understand it. Ah well, sometimes it goes like that — you have to reinvent something to realize how brilliant the original inventor really was.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>In truth, I also just think “dyn-star” sounds cool. I’ve always been jealous of the A* algorithm and wanted to name something in a similar way. Now’s my chance! Ha ha!&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Obviously, we would be lifting this partly to accommodate <code>impl Trait</code> arguments. I think we could lift this restriction in more cases but it’s going to take a bit more design.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Dare to ask for more #rust2024</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/02/09/dare-to-ask-for-more-rust2024/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/02/09/dare-to-ask-for-more-rust2024/</id><published>2022-02-09T00:00:00+00:00</published><updated>2022-02-09T14:52:00-05:00</updated><content type="html"><![CDATA[<p>Last year, we shipped <a href="https://blog.rust-lang.org/2021/10/21/Rust-1.56.0.html">Rust 2021</a> and I have found the changes to be a real improvement in usability. Even though the actual changes themselves were quite modest, the combination of <a href="https://doc.rust-lang.org/edition-guide/rust-2021/disjoint-capture-in-closures.html">precise capture closure</a> and <a href="https://doc.rust-lang.org/edition-guide/rust-2021/panic-macro-consistency.html">simpler formatting strings</a> (<code>println!(&quot;{x:?}&quot;)</code> instead of <code>println!(&quot;{:?}&quot;, x)</code>) is making a real difference in my &ldquo;day to day&rdquo; life.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> Just like <a href="http://blog.pnkfx.org/blog/2019/06/26/breaking-news-non-lexical-lifetimes-arrives-for-everyone/">NLL</a> and the <a href="https://doc.rust-lang.org/nightly/edition-guide/rust-2018/path-changes.html">new module system</a> from <a href="https://doc.rust-lang.org/edition-guide/rust-2018/index.html">Rust 2018</a>, I&rsquo;ve quickly adapted to these new conventions. When I go back to older code, with its clunky borrow checker workarounds and format strings, I die a little inside.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>As we enter 2022, I am finding my thoughts turning more and more to the next Rust edition. What do I want from Rust, and the Rust community, over the next few years? To me, the theme that keeps coming to mind is <strong>dare to ask for more</strong>. Rust has gotten quite a bit nicer to use over the last few years, but I am not satisfied. I believe that there is room for Rust to be 22x more productive<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> and easy to use than it is today, and I think we can do it without meaningfully sacrificing <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable.html">reliability</a>, <a href="https://rustacean-principles.netlify.app/how_rust_empowers/performant.html">performance</a>, or <a href="https://rustacean-principles.netlify.app/how_rust_empowers/versatile.html">versatility</a>.</p>
<h2 id="daring-to-ask-for-a-more-ergonomic-expressive-rust">Daring to ask for a more ergonomic, expressive Rust</h2>
<p>As Rust usage continues to grow, I have been able to talk to quite a number of Rust users with a wide variety of backgrounds and experience. One of the themes I like to ask about is their experience of learning Rust. In many ways, the story here is much better than I had anticipated. Most people are able to learn Rust and feel productive in 3-6 months. Moreover, once they get used to it, most people seem to really enjoy it, and they talk about how learning ownership rules influences the code they write in other languages too (for the better). They also talk about experiencing far fewer bugs in Rust than in other languages &ndash; this is true for C++<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>, but it&rsquo;s also true for things written in Java or other languages<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>.</p>
<p>That said, it&rsquo;s also quite clear that using Rust has a significant <a href="https://blog.thegovlab.org/post/a-new-vocabulary-for-the-21st-century-cognitive-overhead">cognitive overhead</a>. Few Rust users feel like true experts<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>. There are a few topics &ndash; &ldquo;where clauses&rdquo;, &ldquo;lifetimes&rdquo; &ndash; that people mention over and over as being confusing. The more I talk to people, the more I get the sense that the problem isn&rsquo;t any <em>one</em> thing, it&rsquo;s <em>all</em> the things. It&rsquo;s having to juggle a lot of concerns all at once, and having to get everything lined up before one can even see your code run.</p>
<p>These interviews really validate the work we did on the <a href="https://blog.rust-lang.org/2017/03/02/lang-ergonomics.html">ergonomics initiative</a> and also in Rust 2021. One person I spoke to said the following:</p>
<blockquote>
<p>Looking backwards, NLL and match ergonomics were major improvements in getting people to learn Rust. A lot of people suddenly found stuff way easier. NLL made a lot of things with regard to mutability much simpler. One remaining thing coming up is disjoint capture of fields in closures. That’s another example where people just didn’t understand, &ldquo;why is this compiler yelling at me? This should work?&rdquo;</p>
</blockquote>
<p>As happy as I am with those results, I don&rsquo;t think we&rsquo;re done. I would like to see progress in two different dimensions:</p>
<p><strong>Fundamental simplifications:</strong> These are changes like NLL or disjoint-closure-capture that just change the game in terms of what the compiler can accept. Even though these kinds of changes often make the analysis more complex, they ultimately make the language <em>feel</em> simpler: more of the programs that <em>should</em> work actually <em>do</em> work. Simplifications like this tend not to be particularly controversial, but they are difficult to design and implement. Often they require an edition because of small changes to language semantics in various edge cases.</p>
<p>One of the simplest improvements here would be landing polonius, which would fix <a href="https://github.com/rust-lang/rust/issues/47680">#47680</a>, a pattern that I see happening with some regularity. I think that there are also language extensions, like <a href="https://tmandry.gitlab.io/blog/posts/2021-12-21-context-capabilities/">scoped contexts</a>, some kind of <a href="https://smallcultfollowing.com/babysteps//blog/2021/11/05/view-types/">view types</a>, specialization, or some way to manage self-referential structs, that could fit in this category. That&rsquo;s a bit trickier. The language grows, which is not a simplification, but it can make common patterns so much simpler that it&rsquo;s a net win.</p>
<p><strong>Sanding rough edges.</strong> These are changes that just make writing Rust code <em>easier</em>. There are fewer &ldquo;i&rsquo;s to dot&rdquo; or &ldquo;t&rsquo;s to cross&rdquo;. A good example is lifetime elision. You know you are hitting a rough edge when you find yourself blindly following compiler suggestions, or randomly adding an <code>&amp;</code> or a <code>*</code> here or there to see if it will make the compiler happy.</p>
<p>While sanding rough edges can benefit everyone, the impact is largest for newcomers. Experienced folks have a bit of &ldquo;survival bias&rdquo;. They tend to know the tricks and apply them automatically. Newcomers don&rsquo;t have that benefit and can waste quite a lot of time (or just give up entirely) trying to fix some simple compilation errors.</p>
<p><a href="https://rust-lang.github.io/rfcs/2005-match-ergonomics.html">Match ergonomics</a> was a recent change in this category: while I believe it was an improvement, it also gave rise to a number of rough edges, particularly around references to copy types (see <a href="https://github.com/rust-lang/rust/issues/44619">#44619</a> for more discussion). I&rsquo;d like to see us fix those, and also fix &ldquo;rough edges&rdquo; in other areas, like <a href="https://rust-lang.github.io/rfcs/2089-implied-bounds.html">implied bounds</a>.</p>
<h2 id="daring-to-ask-for-a-more-ergonomic-expressive-async-rust">Daring to ask for a more ergonomic, expressive <em>async</em> Rust</h2>
<p>Going along with the previous bullet, I think we still have quite a bit of work to do before using Async Rust feels natural. Tyler Mandry and I recently wrote a post on the &ldquo;Inside Rust&rdquo; blog, <a href="https://blog.rust-lang.org/inside-rust/2022/02/03/async-in-2022.html">Async Rust in 2022</a>, that sketched both the way we want async Rust to feel (&ldquo;just add async&rdquo;) and the plan to get there.</p>
<p>It seems clear that highly concurrent applications are a key area where Rust shines, so it makes sense for us to continue investing heavily in this area. What&rsquo;s more, those investments benefit more than just async Rust users. Many of them are fundamental extensions to Rust, like <a href="https://blog.rust-lang.org/2021/08/03/GATs-stabilization-push.html">generic associated types</a><sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup> or <a href="https://rust-lang.github.io/impl-trait-initiative/explainer/tait.html">type alias impl trait</a><sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>, which ultimately benefit everyone.</p>
<p>Having a truly great async Rust experience, however, is going to require more than language extensions. It&rsquo;s also going to require better tooling, like <a href="https://tokio.rs/blog/2021-12-announcing-tokio-console">tokio console</a>, and more efforts at standardization, like the <a href="https://www.ncameron.org/blog/portable-and-interoperable-async-rust/">portability and interoperability effort</a> led by nrc.</p>
<h2 id="daring-to-ask-for-a-more-ergonomic-expressive-unsafe-rust">Daring to ask for a more ergonomic, expressive <em>unsafe</em> Rust</h2>
<p>Strange as it sounds, part of what makes Rust as <em>safe</em> as it is is the fact that Rust supports <em>unsafe</em> code. Unsafe code allows Rust programmers to gain access to the full range of machine capabilities, which is what allows Rust to be <a href="https://rustacean-principles.netlify.app/how_rust_empowers/versatile.html">versatile</a>. Rust programmers can then use ownership/borrowing to encapsulate those raw capabilities in a safe interface, so that clients of that library can <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable.html">rely</a> on things working correctly.</p>
<p>There are some flies in the unsafe ointment, though. The reality is that writing <em>correct</em> unsafe Rust code can be quite difficult.<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup> In fact, because we&rsquo;ve never truly defined the set of rules that unsafe code authors have to follow, you could even say it is <em>literally</em> impossible, since there is no way to know if you are doing it correctly if nobody has defined what correct <em>is</em>.</p>
<p>To be clear, we do have a lot of promising work here! <a href="https://plv.mpi-sws.org/rustbelt/stacked-borrows/">Stacked borrows</a>, for example, looks to be awfully close to a viable approach for the aliasing rules. The rules are implemented in <a href="https://github.com/rust-lang/miri">miri</a> and a lot of folks are <a href="https://pramode.in/2020/11/08/miri-detect-ub-rust/">using that</a> to check their unsafe code. Finally, the <a href="https://rust-lang.github.io/unsafe-code-guidelines/">unsafe code guidelines</a> effort made good progress on documenting layout guarantees and other aspects of unsafe code, though that work was never RFC&rsquo;d or made normative. (The issues on that repo also contain a lot of great discussion.)</p>
<p>I think it&rsquo;s time we paid good attention to the full experience of writing unsafe code. We need to be sure that people can write unsafe Rust abstractions that are correct. This means, yes, that we need to invest in defining the rules they have to follow. I think we also need to invest time in making correct unsafe Rust code more <em>ergonomic</em> to write. Unsafe Rust today often involves a lot of annotations and casts that don&rsquo;t necessarily add much to the code<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup>. There are also some core features, like method dispatch with a raw pointer, that don&rsquo;t work, as well as features (like <a href="https://github.com/rust-lang/rfcs/issues/381">unsafe fields</a>) that would help in ensuring unsafe guarantees are met.</p>
<h2 id="daring-to-ask-for-a-richer-more-interactive-experience-from-rusts-tooling">Daring to ask for a richer, more interactive experience from Rust&rsquo;s tooling</h2>
<p>Tooling has a huge impact on the experience of using Rust, both as a learner and as a power user. I maintain that the hassle-free experience of <a href="https://rustup.rs/">rustup</a> and <a href="https://doc.rust-lang.org/cargo/">cargo</a> has done as much for Rust&rsquo;s adoption as our safety guarantees &ndash; maybe more. The quality of the compiler&rsquo;s error messages comes up in virtually every single conversation I have, and I&rsquo;ve lost count of how many people cite <a href="https://github.com/rust-lang/rust-clippy">clippy</a> and <a href="https://rust-lang.github.io/rustfmt/?version=v1.4.38&amp;search=">rustfmt</a> as a key part of their onboarding process for new developers. Furthermore, after many years of ridiculously hard work, Rust&rsquo;s IDE support is starting to be <em>really, really good</em>. Major kudos to both the <a href="https://rust-analyzer.github.io/">rust-analyzer</a> and <a href="https://www.jetbrains.com/rust/">IntelliJ Rust</a> teams.</p>
<p><strong>And yet, because I&rsquo;m greedy, I want more. I want Rust to continue its tradition of &ldquo;groundbreakingly good&rdquo; tooling.</strong> I want you to be able to write <code>cargo test --debug</code> and have your test failures show up automatically in an omniscient debugger that lets you easily determine what happened<sup id="fnref:11"><a href="#fn:11" class="footnote-ref" role="doc-noteref">11</a></sup>. I want profilers that serve up an approachable analysis of where you are burning CPU or allocating memory. I want it to be trivial to &ldquo;up your game&rdquo; when it comes to reliability by applying best practices like analyzing and improving code coverage or using a fuzzer to produce inputs.</p>
<p>I&rsquo;m especially interested in tooling that changes the &ldquo;fundamental relationship&rdquo; between the Rust programmer and their programs. The difference between fixing compilation bugs in a modern Rust IDE and using <code>rustc</code> is a good illustration of this. In an IDE, you have the freedom to pick and choose which errors to fix and in which order, and the IDEs are getting good enough these days that this works quite well. Feedback is swift. This can be a big win.</p>
<p>I think we can do more like this. I would like to see people learning how the borrow checker works by &ldquo;stepping through&rdquo; code that doesn&rsquo;t pass the borrow check, seeing the kinds of memory safety errors that can occur if that code were to execute. Or perhaps &ldquo;debugging&rdquo; trait resolution failures or other complex errors in a more interactive fashion. <a href="https://github.com/rust-lang/rust-artwork/blob/master/2017-RustConf/Rust_Lucy%20Art_A.svg">The sky&rsquo;s the limit.</a></p>
<h2 id="daring-to-ask-for-richer-tooling-for-unsafe-rust">Daring to ask for richer tooling <em>for unsafe Rust</em></h2>
<p>One area where improved tooling could be particularly important is around &ldquo;unsafe&rdquo; Rust. If we really want people to write unsafe Rust code that is correct in practice &ndash; and I do! &ndash; they are going to need help. Just as with all Rust tooling, I think we need to cover the basics, but I also think we can go beyond that. We definitely need sanitizers, for example, but rather than just detecting errors, we can connect those sanitizers to debuggers and use that error as an opportunity to <em>teach people how stacked borrows works</em>. We can build better testing frameworks that make things like fuzzing and property-based testing easy. And we can offer strong support for <a href="https://github.com/rust-formal-methods/wg">formal methods</a>, so that libraries that want to invest the time can give higher levels of assurance (the standard library seems like a good candidate, for example).</p>
<h2 id="conclusion-we-got-this">Conclusion: we got this</h2>
<p>As Rust sees more success, it becomes harder and harder to make changes. There&rsquo;s more and more Rust code out there and continuity and stability can sometimes be more important than fixing something that&rsquo;s broken. And even when you do decide to make a change, everybody has opinions about how you should be doing it differently &ndash; worse yet, sometimes they&rsquo;re right.<sup id="fnref:12"><a href="#fn:12" class="footnote-ref" role="doc-noteref">12</a></sup> It can sometimes be very tempting to say, &ldquo;Rust is good enough, you don&rsquo;t want one language for everything anyway&rdquo; and leave it at that.</p>
<p>For Rust 2024, I don&rsquo;t want us to do that. I think Rust is awesome. But I think Rust could be <em>awesomer</em>. We definitely shouldn&rsquo;t go about making changes &ldquo;just because&rdquo;, we have to respect the work we&rsquo;ve done before, and we have to be realistic about the price of churn. But we should be planning and dreaming as though the current crop of Rust programmers is just the beginning &ndash; as though the vast majority of Rust programs are yet to be written (which they are).</p>
<p>My hope is that for RustConf 2024, people will be bragging to each other about the hardships they endured back in the day. &ldquo;Oh yeah,&rdquo; they&rsquo;ll say, &ldquo;I was writing async Rust back in the old days. You had to grab a random crate from crates.io for every little thing you want to do. You want to use an async fn in a trait? Get a crate. You want to write an iterator that can await? Get a crate. People would come to standup after 5 days of hacking and be like &lsquo;I finally got the code to compile!&rsquo; And we walked to work uphill in the snow! Both ways! In the summer!&rdquo;<sup id="fnref:13"><a href="#fn:13" class="footnote-ref" role="doc-noteref">13</a></sup></p>
<p>So yeah, for Rust 2024, let&rsquo;s dare to ask for more.<sup id="fnref:14"><a href="#fn:14" class="footnote-ref" role="doc-noteref">14</a></sup></p>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>One interesting change: I&rsquo;ve been writing more and more code again. This itself is making a big difference in my state of mind, too!&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Die, I tell you! DIE!&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Because it&rsquo;s 2022, get it?&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I talked to a team that developed some low-level Rust code (what would&rsquo;ve been written in C++) and they reported experiencing <strong>one</strong> crash in 3+ years, which originated in an FFI to a C library. That&rsquo;s just amazing.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Most commonly, if Rust has an edge over a language like Java, it is because of our stronger concurrency guarantees. But it&rsquo;s not only that. It&rsquo;s also that meeting the required performance bar in other languages often requires one to write code that is &ldquo;rather clever&rdquo;. Rust&rsquo;s higher performance means that one can write simpler code instead, which then has correspondingly fewer bugs.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>The survey consistently has a peak of around 7 out of 10 in terms of how people self-identify their expertise.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Shout out to Jack Huey, tirelessly driving that work forward!&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>Shout out to Oliver Scherer, tirelessly driving <em>that</em> work forward!&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>Armin wrote a recent article, <a href="https://lucumr.pocoo.org/2022/1/30/unsafe-rust/">Unsafe Rust is Too Hard</a>, that gives some real-life examples of the kinds of challenges you can encounter.&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>&hellip;besides boilerplate.&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:11">
<p><a href="https://www.youtube.com/watch?v=uTc7KCBbVFI">Watch the recording</a> of the <a href="https://pernos.co/">pernos.co</a> demo that Felix did for the Rustc Reading Club to get a sense for what is possible here!&#160;<a href="#fnref:11" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:12">
<p>It&rsquo;s so much easier when everybody else is wrong.&#160;<a href="#fnref:12" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:13">
<p>I may have gotten a little carried away there.&#160;<a href="#fnref:13" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:14">
<p>Hey, that rhymes! I&rsquo;m a poet, and I didn&rsquo;t even know it!&#160;<a href="#fnref:14" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Panics vs cancellation, part 1</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/01/27/panics-vs-cancellation-part-1/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/01/27/panics-vs-cancellation-part-1/</id><published>2022-01-27T00:00:00+00:00</published><updated>2022-01-27T15:55:00-05:00</updated><content type="html"><![CDATA[<p>One of the things people often complain about when doing Async Rust is cancellation. This has always been a bit confusing to me, because it seems to me that async cancellation should feel a lot like panics in practice, and people don&rsquo;t complain about panics very often (though they do sometimes). This post is the start of a short series comparing panics and cancellation, seeking after the answer to the question &ldquo;Why is async cancellation a pain point and what should we do about it?&rdquo; This post focuses on explaining Rust&rsquo;s <em>panic philosophy</em> and explaining why I see panics and cancellation as being quite analogous to one another.</p>
<h2 id="why-panics-are-discouraged-in-rust">Why panics are discouraged in Rust</h2>
<p>Let&rsquo;s go back to some pre-history. The Rust design has always included panics, but it <em>hasn&rsquo;t</em> always included the <a href="https://doc.rust-lang.org/std/panic/fn.catch_unwind.html"><code>catch_unwind</code></a> function. In fact, adding that function was quite controversial. Why?</p>
<p>The reason is that long experience with exceptions has shown that exceptions work really well for propagating errors out, but they don&rsquo;t work well for recovering from errors or handling them in a structured way. The problem is that exceptions make errors invisible, which means that programmers don&rsquo;t think about them.</p>
<p>The only time when exceptions work well for recovery is when that recovery is done at a very coarse-grained level. If you have a &ldquo;main loop&rdquo; of your application and you can kind of catch the exception and restart that main loop, that can be very useful. You see this insight popping up all over the place; I think Erlang did it best, with their <a href="https://medium.com/@vamsimokari/erlang-let-it-crash-philosophy-53486d2a6da">&ldquo;let it crash&rdquo; philosophy</a>.</p>
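<p>To make the &ldquo;coarse-grained recovery&rdquo; idea concrete, here is a minimal sketch of a main loop that uses <code>catch_unwind</code> to survive a panicking request. The <code>handle_request</code> and <code>run_main_loop</code> functions are hypothetical names invented for this illustration; only <code>std::panic::catch_unwind</code> is a real API.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::panic;

// Hypothetical request handler: panics on one particular input.
fn handle_request(request: u32) -&gt; u32 {
    if request == 2 {
        panic!("cannot handle request {request}");
    }
    request * 10
}

// The "main loop": each request runs inside its own `catch_unwind`, so a
// panic abandons that one request and the loop carries on with the next.
fn run_main_loop(requests: &amp;[u32]) -&gt; Vec&lt;Option&lt;u32&gt;&gt; {
    requests
        .iter()
        .map(|&amp;request| panic::catch_unwind(move || handle_request(request)).ok())
        .collect()
}

fn main() {
    // The panic message for request 2 still goes to stderr.
    let results = run_main_loop(&amp;[1, 2, 3]);
    println!("{results:?}");
}
</code></pre></div><p>Each request is handled in isolation, so a panic costs only that request. This works here because the loop keeps no state that a panic could leave half-updated, which is exactly the condition that fine-grained recovery tends to violate.</p>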
<h2 id="why-exceptions-are-bad-at-fine-grained-recovery">Why exceptions are bad at fine-grained recovery</h2>
<p>The reason that exceptions are bad at fine-grained recovery is simple. In most programs, you have some kind of invariants that you are maintaining to ensure your data is in a valid state. It&rsquo;s relatively straightforward to ensure that these invariants hold at the beginning of every operation and that they hold by the end of every operation. It&rsquo;s <strong>really, really hard</strong> to ensure that those invariants hold <strong>all the time</strong>. Very often, you have some code that wants to make some mutations, put your data in an inconsistent state, and then fix that inconsistency.</p>
<p>Unfortunately, with widespread use of exceptions, what you have is that any piece of code, at any time, might suddenly just abort. So if that function is doing mutation, it could leave the program in an inconsistent state.</p>
<p>Consider this simple pseudocode (inspired by <a href="https://tomaka.medium.com/a-look-back-at-asynchronous-rust-d54d63934a1c">tomaka&rsquo;s blog post</a>). The idea of this function is that it is going to read from some file, parse the data it reads, and then send that data over a socket:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">copy_data</span><span class="p">(</span><span class="n">from_file</span>: <span class="kp">&amp;</span><span class="nc">File</span><span class="p">,</span><span class="w"> </span><span class="n">to_socket</span>: <span class="kp">&amp;</span><span class="nc">Socket</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">buffer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_file</span><span class="p">.</span><span class="n">read</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">parsed_items</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse</span><span class="p">(</span><span class="n">buffer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">parsed_items</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">to_socket</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You might think that since this function doesn&rsquo;t do any explicit mutation, it would be fine to stop it at any point and re-execute it. But that&rsquo;s not true: there is some implicit state, which is the cursor in the <code>from_file</code>. If the <code>parse</code> function or the <code>send</code> function were to throw an exception, whatever data had just been read (and maybe parsed) would be lost. The next time the function is invoked, it&rsquo;s not going to go back and re-read that data, it&rsquo;s just going to proceed from where it left off, and some data is lost.</p>
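<p>Here is a small runnable sketch of that data loss, under some simplifying assumptions: an in-memory <code>std::io::Cursor</code> stands in for the file, the failure is modeled as an <code>Err</code> rather than an exception, and <code>parse</code> and <code>copy_chunk</code> are hypothetical helpers that fail on the first attempt only. The cursor behaves the same way no matter how the failure is signaled.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::io::{Cursor, Read};

// Hypothetical stand-in for `parse`: fails on the first attempt only.
fn parse(chunk: &amp;[u8], attempt: u32) -&gt; Result&lt;String, String&gt; {
    if attempt == 0 {
        Err(format!("failed to parse {} bytes", chunk.len()))
    } else {
        Ok(String::from_utf8_lossy(chunk).into_owned())
    }
}

// Reads the next 5-byte chunk and parses it. The read advances the
// cursor even when parsing fails afterwards.
fn copy_chunk(from_file: &amp;mut Cursor&lt;Vec&lt;u8&gt;&gt;, attempt: u32) -&gt; Result&lt;String, String&gt; {
    let mut buffer = [0u8; 5];
    from_file
        .read_exact(&amp;mut buffer)
        .map_err(|e| e.to_string())?;
    parse(&amp;buffer, attempt)
}

fn main() {
    let mut file = Cursor::new(b"helloworld".to_vec());
    let first = copy_chunk(&amp;mut file, 0); // fails after consuming "hello"
    let second = copy_chunk(&amp;mut file, 1); // the retry starts at "world"
    println!("{first:?}");
    println!("{second:?}");
}
</code></pre></div><p>The first call fails after the read has already consumed <code>hello</code>, so the retry returns <code>world</code> and the first five bytes are never seen again.</p>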
<h2 id="rusts-compromise">Rust&rsquo;s compromise</h2>
<p>The initial design of Rust included the idea that panic recovery was only possible at the thread boundary. The idea was that threads own all of their state, so if a thread panicked, you would take down the thread, and with it all of the potentially corrupted state. In this way, recovery could be done with some reasonable assurance of success. There are some limits to this idea. For one thing, threads can share state. The most obvious way for that to happen is with a <code>Mutex</code>, but &ndash; as the <code>copy_data</code> example shows &ndash; you can also have problems when you are communicating (reading from a file, sending messages over a channel, etc). We have extra mechanisms to help with those cases, such as <a href="https://doc.rust-lang.org/nomicon/poisoning.html">lock poisoning</a>, but the jury is out on how well they work.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
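<p>As a rough illustration of both halves of that story, the sketch below spawns a thread that panics while holding a <code>Mutex</code>. The panic is contained at the thread boundary (<code>join</code> reports it as an <code>Err</code>), and the shared lock comes back poisoned. The <code>panicking_update</code> helper is a hypothetical name made up for this example; everything else is from the standard library.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: runs an update on another thread that panics
// partway through, while still holding the lock.
fn panicking_update(shared: &amp;Arc&lt;Mutex&lt;Vec&lt;i32&gt;&gt;&gt;) -&gt; thread::Result&lt;()&gt; {
    let shared = Arc::clone(shared);
    thread::spawn(move || {
        let mut guard = shared.lock().unwrap();
        guard.push(4);
        panic!("worker failed halfway through its update");
    })
    .join()
}

fn main() {
    let shared = Arc::new(Mutex::new(vec![1, 2, 3]));

    // The panic takes down only the worker thread; `join` returns `Err`.
    let outcome = panicking_update(&amp;shared);
    println!("worker panicked: {}", outcome.is_err());

    // The mutex remembers that a thread panicked while holding it.
    match shared.lock() {
        Ok(data) =&gt; println!("not poisoned: {:?}", *data),
        // `into_inner` is the explicit opt-in to look at the data anyway.
        Err(poisoned) =&gt; println!("poisoned: {:?}", *poisoned.into_inner()),
    }
}
</code></pre></div><p>Poisoning does not undo the partial update (the <code>4</code> is still in the vector); it only forces the next reader to acknowledge that the data may be inconsistent.</p>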
<h2 id="why--is-good">Why <code>?</code> is good</h2>
<p>All of this discussion of course begs the question, how <em>is</em> one supposed to handle error recovery in Rust? The answer, of course, is <a href="https://doc.rust-lang.org/book/ch09-02-recoverable-errors-with-result.html">the <code>?</code> operator</a>. This operator desugars into a pattern match, but it has the effect of &ldquo;propagating&rdquo; the error to the caller of the function. If we look at the <code>copy_data</code> one more time, but imagine that any potential errors were propagated using results, it would look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">copy_data</span><span class="p">(</span><span class="n">from_file</span>: <span class="kp">&amp;</span><span class="nc">File</span><span class="p">,</span><span class="w"> </span><span class="n">to_socket</span>: <span class="kp">&amp;</span><span class="nc">Socket</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">eyre</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">buffer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_file</span><span class="p">.</span><span class="n">read</span><span class="p">()</span><span class="o">?</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">parsed_items</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse</span><span class="p">(</span><span class="n">buffer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">parsed_items</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">to_socket</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Ok</span><span class="p">(())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The nice thing about this code is that one can easily see and audit potential errors: for example, I can see that <code>send</code> may result in an error, and a sharp-eyed reviewer might see the potential data loss.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> Even better, I can do some sort of recovery in the case of error by opting not to forward the error but matching instead. (Note that the <code>send</code> methods <a href="https://doc.rust-lang.org/std/sync/mpsc/struct.Sender.html#method.send">typically pass back the message in the event of an error</a>.)</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">copy_data</span><span class="p">(</span><span class="n">from_file</span>: <span class="kp">&amp;</span><span class="nc">File</span><span class="p">,</span><span class="w"> </span><span class="n">to_socket</span>: <span class="kp">&amp;</span><span class="nc">Socket</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">eyre</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">buffer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_file</span><span class="p">.</span><span class="n">read</span><span class="p">()</span><span class="o">?</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">parsed_items</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse</span><span class="p">(</span><span class="n">buffer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">parsed_items</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">to_socket</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Ok</span><span class="p">(())</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Err</span><span class="p">(</span><span class="n">SendError</span><span class="p">(</span><span class="n">parsed_items</span><span class="p">))</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">recover_from_error</span><span class="p">(</span><span class="n">parsed_items</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Ok</span><span class="p">(())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="how-does-this-connect-to-async-cancellation">How does this connect to async cancellation?</h2>
<p>I said that, from a user&rsquo;s perspective, it seems to me that async cancellation and Rust panics should feel very similar. Let me explain.</p>
<p>It sometimes happens that you have spawned a future whose result is no longer needed. For example, you may be running a server that is doing work on behalf of a client, but that client may drop its connection, in which case you&rsquo;d like to cancel that work.</p>
<p>In Rust, our cancellation story is centered around dropping. The idea is that to cancel a future, you drop it. Whenever you drop any kind of value in Rust, the value&rsquo;s destructor runs which has the job of disposing of whatever resources that value owns. In the case of a <em>future</em>, the values that it owns are the suspended variables from the stack frame. Consider that same <code>copy_data</code> function we saw earlier, but ported to async Rust:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">copy_data</span><span class="p">(</span><span class="n">from_file</span>: <span class="kp">&amp;</span><span class="nc">File</span><span class="p">,</span><span class="w"> </span><span class="n">to_socket</span>: <span class="kp">&amp;</span><span class="nc">Socket</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">buffer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_file</span><span class="p">.</span><span class="n">read</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">parsed_items</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">parse</span><span class="p">(</span><span class="n">buffer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">parsed_items</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">to_socket</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Suppose that, at some point, we pause the program at the final line, <code>parsed_items.send(...).await</code>. In that case, the future would be storing the value of <code>buffer</code> and <code>parsed_items</code>. So when the future is dropped, those values will be dropped.</p>
<p>In effect, if you look at things from the &ldquo;inside view&rdquo; of the async fn, cancellation looks like the <code>await</code> call panicking &ndash; it unwinds the stack, running the destructors for all values. The analogy, of course, only goes so far: you can&rsquo;t, for example, &ldquo;catch&rdquo; the unwinding from a cancellation. Also, panics arise from code that the thread executed, but cancellations are injected from the outside when the async fn&rsquo;s result is no longer needed.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
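<p>To see &ldquo;cancellation is just dropping&rdquo; in action without pulling in a runtime, here is a minimal sketch that polls a future once by hand and then drops it. The names <code>NoopWaker</code>, <code>ParsedItems</code>, and <code>cancel_after_first_poll</code> are hypothetical, and <code>std::future::pending</code> stands in for a <code>send</code> that never completes.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::{pending, Future};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough to poll a future by hand.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc&lt;Self&gt;) {}
}

// Hypothetical stand-in for `parsed_items`: records when it is dropped.
struct ParsedItems(Arc&lt;AtomicBool&gt;);
impl Drop for ParsedItems {
    fn drop(&amp;mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

async fn copy_data(dropped: Arc&lt;AtomicBool&gt;) {
    let _parsed_items = ParsedItems(dropped);
    // Stand-in for `parsed_items.send(to_socket).await`: never completes.
    pending::&lt;()&gt;().await;
}

// Polls the future once, then drops it. Returns whether the destructor
// had run before the drop and after the drop.
fn cancel_after_first_poll() -&gt; (bool, bool) {
    let dropped = Arc::new(AtomicBool::new(false));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&amp;waker);

    let mut future = Box::pin(copy_data(Arc::clone(&amp;dropped)));
    // The future suspends at the `await`, holding `_parsed_items`.
    assert!(matches!(future.as_mut().poll(&amp;mut cx), Poll::Pending));
    let before = dropped.load(Ordering::SeqCst);

    drop(future); // this is the cancellation
    let after = dropped.load(Ordering::SeqCst);
    (before, after)
}

fn main() {
    let (before, after) = cancel_after_first_poll();
    println!("dropped before cancel: {before}, after cancel: {after}");
}
</code></pre></div><p>The destructor for <code>_parsed_items</code> runs only when the suspended future is dropped, and no code after the <code>await</code> ever executes, which is what makes it look like an unwind from the inside.</p>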
<h2 id="next-time">Next time</h2>
<p>In the next post I plan to start looking at examples of async cancellation in practice, trying to pinpoint how it is used and why it seems to cause more problems than panics.</p>
<h2 id="thanks">Thanks</h2>
<p>Thanks to Aaron Turon, Yoshua Wuyts, Yehuda Katz, and others with whom I&rsquo;ve deep dived on this topic over the years, and to tomaka for their <a href="https://tomaka.medium.com/a-look-back-at-asynchronous-rust-d54d63934a1c">blog post</a>.</p>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>My take is that the concept behind lock poisoning still seems good to me, but the ergonomics of how we implemented it are bad, and make people not like it. That said, I&rsquo;d like to dig more into this: I&rsquo;ve been hearing from various people that &ndash; even in their limited form &ndash; panics are one of the weaker points in Rust&rsquo;s reliability story, and I&rsquo;m not yet sure what to think.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>My experience is that these bugs are hard to spot in review, but that the <code>?</code> operator is invaluable when debugging &ndash; in that case, you are asking the question, &ldquo;how could this function possibly return early?&rdquo;, and having the <code>?</code> operator really helps you find the answer.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>This could be a crucial difference: I think, for example, it&rsquo;s the reason that Java deprecated its <a href="https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html#stop--">Thread.stop</a> method.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Dyn async traits, part 7: a design emerges?</title><link href="https://smallcultfollowing.com/babysteps/blog/2022/01/07/dyn-async-traits-part-7/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2022/01/07/dyn-async-traits-part-7/</id><published>2022-01-07T00:00:00+00:00</published><updated>2022-01-07T19:37:00-05:00</updated><content type="html"><![CDATA[<p>Hi all! Welcome to 2022! Towards the end of last year, Tyler Mandry and I were doing a lot of iteration around supporting &ldquo;dyn async trait&rdquo; &ndash; i.e., making traits that use <code>async fn</code> dyn safe &ndash; and we&rsquo;re starting to feel pretty good about our design. This is the start of several blog posts talking about where we&rsquo;re at. In this first post, I&rsquo;m going to reiterate our goals and give a high-level outline of the design. The next few posts will dive more into the details and the next steps.</p>
<h2 id="the-goal-traits-with-async-fn-that-work-just-like-normal">The goal: traits with async fn that work &ldquo;just like normal&rdquo;</h2>
<p>It&rsquo;s been a while since my last post about dyn trait, so let&rsquo;s start by reviewing the overall goal: <strong>our mission is to allow <code>async fn</code> to be used in traits just like <code>fn</code></strong>. For example, we would like to have an async version of the <code>Iterator</code> trait that looks roughly like this<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You should be able to use this <code>AsyncIterator</code> trait in all the ways you would use any other trait. Naturally, static dispatch and <code>impl Trait</code> should work:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">sum_static</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">result</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">result</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But dynamic dispatch should work too:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">sum_dyn</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//               ^^^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">result</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">result</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="another-goal-leave-dyn-cleaner-than-we-found-it">Another goal: leave dyn cleaner than we found it</h2>
<p>While we started out with the goal of improving <code>async fn</code>, we&rsquo;ve also had a general interest in making <code>dyn Trait</code> more usable overall. There are a few reasons for this. To start, <code>async fn</code> is itself just sugar for a function that returns <code>impl Trait</code>, so making <code>async fn</code> in traits work is equivalent to making <a href="https://rust-lang.github.io/impl-trait-initiative/explainer/rpit.html">RPITIT</a> (&ldquo;return position impl trait in traits&rdquo;) work. But also, the existing <code>dyn Trait</code> design contains a number of limitations that can be pretty frustrating, and so we would like a design that improves as many of those as possible. Currently, our plan lifts the following limitations, so that traits which make use of these features would still be compatible with <code>dyn</code>:</p>
<ul>
<li>Return position <code>impl Trait</code>, so long as <code>Trait</code> is dyn safe.
<ul>
<li>e.g., <code>fn get_widgets(&amp;self) -&gt; impl Iterator&lt;Item = Widget&gt;</code></li>
<li>As discussed above, this means that <code>async fn</code> works, since it desugars to a function that returns <code>impl Future</code>.</li>
</ul>
</li>
<li>Argument position <code>impl Trait</code>, so long as <code>Trait</code> is dyn safe.
<ul>
<li>e.g., <code>fn process_widgets(&amp;mut self, items: impl Iterator&lt;Item = Widget&gt;)</code>.</li>
</ul>
</li>
<li>By-value self methods.
<ul>
<li>e.g., given <code>fn process(self)</code> and <code>d: Box&lt;dyn Trait&gt;</code>, able to call <code>d.process()</code></li>
<li>eventually this would be extended to other &ldquo;box-like&rdquo; smart pointers</li>
</ul>
</li>
</ul>
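<p>For contrast, here is a rough sketch of how each of these three shapes usually has to be written by hand today to keep a trait dyn safe. The names <code>WidgetSource</code> and <code>Shelf</code>, and the stand-in <code>Widget</code> struct, are made up for illustration; this shows the manual workaround, not the design described in this post.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical names throughout. These are the manual workarounds used
// today, not the design proposed in this post.
#[derive(Debug, Clone, PartialEq)]
struct Widget(u32);

trait WidgetSource {
    // Return position: box the iterator instead of returning `impl Iterator`.
    fn get_widgets(&amp;self) -&gt; Box&lt;dyn Iterator&lt;Item = Widget&gt; + '_&gt;;

    // Argument position: take a pointer to a `dyn Iterator`.
    fn process_widgets(&amp;mut self, items: &amp;mut dyn Iterator&lt;Item = Widget&gt;);

    // By-value self: `self: Box&lt;Self&gt;` can be called on a `Box&lt;dyn WidgetSource&gt;`.
    fn process(self: Box&lt;Self&gt;) -&gt; usize;
}

struct Shelf {
    widgets: Vec&lt;Widget&gt;,
}

impl WidgetSource for Shelf {
    fn get_widgets(&amp;self) -&gt; Box&lt;dyn Iterator&lt;Item = Widget&gt; + '_&gt; {
        Box::new(self.widgets.iter().cloned())
    }

    fn process_widgets(&amp;mut self, items: &amp;mut dyn Iterator&lt;Item = Widget&gt;) {
        self.widgets.extend(items);
    }

    fn process(self: Box&lt;Self&gt;) -&gt; usize {
        self.widgets.len()
    }
}

// Exercises all three methods through dynamic dispatch.
fn demo() -&gt; (Vec&lt;Widget&gt;, usize) {
    let mut d: Box&lt;dyn WidgetSource&gt; = Box::new(Shelf { widgets: vec![Widget(1)] });
    d.process_widgets(&amp;mut vec![Widget(2), Widget(3)].into_iter());
    let seen: Vec&lt;Widget&gt; = d.get_widgets().collect();
    (seen, d.process())
}
</code></pre></div><p>Each of the three limitations listed above forces one of these rewrites today, and every caller has to follow along with the changed signatures.</p>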
<p>If you put all three of those together, it represents a pretty large expansion to what dyn safety feels like in Rust. Here is an example trait that would now be dyn safe that uses all of these things together in a natural way:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Widget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">augment</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">component</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Into</span><span class="o">&lt;</span><span class="n">WidgetComponent</span><span class="o">&gt;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">components</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">WidgetComponent</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">transmit</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">factory</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">Factory</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="final-goal-works-without-an-allocator-too-though-you-have-to-work-a-bit-harder">Final goal: works without an allocator, too, though you have to work a bit harder</h2>
<p>The most straightforward way to support <a href="https://rust-lang.github.io/impl-trait-initiative/explainer/rpit.html">RPITIT</a> is to allocate a <code>Box</code> to store the return value. Most of the time, this is just fine. But there are use-cases where it&rsquo;s not a good choice:</p>
<ul>
<li>In a kernel, where you would like to use a custom allocator.</li>
<li>In a tight loop, where the performance cost of an allocation is too high.</li>
<li>Extreme embedded cases, where you have no allocator at all.</li>
</ul>
<p>Therefore, we would like to ensure that it is possible to use a trait that uses async fns or RPITIT without requiring an allocator, though we think it&rsquo;s ok for that to require a bit more work. Here are some alternative strategies one might want to support:</p>
<ul>
<li>Pre-allocating stack space: when you create the <code>dyn Trait</code>, you reserve some space on the stack to store any futures or <code>impl Trait</code> that it might return.</li>
<li>Caching: reuse the same <code>Box</code> over and over to reduce the performance impact (a good allocator would do this for you, but not all systems ship with efficient allocators).</li>
<li>Sealed trait: you derive a wrapper enum for just the types that you need.</li>
</ul>
<p>Ultimately, though, there is no limit to the number of ways that one might manage dynamic dispatch, so the goal is not to have a &ldquo;built-in&rdquo; set of strategies but rather allow people to develop their own using procedural macros. We can then offer the most common strategies in utility crates or perhaps even in the stdlib, while also allowing people to develop their own if they have very particular needs.</p>
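<p>To make one of these strategies concrete, here is a hand-written sketch of the wrapper-enum idea: if every type that can be returned is known up front, an enum can stand in for the <code>Box</code> and no allocator is needed. The names <code>Ids</code>, <code>IdSource</code>, <code>One</code>, and <code>UpTo</code> are hypothetical, and the enum is written out manually here rather than derived by a macro.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical example of the wrapper-enum strategy, written by hand.
use std::iter::{once, Once};
use std::ops::Range;

// One variant per concrete iterator type that an impl may return.
enum Ids {
    Single(Once&lt;u32&gt;),
    Span(Range&lt;u32&gt;),
}

impl Iterator for Ids {
    type Item = u32;

    fn next(&amp;mut self) -&gt; Option&lt;u32&gt; {
        // Forward to whichever concrete iterator is stored.
        match self {
            Ids::Single(it) =&gt; it.next(),
            Ids::Span(it) =&gt; it.next(),
        }
    }
}

trait IdSource {
    // Dyn safe without allocating: the return type is a concrete enum.
    fn ids(&amp;self) -&gt; Ids;
}

struct One(u32);
struct UpTo(u32);

impl IdSource for One {
    fn ids(&amp;self) -&gt; Ids {
        Ids::Single(once(self.0))
    }
}

impl IdSource for UpTo {
    fn ids(&amp;self) -&gt; Ids {
        Ids::Span(0..self.0)
    }
}

// The caller only sees `dyn IdSource` and the enum, never a `Box`.
fn sum_ids(source: &amp;dyn IdSource) -&gt; u32 {
    source.ids().sum()
}
</code></pre></div><p>The cost of this approach is that the set of return types is closed: adding a new impl means adding a variant, which is why it only suits cases where you control all the implementations.</p>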
<h2 id="the-design-from-22222-feet">The design from 22,222 feet</h2>
<p>I&rsquo;ve drawn a little diagram to illustrate how our design works at a high-level:</p>
<p><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="431px" viewBox="-0.5 -0.5 431 731" content="&lt;mxfile host=&quot;app.diagrams.net&quot; modified=&quot;2022-01-08T00:40:35.768Z&quot; agent=&quot;5.0 (Macintosh)&quot; etag=&quot;ByO0M6C-FHR3Zr7-2PQV&quot; version=&quot;16.1.2&quot; type=&quot;google&quot;&gt;&lt;diagram id=&quot;DfosBROBM4uyRER-NLTF&quot; name=&quot;Page-1&quot;&gt;7VrZcpswFP0aP7bDYrD9GDtJO9O96bTNowwC1AhEhYjtfn0lEDsY0oESd/qSSEdCy7mr7nih7/zjKwpC7x2xIV5oin1c6NcLTVsvVf5XAKcUWOmrFHApslNILYA79AtKUJFojGwYVSYyQjBDYRW0SBBAi1UwQCk5VKc5BFd3DYELG8CdBXAT/YZs5slrGUqBv4bI9bKdVUWO+CCbLIHIAzY5lCD9ZqHvKCEsbfnHHcSCu4yX9LvbjtH8YBQGbMgHj5+Ce3h/89m4Xb35oG7CK2xvXmhmuswjwLG8sTwtO2UUUBIHNhSrKAt9e/AQg3chsMTogcucYx7zMe+pvOkgjHcEE5p8q0PVNuCK4xGj5AGWRjbmSgcmH5EHgJTBY+fV1JwwrmiQ+JDRE58iP9AlxadcCGn/UEhMXUvMK0nLyGQDpJa4+dIFkbwhuXwKr8a0vDprC1pWG6/7tbE0lHF41dbPjlhVb/AIbW6xskso84hLAoBvCnRbZbqY85aQUPL7AzJ2ku4HxIxU2ed80dN3+X3SuRedl0bWvT6WB69PspeeVRzwvAT4fUhMLXjm4kvp/AB1ITtHUIdIKcSAocfqQUYXz7JF7U3Mz7tFvOGKxlcG9lwqEua75CMNyXKXGYpmSIkFo6jfSvbAenATaX+IGUYBlPgYtqBUbUEzm7awaTEFcypL0Jse5S9YgkMCdgt8hAUNXyG1QQDGVXRjoKJrmzkVvenfdwBjSDt12CZW7Cdk9CpxKqO3ezye9tYdud5U3jY/vp7Mja/mUN4RtVTNktFef2zMqabZMVscso0eC8db89FX1E2VtdVL51jLEudWBTYIGb80CbKhPW1ZdeBO5WnAFzYU7CPxz6HEF1euxJm+nTpzM7XfYMfIYLX+6NJmoJNFl5Z89cIMdGgcUc1ZDbTtoVBT8PeE+kCs5MSBlZoPf8hyqsXiosM8KJp+iP9Q3XsUHERh+rZ20FGIdwqN1+cOSdq8L4u889dfFqo5NOOa9WmhtpUqapbyGbKYBvkkfopSyFE49xceFpba3GFBaRHC+BkvJVJo+vULUWCYIgc2BnI5ncNZXniI1YbmwKV65wyOQ+vOgZuOg53Cht+4xIRy/dwcx6pTCElurwCM3CAZMH/Goia+xdBhRa/O9j4Dsne38hAk1f6rkjz2LTLqerQ8+QQxriM4l/8XrkYRX5M4eWKmfKEAMbGPfFxFL0unwm2vnuYOY56/yaBNkkMHhEk6/4jNPl4yWtKs1RKbw/NUFMhHCi0UwcR3VCx2J3h2qjlyOiWqSeBJrNesmNsjq5pqtf4eEFFyrBTrJZQJJxGJvhW2jfjVryTsI9tO/HubZ6j6/DHSiqpzMJcDy/p5DXT8YuYsyff0xcy0SDkgRC5njZCb/ghZlIQuPaleP7NSi95dAhjDscPRHPv/MJmyWU80WvX9f7D8R4Jl3V9MGi15t/hFSDJW+lmNfvMb&lt;/diagram&gt;&lt;/mxfile&gt;" onclick="(function(svg){var 
src=window.event.target||window.event.srcElement;while (src!=null&amp;&amp;src.nodeName.toLowerCase()!='a'){src=src.parentNode;}if(src==null){if(svg.wnd!=null&amp;&amp;!svg.wnd.closed){svg.wnd.focus();}else{var r=function(evt){if(evt.data=='ready'&amp;&amp;evt.source==svg.wnd){svg.wnd.postMessage(decodeURIComponent(svg.getAttribute('content')),'*');window.removeEventListener('message',r);}};window.addEventListener('message',r);svg.wnd=window.open('https://viewer.diagrams.net/?client=1&amp;page=0&amp;edit=_blank');}}})(this);" style="cursor:pointer;max-width:100%;max-height:731px;"><defs/><g><rect x="0" y="0" width="180" height="520" fill="#e1d5e7" stroke="#9673a6" pointer-events="all"/><rect x="250" y="0" width="180" height="520" fill="#f8cecc" stroke="#b85450" pointer-events="all"/><path d="M 260 180 L 280 180 L 270 180 L 283.63 180" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 288.88 180 L 281.88 183.5 L 283.63 180 L 281.88 176.5 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><rect x="170" y="150" width="90" height="60" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><path d="M 179 150 L 179 210 M 251 150 L 251 210" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 70px; height: 1px; padding-top: 180px; margin-left: 180px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; 
white-space: normal; overflow-wrap: normal;"><i>Vtable</i></div></div></div></foreignObject><text x="215" y="184" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Vtable</text></switch></g><path d="M 90 100 L 90 143.63" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 90 148.88 L 86.5 141.88 L 90 143.63 L 93.5 141.88 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><path d="M 50 20 L 130 20 L 130 88 Q 110 66.4 90 88 Q 70 109.6 50 88 L 50 32 Z" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 48px; margin-left: 51px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Caller</div></div></div></foreignObject><text x="90" y="52" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Caller</text></switch></g><path d="M 330 210 L 330 230 L 330 200 L 330 213.63" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 330 218.88 L 326.5 211.88 L 330 213.63 L 333.5 211.88 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><rect x="290" y="150" width="80" height="60" rx="9" ry="9" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 
-0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 180px; margin-left: 291px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;"><div><i>Argument</i></div><div><i>adaptation<br /></i></div><i> from vtable<br /></i></div></div></div></foreignObject><text x="330" y="184" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Argument&hellip;</text></switch></g><path d="M 330 300 L 330 320 L 330 290 L 330 303.63" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 330 308.88 L 326.5 301.88 L 330 303.63 L 333.5 301.88 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><rect x="290" y="220" width="80" height="80" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 260px; margin-left: 291px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; 
pointer-events: all; white-space: normal; overflow-wrap: normal;"><i>Normal function found in the impl<br /></i></div></div></div></foreignObject><text x="330" y="264" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Normal functi&hellip;</text></switch></g><path d="M 290 340 L 136.37 340" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 131.12 340 L 138.12 336.5 L 136.37 340 L 138.12 343.5 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><rect x="290" y="310" width="80" height="60" rx="9" ry="9" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 340px; margin-left: 291px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;"><i>Return value adaptation to vtable<br /></i></div></div></div></foreignObject><text x="330" y="344" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Return value&hellip;</text></switch></g><path d="M 50 410 L 130 410 L 130 478 Q 110 456.4 90 478 Q 70 499.6 50 478 L 50 422 Z" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" transform="rotate(-180,90,450)" pointer-events="all"/><path d="M 90 370 L 90 403.63" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 90 408.88 L 86.5 401.88 L 90 403.63 L 
93.5 401.88 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><rect x="50" y="310" width="80" height="60" rx="9" ry="9" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 340px; margin-left: 51px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;"><i>Return type adaptation from vtable<br /></i></div></div></div></foreignObject><text x="90" y="344" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Return type a&hellip;</text></switch></g><rect x="0" y="530" width="180" height="200" fill="none" stroke="none" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe flex-start; width: 178px; height: 1px; padding-top: 630px; margin-left: 2px;"><div style="box-sizing: border-box; font-size: 0px; text-align: left;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;"><div align="left"><b>Caller 
knows:</b></div><div align="left"><ul><li>Types of impl Trait arguments.</li></ul></div><div align="left"><b>Caller does not know:</b></div><ul><li>Type of the callee.</li><li>Precise return type, if function returns impl Trait.</li></ul></div></div></div></foreignObject><text x="2" y="634" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px">Caller knows:&hellip;</text></switch></g><path d="M 130 180 L 163.63 180" fill="none" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 168.88 180 L 161.88 183.5 L 163.63 180 L 161.88 176.5 Z" fill="rgb(0, 0, 0)" stroke="rgb(0, 0, 0)" stroke-miterlimit="10" pointer-events="all"/><rect x="50" y="150" width="80" height="60" rx="9" ry="9" fill="rgb(255, 255, 255)" stroke="rgb(0, 0, 0)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 78px; height: 1px; padding-top: 180px; margin-left: 51px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;"><i>Argument adaptation to vtable<br /></i></div></div></div></foreignObject><text x="90" y="184" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px" text-anchor="middle">Argument adap&hellip;</text></switch></g><rect x="250" y="530" width="180" height="200" fill="none" stroke="none" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" 
requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe flex-start; width: 178px; height: 1px; padding-top: 630px; margin-left: 252px;"><div style="box-sizing: border-box; font-size: 0px; text-align: left;" data-drawio-colors="color: rgb(0, 0, 0); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;"><div align="left"><b>Callee does not know:</b></div><div align="left"><ul><li>Types of impl Trait arguments.</li></ul></div><div align="left"><b>Callee knows:<br /></b></div><ul><li>Type of the callee.</li><li>Precise return type, if function returns impl Trait.</li></ul></div></div></div></foreignObject><text x="252" y="634" fill="rgb(0, 0, 0)" font-family="Helvetica" font-size="12px">Callee does not know:&hellip;</text></switch></g></g><switch><g requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"/><a transform="translate(0,-5)" xlink:href="https://www.diagrams.net/doc/faq/svg-export-text-problems" target="_blank"><text text-anchor="middle" font-size="10px" x="50%" y="100%">Viewer does not support full SVG 1.1</text></a></switch></svg></p>
<p>Let&rsquo;s walk through it:</p>
<ol>
<li>To start, we have the caller, which has access to some kind of <code>dyn</code> trait, such as <code>w: &amp;mut dyn Widget</code>, and wishes to call a method, like <code>w.augment()</code>.</li>
<li>The caller looks up the function for <code>augment</code> in the vtable and calls it:
<ul>
<li>But wait, <code>augment</code> takes an <code>impl Into&lt;WidgetComponent&gt;</code>, which means that it is a generic function. Normally, we would have a separate copy of this function for every <code>Into</code> type! But we must have only a single copy for the vtable! What do we do?</li>
<li>The answer is that the vtable encodes a copy that expects &ldquo;some kind of pointer to a <code>dyn Into&lt;WidgetComponent&gt;</code>&rdquo;. This could be a <code>Box</code> but it could also be other kinds of pointers: I&rsquo;m being hand-wavy for now, I&rsquo;ll go into the details later.</li>
<li>The caller therefore has the job of creating a &ldquo;pointer to a <code>dyn Into&lt;WidgetComponent&gt;</code>&rdquo;. It can do this because it knows the type of the value being provided; in this case, it would do it by allocating some memory space on the stack.</li>
</ul>
</li>
<li>The vtable, meanwhile, includes a pointer to the right function to call. But it&rsquo;s not a direct pointer to the function from the impl: it&rsquo;s a lightweight shim that wraps that function. This shim has the job of converting <em>from</em> the vtable&rsquo;s ABI into the standard ABI used for static dispatch.</li>
<li>When the function returns, meanwhile, it is giving back some kind of future. The callee knows that type, but the caller doesn&rsquo;t. Therefore, the callee has the job of converting it to &ldquo;some kind of pointer to a <code>dyn Future</code>&rdquo; and returning that pointer to the caller.
<ul>
<li>The default is to box it, but the callee can customize this to use other strategies.</li>
</ul>
</li>
<li>The caller gets back its &ldquo;pointer to a <code>dyn Future</code>&rdquo; and is able to await that, even though it doesn&rsquo;t know exactly what sort of future it is.</li>
</ol>
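<p>The argument side of this walkthrough can be approximated by hand in today&rsquo;s Rust. The sketch below is only an emulation under stated assumptions: <code>ErasedInto</code>, <code>FromDyn</code>, <code>DynWidget</code>, and <code>Panel</code> are hypothetical names, and because <code>Into</code> itself requires <code>Sized</code>, a small helper trait stands in for <code>dyn Into&lt;WidgetComponent&gt;</code>. The caller places the argument in a stack slot, passes a pointer to it through the single dyn safe entry point, and a shim converts back to the generic method from the impl.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// A hand-rolled emulation with hypothetical names; not the actual design.
#[derive(Debug, PartialEq)]
struct WidgetComponent(String);

impl From&lt;&amp;str&gt; for WidgetComponent {
    fn from(s: &amp;str) -&gt; Self {
        WidgetComponent(s.to_string())
    }
}

// The trait as the user writes it: `augment` is generic.
trait Widget {
    fn augment(&amp;mut self, component: impl Into&lt;WidgetComponent&gt;);
}

// Stand-in for `dyn Into&lt;WidgetComponent&gt;`, which is not dyn safe today.
trait ErasedInto {
    fn take_into(&amp;mut self) -&gt; WidgetComponent;
}

// Caller side: an `Option&lt;C&gt;` on the stack acts as the argument slot.
impl&lt;C: Into&lt;WidgetComponent&gt;&gt; ErasedInto for Option&lt;C&gt; {
    fn take_into(&amp;mut self) -&gt; WidgetComponent {
        self.take().expect(&#34;argument already consumed&#34;).into()
    }
}

// Callee side: wraps the pointer so it can be passed to the generic method.
struct FromDyn&lt;'a&gt;(&amp;'a mut dyn ErasedInto);

impl From&lt;FromDyn&lt;'_&gt;&gt; for WidgetComponent {
    fn from(FromDyn(inner): FromDyn&lt;'_&gt;) -&gt; Self {
        inner.take_into()
    }
}

// The single, non-generic entry point that a vtable could hold.
trait DynWidget {
    fn augment_dyn(&amp;mut self, component: &amp;mut dyn ErasedInto);
}

impl&lt;W: Widget&gt; DynWidget for W {
    // The shim: converts from the erased form back to static dispatch.
    fn augment_dyn(&amp;mut self, component: &amp;mut dyn ErasedInto) {
        self.augment(FromDyn(component));
    }
}

struct Panel {
    parts: Vec&lt;WidgetComponent&gt;,
}

impl Widget for Panel {
    fn augment(&amp;mut self, component: impl Into&lt;WidgetComponent&gt;) {
        self.parts.push(component.into());
    }
}

fn demo_augment() -&gt; Vec&lt;WidgetComponent&gt; {
    let mut panel = Panel { parts: Vec::new() };
    {
        let w: &amp;mut dyn DynWidget = &amp;mut panel;
        let mut slot = Some(&#34;knob&#34;); // stack space reserved by the caller
        w.augment_dyn(&amp;mut slot);
    }
    panel.parts
}
</code></pre></div><p>Here <code>Panel::augment</code> is compiled only once for the <code>FromDyn</code> argument type, which mirrors the &ldquo;single copy for the vtable&rdquo; requirement from step 2.</p>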
<h2 id="upcoming-posts">Upcoming posts</h2>
<p>In upcoming blog posts, I&rsquo;m going to expand on several things that I alluded to in my walkthrough:</p>
<ul>
<li>&ldquo;Pointer to a <code>dyn Trait</code>&rdquo;:
<ul>
<li>How exactly do we encode &ldquo;some kind of pointer&rdquo; and what does that mean?</li>
<li>This is really key, because we need to be able to support more than just <code>Box</code>, including pointers that require no allocator.</li>
</ul>
</li>
<li>Adaptation for <code>impl Trait</code> arguments:
<ul>
<li>How do we adapt to/from the vtable for arguments of generic type?</li>
<li>Hint: it involves creating a <code>dyn Trait</code> for the argument</li>
</ul>
</li>
<li>Adaptation for <code>impl Trait</code> return values:
<ul>
<li>How do we adapt to/from the vtable for return values of <code>impl Trait</code> type?</li>
<li>Hint: it involves returning a <code>dyn Trait</code>, potentially boxed but not necessarily</li>
</ul>
</li>
<li>Adaptation for by-value self:
<ul>
<li>How do we adapt to/from the vtable for by-value self, and when are such functions callable?</li>
</ul>
</li>
<li>Boxing and alternatives thereto:
<ul>
<li>When you call an async fn or fn that returns <code>impl Trait</code> via dynamic dispatch, the default behavior is going to allocate a <code>Box</code>, but we&rsquo;ve seen that doesn&rsquo;t work for everyone. How convenient can we make it to select an alternative strategy like stack pre-allocation, and how can people create their own strategies?</li>
</ul>
</li>
</ul>
<p>We&rsquo;ll also be updating the <a href="https://rust-lang.github.io/async-fundamentals-initiative/">async fundamentals initiative</a> page with more detailed design docs.</p>
<h2 id="appendix-things-id-still-like-to-see">Appendix: Things I&rsquo;d still like to see</h2>
<p>I&rsquo;m pretty excited about where we&rsquo;re landing in this round of work, but it doesn&rsquo;t get <code>dyn</code> where I ultimately want it to be. My ultimate goal is that people are able to use dynamic dispatch as conveniently as they use <code>impl Trait</code>, but I&rsquo;m not entirely sure how to get there. Reaching that goal means being able to write function signatures that don&rsquo;t talk about <code>Box</code> vs <code>&amp;</code> or other details that you don&rsquo;t have to deal with when you talk about <code>impl Trait</code>. It also means not having to worry so much about <code>Send</code>/<code>Sync</code> and lifetimes.</p>
<p>Here are some of the improvements I would like to see, if we can figure out how:</p>
<ul>
<li>Support clone:
<ul>
<li>Given trait <code>Widget: Clone</code> and <code>w: Box&lt;dyn Widget&gt;</code>, able to invoke <code>w.clone()</code></li>
<li>This <em>almost</em> works, but the fact that <code>trait Clone: Sized</code> makes it difficult.</li>
</ul>
</li>
<li>Support &ldquo;partially dyn safe&rdquo; traits:
<ul>
<li>Right now, dyn safe is all or nothing. This has the nice implication that <code>dyn Foo: Foo</code> holds for every dyn safe trait <code>Foo</code>. However, it is also limiting, and many people have told me they find it confusing. Moreover, <code>dyn Foo</code> is not <code>Sized</code>, and hence while it&rsquo;s cool conceptually that <code>dyn Foo</code> implements <code>Foo</code>, you can&rsquo;t actually <em>use</em> a <code>dyn Foo</code> in the same way that you would use most other types.</li>
</ul>
</li>
<li>Improve how <code>Send</code> interacts with returned values (e.g., RPIT, async fn in traits, etc):
<ul>
<li>If you write <code>dyn Foo + Send</code>, that</li>
</ul>
</li>
<li>Avoid having to talk about pointers so much
<ul>
<li>When you use <code>impl Trait</code>, you get a really ergonomic experience today:
<ul>
<li><code>fn apply_map(map_fn: impl FnMut(u32) -&gt; u32)</code></li>
<li><code>fn items(&amp;self) -&gt; impl Iterator&lt;Item = Item&gt; + '_</code></li>
</ul>
</li>
<li>In contrast, when you use dyn trait, you wind up having to be very explicit around lots of details, and your callers have to change as well:
<ul>
<li><code>fn apply_map(map_fn: &amp;mut dyn FnMut(u32) -&gt; u32)</code></li>
<li><code>fn items(&amp;self) -&gt; Box&lt;dyn Iterator&lt;Item = Item&gt; + '_&gt;</code></li>
</ul>
</li>
</ul>
</li>
<li>Make dyn trait feel more parametric:
<ul>
<li>If I have a <code>struct Foo&lt;T: Trait&gt; { t: Box&lt;T&gt; }</code>, it has the nice property that it exposes the <code>T</code>. This means we know that <code>Foo&lt;T&gt;: Send</code> if <code>T: Send</code> (assuming <code>Foo</code> doesn&rsquo;t have any fields that are not <code>Send</code>), we know that <code>Foo&lt;T&gt;: 'static</code> if <code>T: 'static</code>, and so forth. This is very cool.</li>
<li>In contrast, <code>struct Foo { t: Box&lt;dyn Trait&gt; }</code> bakes in a lot of details &ndash; it doesn&rsquo;t permit <code>t</code> to contain any references, and it doesn&rsquo;t let <code>Foo</code> be <code>Send</code>.</li>
</ul>
</li>
<li>Make it sound:
<ul>
<li>There are a few open soundness bugs around dyn trait, such as <a href="https://github.com/rust-lang/rust/issues/57893">#57893</a>, and I would like to close them. This interacts with other things in this list.</li>
</ul>
</li>
</ul>
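<p>To make the &ldquo;support clone&rdquo; item concrete, here is a minimal sketch of the workaround people commonly reach for today. It is not the design I am hoping for, just an illustration of the boilerplate involved. The names <code>Button</code> and <code>clone_box</code> are made up for the example, and note that <code>Widget</code> here does <em>not</em> have <code>Clone</code> as a supertrait, since that is exactly the part that doesn&rsquo;t work with <code>dyn</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">trait Widget {
    fn name(&amp;self) -&gt; String;

    // Stand-in for `Clone::clone` that can be called through `dyn Widget`
    // (hypothetical helper name, chosen for this sketch).
    fn clone_box(&amp;self) -&gt; Box&lt;dyn Widget&gt;;
}

#[derive(Clone)]
struct Button {
    label: String,
}

impl Widget for Button {
    fn name(&amp;self) -&gt; String {
        format!("button:{}", self.label)
    }

    // The concrete type is `Sized` and `Clone`, so it can clone itself
    // and then erase its type again.
    fn clone_box(&amp;self) -&gt; Box&lt;dyn Widget&gt; {
        Box::new(self.clone())
    }
}

// With the helper in place, `Box&lt;dyn Widget&gt;` can implement `Clone`.
impl Clone for Box&lt;dyn Widget&gt; {
    fn clone(&amp;self) -&gt; Self {
        self.clone_box()
    }
}
</code></pre></div><p>With this in place, <code>w.clone()</code> works for <code>w: Box&lt;dyn Widget&gt;</code>, but every implementor has to write <code>clone_box</code> by hand, which is the kind of friction I would like to remove.</p>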
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>This has traditionally been called <code>Stream</code>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Rustc Reading Club, Take 2</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/11/18/rustc-reading-club-take-2/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/11/18/rustc-reading-club-take-2/</id><published>2021-11-18T00:00:00+00:00</published><updated>2021-11-18T12:49:00-05:00</updated><content type="html"><![CDATA[<p><img src="https://miro.medium.com/max/850/1*T__f3-PmPA5TDDoPW_uX3A.png" width="222" style="float:left;"/> Wow! The response to the last Rustc Reading Club was overwhelming &ndash; literally! We maxed out the number of potential zoom attendees and I couldn&rsquo;t even join the call! It&rsquo;s clear that there&rsquo;s a lot of demand here, which is great. We&rsquo;ve decided to take another stab at running the Rustc Reading Club, but we&rsquo;re going to try it a bit differently this time. We&rsquo;re going to start by selecting a smaller group to do it a few times and see how it goes, and then decide how to scale up.</p>
<div style="clear:both;"></div>
<h2 id="the-ask">The ask</h2>
<p>Here is what we want from you. If you are interested in the Rustc Reading Club, sign up using the form below!</p>
<p><a href="https://docs.google.com/forms/d/1ffwJnGsQaY5-8TCtYMFtlc2Hhnn_vh_w_UWwPrBDHUM">Rustc reading club signup form</a></p>
<h2 id="start-small">Start small&hellip;</h2>
<p>As Doc Jones announced in <a href="https://mojosd.medium.com/the-second-first-rustc-reading-club-d0d0ffedc92f">her post</a>, we&rsquo;re going to hold our second meeting on December 2, 2021 at 12PM EST (<a href="https://everytimezone.com/s/d2a61447">see in your timezone</a>). Read <a href="https://mojosd.medium.com/the-second-first-rustc-reading-club-d0d0ffedc92f">her post</a> for all the details on how that&rsquo;s going to work! To avoid a repeat of last time, <strong>this meeting will be invite only</strong> &ndash; we&rsquo;re going to &ldquo;hand select&rdquo; about 10-15 people from the folks who sign up, looking for a range of experience and interests. The reason for this is that we want to try out the idea with a smaller group and see how it goes.</p>
<h2 id="and-scale">&hellip;and scale!</h2>
<p>Presuming the club is a success, we would love to have more active clubs going on. My expectation is that we will have a number of rustc reading clubs of different kinds and flavors &ndash; for example, a recorded club, or a club that is held on Zulip instead of Zoom, or clubs in other languages.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> As we try out new ideas, we&rsquo;ll make sure to reach out to people who <a href="https://docs.google.com/forms/d/1ffwJnGsQaY5-8TCtYMFtlc2Hhnn_vh_w_UWwPrBDHUM">signed up on the google form</a>, so please do sign up if you are interested!</p>
<hr>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In fact, if you&rsquo;re <em>really</em> excited, you don&rsquo;t need to wait for us &ndash; just create a zoom room and invite your friends to read some code! Or leave a message in <a href="https://rust-lang.zulipchat.com/#narrow/stream/305296-rustc-reading-club">#rustc-reading-club on zulip</a>, I bet you&rsquo;d find some takers.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">CTCFT 2021-11-22 Agenda</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/11/15/ctcft-2021-11-22-agenda/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/11/15/ctcft-2021-11-22-agenda/</id><published>2021-11-15T00:00:00+00:00</published><updated>2021-11-15T10:19:00-05:00</updated><content type="html"><![CDATA[<p>The next <a href="https://rust-ctcft.github.io/ctcft/">&ldquo;Cross Team Collaboration Fun Times&rdquo; (CTCFT)</a> meeting will take place next Monday, on 2021-11-22 at <strong>11am US Eastern Time</strong> (<a href="https://everytimezone.com/s/91c9791f">click to see in your time zone</a>). <strong>Note that this is a new time:</strong> we are experimenting with rotating in an earlier time that occurs during the European workday. This post covers the agenda. You’ll find the full details (along with a calendar event, zoom details, etc) <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-11-22.html">on the CTCFT website</a>.</p>
<h2 id="agenda">Agenda</h2>
<p>This meeting we&rsquo;ve invited some of the people working to integrate Rust into the
Linux kernel to come and speak. We&rsquo;ve asked them to give us a feel for how the
integration works and help identify those places where the experience is rough.
The expectation is that we can use this feedback as an input when deciding what
work to pursue and what features to prioritize for stabilization.</p>
<ul>
<li>(5 min) Opening remarks 👋 (<a href="https://github.com/nikomatsakis">nikomatsakis</a>)</li>
<li>(40 min) Rust for Linux (<a href="https://github.com/ojeda">ojeda</a>, <a href="https://github.com/alex">alex</a>, <a href="https://github.com/wedsonaf">wedsonaf</a>)
<ul>
<li>The Rust for Linux project is adding Rust support to the Linux kernel. While
it is still the early days, there are some areas of the Rust language,
library, and tooling where the Rust project might be able to help out - for
instance, via stabilization of features, suggesting ways to tackle
particular problems, and more. This talk will walk through the issues found,
along with examples where applicable.</li>
</ul>
</li>
<li>(5 min) Closing (<a href="https://github.com/nikomatsakis">nikomatsakis</a>)</li>
</ul>
<h2 id="afterwards-social-hour">Afterwards: Social Hour</h2>
<p>After the CTCFT this week, we are going to try an experimental social hour. The hour will be coordinated in the #ctcft stream of the rust-lang Zulip. The idea is to create breakout rooms where people can gather to talk, hack together, or just chill.</p>
]]></content></entry><entry><title type="html">View types for Rust</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/11/05/view-types/</id><published>2021-11-05T00:00:00+00:00</published><updated>2021-11-05T11:37:00-04:00</updated><content type="html"><![CDATA[<p>I wanted to write about an idea that&rsquo;s been kicking around in the back of my mind for some time. I call it <em>view types</em>. The basic idea is to give a way for an <code>&amp;mut</code> or <code>&amp;</code> reference to identify which fields it is actually going to access. The main use case for this is having &ldquo;disjoint&rdquo; methods that don&rsquo;t interfere with one another.</p>
<h3 id="this-is-not-a-proposal-yet">This is not a proposal (yet?)</h3>
<p>To be clear, this isn&rsquo;t an RFC or a proposal, at least not yet. It&rsquo;s some early stage ideas that I wanted to document. I&rsquo;d love to hear reactions and thoughts, as I discuss in the conclusion.</p>
<h3 id="running-example">Running example</h3>
<p>As a running example, consider this struct <code>WonkaShipmentManifest</code>. It combines a vector <code>bars</code> of <code>ChocolateBars</code> and a list <code>golden_tickets</code> of indices for bars that should receive a ticket.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">bars</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">ChocolateBar</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">golden_tickets</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now suppose we want to iterate over those bars and put them into their packaging. Along the way, we&rsquo;ll insert a golden ticket. To start, we write a little function that checks whether a given bar should receive a golden ticket:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">should_insert_ticket</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">golden_tickets</span><span class="p">.</span><span class="n">contains</span><span class="p">(</span><span class="o">&amp;</span><span class="n">index</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Next, we write the loop that iterates over the chocolate bars and prepares them for shipment:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">prepare_shipment</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">WrappedChocolateBar</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">bar</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">bars</span><span class="p">.</span><span class="n">into_iter</span><span class="p">().</span><span class="n">zip</span><span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">opt_ticket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">should_insert_ticket</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nb">Some</span><span class="p">(</span><span class="n">GoldenTicket</span>::<span class="n">new</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nb">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">result</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">bar</span><span class="p">.</span><span class="n">into_wrapped</span><span class="p">(</span><span class="n">opt_ticket</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">result</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Satisfied with <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=a6622a8e4dc3a47576035b848b2cf3ef">our code</a>, we sit back and fire up the compiler and, wait&hellip; what&rsquo;s this?</p>
<pre tabindex="0"><code>error[E0382]: borrow of partially moved value: `self`
   --&gt; src/lib.rs:16:33
    |
15  |         for (bar, i) in self.bars.into_iter().zip(0..) {
    |                                   ----------- `self.bars` partially moved due to this method call
16  |             let opt_ticket = if self.should_insert_ticket(i) {
    |                                 ^^^^ value borrowed here after partial move
    |
</code></pre><p>Well, the message makes <em>sense</em>, but it&rsquo;s unnecessary! The compiler is concerned because we are borrowing <code>self</code> when we&rsquo;ve already moved out of the field <code>self.bars</code>, but we know that <code>should_insert_ticket</code> is only going to look at <code>self.golden_tickets</code>, and that value is still intact. So there&rsquo;s not a real conflict here.</p>
<p>Still, thinking on it more, you can see why the compiler is complaining. It only looks at one function at a time, so how would it know what fields <code>should_insert_ticket</code> is going to read? And, even if it were to look at the body of <code>should_insert_ticket</code>, maybe it&rsquo;s reasonable to give a warning for future-proofing. Without knowing more about our plans here at Wonka Inc., it&rsquo;s reasonable to assume that future code authors may modify <code>should_insert_ticket</code> to look at <code>self.bars</code> or any other field. This is part of the reason that Rust does its analysis on a per-function basis: checking each function independently gives room for other functions to change, so long as they don&rsquo;t change their signature, without disturbing their callers.</p>
<p>What we need, then, is a way for <code>should_insert_ticket</code> to describe to its callers which fields it may use and which ones it won&rsquo;t. Then the caller could permit invoking <code>should_insert_ticket</code> whenever the field <code>self.golden_tickets</code> is accessible, even if other fields are borrowed or have been moved.</p>
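<p>Until something like that exists, one workaround in today&rsquo;s Rust is to make the helper a free function that takes only the field it reads. Here is a minimal sketch, assuming stand-in definitions for <code>ChocolateBar</code>, <code>GoldenTicket</code>, and <code>WrappedChocolateBar</code> (they are placeholders invented for this example, as is the <code>ticket</code> field):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Placeholder types, assumed for the sake of a self-contained example.
struct ChocolateBar;
struct GoldenTicket;
struct WrappedChocolateBar {
    ticket: Option&lt;GoldenTicket&gt;,
}

impl ChocolateBar {
    fn into_wrapped(self, ticket: Option&lt;GoldenTicket&gt;) -&gt; WrappedChocolateBar {
        WrappedChocolateBar { ticket }
    }
}

struct WonkaShipmentManifest {
    bars: Vec&lt;ChocolateBar&gt;,
    golden_tickets: Vec&lt;usize&gt;,
}

// The helper takes only the data it needs, so its signature already
// tells callers that `bars` is never touched.
fn should_insert_ticket(golden_tickets: &amp;[usize], index: usize) -&gt; bool {
    golden_tickets.contains(&amp;index)
}

impl WonkaShipmentManifest {
    fn prepare_shipment(self) -&gt; Vec&lt;WrappedChocolateBar&gt; {
        let mut result = vec![];
        // `self.bars` is moved here...
        for (bar, i) in self.bars.into_iter().zip(0..) {
            // ...but `self.golden_tickets` is a different field, so
            // borrowing it directly is still allowed.
            let opt_ticket = if should_insert_ticket(&amp;self.golden_tickets, i) {
                Some(GoldenTicket)
            } else {
                None
            };
            result.push(bar.into_wrapped(opt_ticket));
        }
        result
    }
}
</code></pre></div><p>This compiles because, within a single function body, the borrow checker tracks moves and borrows per field. The cost is that the helper is no longer a method on <code>self</code>, and every caller has to know which fields to hand over, which is the bookkeeping a signature-level annotation would take off our hands.</p>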
<h3 id="an-idea">An idea</h3>
<p>When I&rsquo;ve thought about this problem in the past, I&rsquo;ve usually imagined that the list of &ldquo;fields that may be accessed&rdquo; would be attached to the <em>reference</em>. But that&rsquo;s a bit odd, because a reference type <code>&amp;mut T</code> doesn&rsquo;t itself have any fields. The fields come from <code>T</code>.</p>
<p>So recently I was thinking, what if we had a <em>view</em> type? I&rsquo;ll write it <code>{place1, ..., placeN} T</code> for now. What it means is &ldquo;an instance of <code>T</code>, but where only the paths <code>place1...placeN</code> are accessible&rdquo;. Like other types, view types can be borrowed. In our example, then, <code>&amp;{golden_tickets} WonkaShipmentManifest</code> would describe a reference to <code>WonkaShipmentManifest</code> which only gives access to the <code>golden_tickets</code> field.</p>
<h3 id="creating-a-view">Creating a view</h3>
<p>We could use some syntax like <code>{place1, ..., placeN} expr</code> to create a view<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. This would be a <em>place expression</em>, which means that it refers to a specific place in memory. This means that it can be directly borrowed without creating a temporary. So I can create a view onto <code>self</code> that only has access to <code>golden_tickets</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">example_a</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">self1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="p">{</span><span class="n">golden_tickets</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;tickets = </span><span class="si">{:#?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">self1</span><span class="p">.</span><span class="n">golden_tickets</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Notice the distinction between <code>&amp;self.golden_tickets</code> and <code>&amp;{golden_tickets} self</code>. The former borrows the field directly. The latter borrows the entire struct, but only gives access to one field. What happens if you try to access another field? An error, of course:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">example_b</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">self1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="p">{</span><span class="n">golden_tickets</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;tickets = </span><span class="si">{:#?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">self1</span><span class="p">.</span><span class="n">golden_tickets</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="n">bar</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="n">self1</span><span class="p">.</span><span class="n">bars</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//      ^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// Error: self1 does not have access to `bars`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Of course, when a view is active, you can still access other fields through the original path, without disturbing the borrow:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">example_c</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">self1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="p">{</span><span class="n">golden_tickets</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="n">bar</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">bars</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;tickets = </span><span class="si">{:#?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">self1</span><span class="p">.</span><span class="n">golden_tickets</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And, naturally, that access includes the ability to create multiple views at once, so long as they have disjoint paths:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">example_d</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">self1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="p">{</span><span class="n">golden_tickets</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">self2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">{</span><span class="n">bars</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="n">bar</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">self2</span><span class="p">.</span><span class="n">bars</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;tickets = </span><span class="si">{:#?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">self1</span><span class="p">.</span><span class="n">golden_tickets</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">bar</span><span class="p">.</span><span class="n">modify</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="view-types-in-methods">View types in methods</h3>
<p>As example C in the previous section suggested, we can use a view type in our definition of <code>should_insert_ticket</code> to specify which fields it will use:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">should_insert_ticket</span><span class="p">(</span><span class="o">&amp;</span><span class="p">{</span><span class="n">golden_tickets</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">golden_tickets</span><span class="p">.</span><span class="n">contains</span><span class="p">(</span><span class="o">&amp;</span><span class="n">index</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As a result of doing this, we can successfully compile the <code>prepare_shipment</code> function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">prepare_shipment</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">WrappedChocolateBar</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">bar</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">bars</span><span class="p">.</span><span class="n">into_iter</span><span class="p">().</span><span class="n">zip</span><span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//          ^^^^^^^^^^^^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// Moving out of `self.bars` here....
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">opt_ticket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">should_insert_ticket</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">//              ^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// ...does not conflict with borrowing a
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// view of `{golden_tickets}` from `self` here.
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nb">Some</span><span class="p">(</span><span class="n">GoldenTicket</span>::<span class="n">new</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nb">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">result</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">bar</span><span class="p">.</span><span class="n">into_wrapped</span><span class="p">(</span><span class="n">opt_ticket</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">result</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="view-types-with-access-modes">View types with access modes</h3>
<p>All my examples so far were with &ldquo;shared&rdquo; views through <code>&amp;</code> references. We could of course say that <code>&amp;mut {bars} WonkaShipmentManifest</code> gives mutable access to the field <code>bars</code>, but it might also be nice to have an explicit <code>mut</code> mode, such that you write <code>&amp;mut {mut bars} WonkaShipmentManifest</code>. This is more verbose, but it permits one to give away a mix of &ldquo;shared&rdquo; and &ldquo;mut&rdquo; access:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add_ticket</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">{</span><span class="n">bars</span><span class="p">,</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">golden_tickets</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//              ^^^^  ^^^^^^^^^^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//              |     mut access to golden_tickets
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//              shared access to bars
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">assert!</span><span class="p">(</span><span class="n">index</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">bars</span><span class="p">.</span><span class="n">len</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">golden_tickets</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">index</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>One could invoke <code>add_ticket</code> even if you had existing borrows to <code>bars</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">manifest</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">bars</span><span class="p">,</span><span class="w"> </span><span class="n">golden_tickets</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">bar0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">manifest</span><span class="p">.</span><span class="n">bars</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//         ^^^^^^^^^^^^^^ shared borrow of `manifest.bars`...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">manifest</span><span class="p">.</span><span class="n">add_ticket</span><span class="p">(</span><span class="mi">22</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      ^ borrows `self` mutably, but with view
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        `{bars, mut golden_tickets}`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;debug: </span><span class="si">{:?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">bar0</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="view-types-and-ownership">View types and ownership</h3>
<p>So far I&rsquo;ve only shown view types with references, but combining them with ownership makes for other interesting possibilities. For example, suppose I wanted to extend <code>GoldenTicket</code> with some kind of unique <code>serial_number</code> that should never change, along with an <code>owner</code> field that will be mutated over time. For various reasons<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, I might like to make the fields of <code>GoldenTicket</code> public:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">GoldenTicket</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">serial_number</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">owner</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">GoldenTicket</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">new</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">Self</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, if I do that, then nothing stops future owners of a <code>GoldenTicket</code> from altering its <code>serial_number</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">GoldenTicket</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">t</span><span class="p">.</span><span class="n">serial_number</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// uh-oh!
</span></span></span></code></pre></div><p>The best answer today is to use a private field and an accessor:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">GoldenTicket</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">serial_number</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">owner</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">GoldenTicket</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">new</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">serial_number</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">serial_number</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, Rust&rsquo;s design kind of discourages accessors. For one thing, the borrow checker doesn&rsquo;t know which fields an accessor uses, so if you have code like this, you will now get annoying errors (this has been the theme of this whole post, of course):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">GoldenTicket</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">owner</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">compute_new_owner</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">serial_number</span><span class="p">());</span><span class="w">
</span></span></span></code></pre></div><p>Furthermore, accessors can be kind of unergonomic, particularly for things that are not copy types. Returning (say) an <code>&amp;T</code> from a <code>get</code> can be super annoying.</p>
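<p>For comparison, the usual way around that borrow error in today&rsquo;s Rust is to call the accessor <em>before</em> taking the mutable borrow. Here is a minimal sketch; the <code>new</code> that takes a serial number, the body of <code>compute_new_owner</code>, and the <code>assign_owner</code> wrapper are all made up for illustration:</p>

```rust
pub struct GoldenTicket {
    serial_number: usize, // private, read through the accessor
    pub owner: Option<String>,
}

impl GoldenTicket {
    // Assumption: a constructor that takes the serial number directly.
    pub fn new(serial_number: usize) -> Self {
        Self { serial_number, owner: None }
    }

    pub fn serial_number(&self) -> usize {
        self.serial_number
    }
}

// Hypothetical stand-in for `compute_new_owner`.
fn compute_new_owner(owner: &mut Option<String>, serial_number: usize) {
    *owner = Some(format!("owner-of-{}", serial_number));
}

pub fn assign_owner(t: &mut GoldenTicket) {
    // The accessor borrows all of `t`, but that borrow ends as soon as the
    // `usize` has been copied out...
    let serial = t.serial_number();
    // ...so the mutable borrow of `t.owner` no longer overlaps with it.
    let n = &mut t.owner;
    compute_new_owner(n, serial);
}
```

<p>This works, but only because <code>usize</code> is cheap to copy out first; it does nothing for accessors that hand back references.</p>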
<p>Using a view type, we have some interesting other options. I could define a type alias <code>GoldenTicket</code> that is a limited view onto the underlying data:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">type</span> <span class="nc">GoldenTicket</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="n">serial_number</span><span class="p">,</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">owner</span><span class="p">}</span><span class="w"> </span><span class="n">GoldenTicketData</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">GoldenTicketData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">serial_number</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="n">owner</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">dummy</span>: <span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now if my constructor function only ever creates this view, we know that nobody will be able to modify the <code>serial_number</code> for a <code>GoldenTicket</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">GoldenTicket</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">new</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">GoldenTicket</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Obviously, this is not ergonomic to write, but it&rsquo;s interesting that it is possible.</p>
<h3 id="view-types-vs-privacy">View types vs privacy</h3>
<p>As you may have noticed in the previous example, view types interact with traditional privacy in interesting ways. It seems like there may be room for some sort of unification, but the two are also different. Traditional privacy (<code>pub</code> fields and so on) is like a view type in that, if you are outside the module, you can&rsquo;t access private fields. <em>Unlike</em> a view, though, you can call methods on the type that <em>do</em> access those fields. In other words, traditional privacy denies you <em>direct</em> access, but permits <em>intermediated</em> access.</p>
<p>View types, in contrast, are &ldquo;transitive&rdquo; and apply both to direct and intermediated actions. If I have a view <code>{serial_number} GoldenTicketData</code>, I cannot access the <code>owner</code> field at all, even by invoking methods on the type.</p>
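<p>To make the &ldquo;intermediated access&rdquo; half concrete, here is a small sketch in today&rsquo;s Rust. The module name <code>tickets</code>, the <code>owner_name</code> method, and the <code>describe</code> function are invented for the example:</p>

```rust
pub mod tickets {
    pub struct GoldenTicketData {
        pub serial_number: usize,
        owner: Option<String>, // private: no direct access outside `tickets`
    }

    impl GoldenTicketData {
        pub fn new(serial_number: usize, owner: &str) -> Self {
            Self { serial_number, owner: Some(owner.to_string()) }
        }

        // Intermediated access: a public method that reads the private field.
        pub fn owner_name(&self) -> Option<&str> {
            self.owner.as_deref()
        }
    }
}

pub fn describe(t: &tickets::GoldenTicketData) -> String {
    // Writing `t.owner` here would be rejected because the field is private,
    // but going through the method is allowed.
    match t.owner_name() {
        Some(name) => format!("#{} owned by {}", t.serial_number, name),
        None => format!("#{} unowned", t.serial_number),
    }
}
```

<p>With a view like <code>{serial_number} GoldenTicketData</code>, the idea is that the call to <code>owner_name</code> would be rejected too, since the method needs a field the view does not include.</p>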
<h3 id="longer-places">Longer places</h3>
<p>My examples so far have only shown views onto individual fields, but there is no reason we can&rsquo;t have a view onto an arbitrary place. For example, one could write:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Point</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">x</span>: <span class="kt">u32</span><span class="p">,</span><span class="w"> </span><span class="n">y</span>: <span class="kt">u32</span> <span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Square</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">upper_left</span>: <span class="nc">Point</span><span class="p">,</span><span class="w"> </span><span class="n">lower_right</span>: <span class="nc">Point</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">s</span>: <span class="nc">Square</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Square</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">upper_left</span>: <span class="nc">Point</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">x</span>: <span class="mi">22</span><span class="p">,</span><span class="w"> </span><span class="n">y</span>: <span class="mi">44</span><span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="n">lower_right</span>: <span class="nc">Point</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">x</span>: <span class="mi">66</span><span class="p">,</span><span class="w"> </span><span class="n">y</span>: <span class="mi">88</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">s_x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="p">{</span><span class="n">upper_left</span><span class="p">.</span><span class="n">x</span><span class="p">}</span><span class="w"> </span><span class="n">s</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>to get a view of type <code>&amp;{upper_left.x} Square</code>. Paths like <code>s.upper_left.y</code> and <code>s.lower_right</code> would then still be mutable and not considered borrowed.</p>
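<p>Within a single function body, today&rsquo;s borrow checker already reasons about places this way; what a view type would add is a way to write that down in a type. A sketch using plain field borrows (the <code>demo</code> function is just scaffolding for the example):</p>

```rust
pub struct Point { pub x: u32, pub y: u32 }
pub struct Square { pub upper_left: Point, pub lower_right: Point }

pub fn demo() -> (u32, u32, u32) {
    let mut s = Square {
        upper_left: Point { x: 22, y: 44 },
        lower_right: Point { x: 66, y: 88 },
    };

    // Borrows only the place `s.upper_left.x`.
    let s_x = &s.upper_left.x;

    // Sibling field of the borrowed place: still mutable.
    s.upper_left.y += 1;
    // A different subtree entirely: still assignable.
    s.lower_right = Point { x: 0, y: 0 };

    // `s_x` is still live here, so the writes above really did overlap
    // with the borrow in time.
    (*s_x, s.upper_left.y, s.lower_right.x)
}
```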
<h3 id="view-types-and-named-groups">View types and named groups</h3>
<p>There is another interaction between view types and privacy: view types name fields, but if you have private fields, you probably don&rsquo;t want people outside your module typing their names, since that would prevent you from renaming them. At the same time, you might like to be able to let users refer to &ldquo;groups of data&rdquo; more abstractly. For example, for a <code>WonkaShipmentManifest</code>, I might like users to know they can iterate the bars <em>and</em> check if they have a golden ticket at once:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WonkaShipmentManifest</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">should_insert_ticket</span><span class="p">(</span><span class="o">&amp;</span><span class="p">{</span><span class="n">golden_tickets</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">golden_tickets</span><span class="p">.</span><span class="n">contains</span><span class="p">(</span><span class="o">&amp;</span><span class="n">index</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">iter_bars_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">{</span><span class="n">bars</span><span class="p">}</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">Bar</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">bars</span><span class="p">.</span><span class="n">iter_mut</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But how should we express that to users without having them name fields directly? The obvious extension is to have some kind of &ldquo;logical&rdquo; fields that represent groups of data that can change over time. I don&rsquo;t know how to declare those groups though.</p>
<h3 id="groups-could-be-more-dry">Groups could be more DRY</h3>
<p>Another reason to want named groups is to avoid repeating the names of common sets of fields over and over. It&rsquo;s easy to imagine that there might be a few fields that some cluster of methods all want to access, and that repeating those names will be annoying and make the code harder to edit.</p>
<p>One positive thing from Rust&rsquo;s current restrictions is that it has sometimes encouraged me to factor a single large type into multiple smaller ones, where the smaller ones encapsulate a group of logically related fields that are accessed together. On the other hand, I&rsquo;ve also encountered situations where such refactorings feel quite arbitrary &ndash; I have groups of fields that, yes, are accessed together, but which don&rsquo;t form a logical unit on their own.</p>
<p>As an example of why this sort of refactoring can be both good and bad at the same time, I introduced the <code>cfg</code> field of the MIR <a href="https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/build/struct.Builder.html"><code>Builder</code></a> type to resolve errors where some methods only accessed a subset of fields. On the one hand, the CFG-related data is indeed conceptually distinct from the rest. On the other, the CFG type isn&rsquo;t something you would use independently of the <code>Builder</code> itself, and I don&rsquo;t feel that writing <code>self.cfg.foo</code> instead of <code>self.foo</code> made the code particularly clearer.</p>
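<p>The shape of that refactoring looks roughly like the sketch below. These are simplified, hypothetical stand-ins, not the real rustc types: the <code>blocks</code> and <code>locals</code> fields and both methods are invented to show the pattern.</p>

```rust
// Hypothetical, simplified stand-ins; not the real rustc definitions.
pub struct Cfg {
    pub blocks: Vec<String>,
}

impl Cfg {
    // Only needs the CFG data, so it borrows just the `Cfg`.
    pub fn start_block(&mut self, label: &str) -> usize {
        self.blocks.push(label.to_string());
        self.blocks.len() - 1
    }
}

pub struct Builder {
    pub cfg: Cfg,
    pub locals: Vec<String>,
}

impl Builder {
    pub fn block_per_local(&mut self) -> Vec<usize> {
        let mut ids = Vec::new();
        // `self.locals` stays borrowed for the whole loop while `self.cfg`
        // is mutated inside it. The two are disjoint fields, so the borrow
        // checker accepts this. Had `start_block` been a `&mut self` method
        // on `Builder`, the call would conflict with the loop's borrow.
        for name in &self.locals {
            ids.push(self.cfg.start_block(name));
        }
        ids
    }
}
```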
<h3 id="view-types-and-fields-in-traits">View types and fields in traits</h3>
<p>Some time back, I had a draft RFC for <a href="https://github.com/nikomatsakis/fields-in-traits-rfc/blob/master/0000-fields-in-traits.md">fields in traits</a>. That RFC was &ldquo;postponed&rdquo; and moved to a repo to iterate, but I have never had the time to invest in bringing it back. It has some obvious overlap with this idea of views, and (iirc) I had at some point considered using &ldquo;fields in traits&rdquo; as the basis for declaring views. I think I rather like this more &ldquo;structural&rdquo; approach, but perhaps traits with fields might be a way to give names to groups of fields that public users can reference. Have to mull on that.</p>
<h3 id="view-types-and-disjoint-closure-capture">View types and disjoint closure capture</h3>
<p>Rust 2021 introduced <a href="https://doc.rust-lang.org/nightly/edition-guide/rust-2021/disjoint-capture-in-closures.html">disjoint closure capture</a>. The idea is that closures capture one reference per <em>path</em> that is referenced, subject to some caveats. One of the things I am very happy with is that this was implemented with virtually no changes to the borrow checker: we basically just tweaked how closures are desugared. Besides saving a bunch of effort on the implementation<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, this means that the risk of soundness problems is not increased. This strategy does have a downside, however: closures can sometimes get bigger (though we found experimentally that they rarely do in practice, and sometimes get smaller too).</p>
<p>Closures that access two paths like <code>a.foo</code> and <code>a.bar</code> can get bigger because they capture those paths independently, whereas before they would have just captured <code>a</code> as a whole. Interestingly, using view types offers us a way to desugar those closures without introducing unsafe code. Closures could capture <code>{foo, bar} a</code> instead of the two fields independently. Neat!</p>
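<p>To see where the size difference comes from, here is a rough model of the two desugarings written as ordinary structs. This is only a sketch: <code>Pair</code>, <code>CaptureWhole</code>, and <code>CaptureDisjoint</code> are hypothetical names, and the compiler&rsquo;s real closure types are anonymous and may be laid out differently.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem::size_of;

struct Pair {
    foo: Vec&lt;i32&gt;,
    bar: Vec&lt;i32&gt;,
}

/// Model of the older desugaring of `|| a.foo.len() + a.bar.len()`:
/// one reference to all of `a`.
struct CaptureWhole&lt;&#39;a&gt; {
    a: &amp;&#39;a Pair,
}

/// Model of the Rust 2021 desugaring: one reference per captured path.
struct CaptureDisjoint&lt;&#39;a&gt; {
    foo: &amp;&#39;a Vec&lt;i32&gt;,
    bar: &amp;&#39;a Vec&lt;i32&gt;,
}

impl&lt;&#39;a&gt; CaptureWhole&lt;&#39;a&gt; {
    fn call(&amp;self) -&gt; usize {
        self.a.foo.len() + self.a.bar.len()
    }
}

impl&lt;&#39;a&gt; CaptureDisjoint&lt;&#39;a&gt; {
    fn call(&amp;self) -&gt; usize {
        self.foo.len() + self.bar.len()
    }
}

/// Returns (size of the whole-value capture, size of the per-path capture).
fn capture_sizes() -&gt; (usize, usize) {
    (size_of::&lt;CaptureWhole&lt;&#39;_&gt;&gt;(), size_of::&lt;CaptureDisjoint&lt;&#39;_&gt;&gt;())
}
</code></pre></div><p>Both models compute the same result, but the per-path version stores two references where the whole-value version stores one. A view-typed capture along the lines of <code>{foo, bar} a</code> would, in this model, go back to a single reference while still only claiming the two fields.</p>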
<h3 id="how-does-this-affect-learning">How does this affect learning?</h3>
<p>I&rsquo;m always wary about extending &ldquo;core Rust&rdquo; because I don&rsquo;t want to make Rust harder to learn. However, I also tend to feel that extensions like this one can have the opposite effect: I think that what throws people the <em>most</em> when learning Rust is trying to get a feel for what they can and cannot do. When they hit &ldquo;arbitrary&rdquo; restrictions like &ldquo;cannot say that my helper function only uses a subset of my fields&rdquo;<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> that can often be the most confusing thing of all, because at first people think that they just don&rsquo;t understand the system. &ldquo;Surely there must be some way to do this!&rdquo;</p>
<p>Going a bit further, one of the other challenges with Rust&rsquo;s borrow checker is that so much of its reasoning is invisible and lacks explicit syntax. There is no way to &ldquo;hand annotate&rdquo; the value of lifetime parameters, for example, so as to explore how they work. Similarly, the borrow checker is currently tracking fine-grained state about which paths are borrowed in your program, but you have no way to <em>talk</em> about that logic explicitly. Adding explicit types may indeed prove <em>helpful</em> for learning.</p>
<h3 id="but-there-must-be-some-risks">But there must be some risks?</h3>
<p>Yes, for sure. One of the best and worst things about Rust is that your public API docs force you to make decisions like &ldquo;do I want <code>&amp;self</code> or <code>&amp;mut self</code> access for this function?&rdquo; It pushes a lot of design up front (raising the risk of <a href="https://en.wikipedia.org/wiki/Cognitive_dimensions_of_notations">premature commitment</a>) and makes things harder to change (more <a href="https://en.wikipedia.org/wiki/Cognitive_dimensions_of_notations">viscous</a>). If it became &ldquo;the norm&rdquo; for people to document fine-grained information about which methods use which groups of fields, I worry that it would create more opportunities for semver-hazards, and also just make the docs harder to read.</p>
<p>On the other side, one of my observations is that <strong>public-facing</strong> types don&rsquo;t want views that often; the main exception is that sometimes it&rsquo;d be nice to have them for small accessors (for example, a <code>Vec</code> might like to document that one can read <code>len</code> even when iterating). Most of the time that I find myself frustrated with this particular limitation of Rust, it has to do with private helper functions (similar to the initial example). In those cases, I think that the documentation is actually <em>helpful</em>, since it guides people who are reading and helps them know what to expect from the function.</p>
<h3 id="conclusion">Conclusion</h3>
<p>This concludes our tour of &ldquo;view types&rdquo;, a proto-proposal. I hope you enjoyed your ride. Curious to hear what people think! I&rsquo;ve opened a <a href="https://internals.rust-lang.org/t/blog-post-view-types-for-rust/15556">thread on internals</a> for feedback. I&rsquo;d love to know if you feel this would solve problems for you, but also how you think it would affect Rust learning &ndash; not to mention better syntax ideas.</p>
<p>I&rsquo;d also be interested to read about related work. The idea here seems likely to have been invented and re-invented numerous times. What other languages, either in academia or industry, have similar mechanisms? How do they work? Educate me!</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Yes, this is ambiguous. Think of it as my way of encouraging you to bikeshed something better.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></li>
<li id="fn:3">
<p>Shout out to the <a href="https://github.com/rust-lang/team/blob/master/teams/wg-rfc-2229.toml">RFC 2229 working group</a> folks, who put in months and months and months of work on this.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Another example is that there is no way to have a struct that has references to its own fields.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Rustc Reading Club</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/10/28/rustc-reading-club/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/10/28/rustc-reading-club/</id><published>2021-10-28T00:00:00+00:00</published><updated>2021-10-28T10:01:00-04:00</updated><content type="html"><![CDATA[<p>Ever wanted to understand how rustc works? Me too! <a href="https://github.com/doc-jones">Doc Jones</a> and I have been talking and we had an idea we wanted to try. Inspired by the very cool <a href="https://code-reading.org/">Code Reading Club</a>, we are launching an experimental <a href="https://github.com/rust-lang/rustc-reading-club">Rustc Reading Club</a>. Doc Jones posted an <a href="https://mojosd.medium.com/rust-code-reading-club-8fe356287049?source=social.tw">announcement on her blog</a>, so go take a look!</p>
<p>The way this club works is pretty simple: every other week, we&rsquo;ll get together for 90 minutes and read some part of rustc (or some project related to rustc), and talk about it. Our goal is to walk away with a high-level understanding of how that code works. For more complex parts of the code, we may wind up spending multiple sessions on the same code.</p>
<p>We may yet tweak this, but the plan is to follow a &ldquo;semi-structured&rdquo; reading process:</p>
<ul>
<li>Identify the modules in the code and their purpose.</li>
<li>Look at the type definitions and try to describe their high-level purpose.</li>
<li>Identify the most important functions and their purpose.</li>
<li>Dig into how a few of those functions are actually implemented.</li>
</ul>
<p>The meetings will <em>not</em> be recorded, but they will be open to anyone. The first meeting of the Rustc Reading Club will be <a href="https://rust-lang/rustc-reading-club/meetings/2021-11-04.html">November 4th, 2021 at 12:00pm US Eastern time</a>. Hope to see you there!</p>
]]></content></entry><entry><title type="html">Dyn async traits, part 6</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/10/15/dyn-async-traits-part-6/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/10/15/dyn-async-traits-part-6/</id><published>2021-10-15T00:00:00+00:00</published><updated>2021-10-15T15:57:00-04:00</updated><content type="html"><![CDATA[<p>A quick update to my last post: first, a better way to do what I was trying to do, and second, a sketch of the crate I&rsquo;d like to see for experimental purposes.</p>
<h2 id="an-easier-way-to-roll-our-own-boxed-dyn-traits">An easier way to roll our own boxed dyn traits</h2>
<p>In the previous post I covered how you could create vtables and pair them up with a data pointer to kind of &ldquo;roll your own dyn&rdquo;. After I published the post, though, dtolnay sent me <a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2018&amp;gist=adba43d6e056337cd8a297624a296219">this Rust playground link</a> to show me a much better approach, one based on the <a href="https://crates.io/crates/erased-serde">erased-serde</a> crate. The idea is that instead of making a &ldquo;vtable struct&rdquo; with a bunch of fn pointers, we create a &ldquo;shadow trait&rdquo; that reflects the contents of that vtable:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// erased trait:
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">ErasedAsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;me</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Pin</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;me</span><span class="o">&gt;&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Then the <code>DynAsyncIter</code> struct can just be a boxed form of this trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">DynAsyncIter</span><span class="o">&lt;</span><span class="na">&#39;data</span><span class="p">,</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">pointer</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">ErasedAsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;data</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We define the &ldquo;shim functions&rdquo; by implementing <code>ErasedAsyncIter</code> for all <code>T: AsyncIter</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">ErasedAsyncIter</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">AsyncIter</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">T</span>::<span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;me</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Pin</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;me</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// This code allocates a box for the result
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// and coerces into a dyn:
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Box</span>::<span class="n">pin</span><span class="p">(</span><span class="n">AsyncIter</span>::<span class="n">next</span><span class="p">(</span><span class="bp">self</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And finally we can implement the <code>AsyncIter</code> trait for the dynamic type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;data</span><span class="p">,</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">DynAsyncIter</span><span class="o">&lt;</span><span class="na">&#39;data</span><span class="p">,</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Item</span>: <span class="na">&#39;me</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="na">&#39;data</span>: <span class="na">&#39;me</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">=</span><span class="w"> </span><span class="n">Pin</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;me</span><span class="o">&gt;&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Next</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">pointer</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Yay, it all works, and without <em>any</em> unsafe code!</p>
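<p>For readers who want to try the pieces above end to end, here is a self-contained sketch that assembles them. Several parts are my own assumptions rather than anything from the playground link: the <code>Countdown</code> iterator, the <code>DynAsyncIter::new</code> constructor name, and the tiny busy-polling <code>block_on</code> helper (used so the example needs no async runtime) are all hypothetical, and the GAT where-clause is written as <code>where Self: &#39;me</code> in the trailing position.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The GAT-based trait used throughout this series.
trait AsyncIter {
    type Item;
    type Next&lt;&#39;me&gt;: Future&lt;Output = Option&lt;Self::Item&gt;&gt;
    where
        Self: &#39;me;
    fn next(&amp;mut self) -&gt; Self::Next&lt;&#39;_&gt;;
}

// Shadow trait: same method, but the future is boxed, so the trait is dyn safe.
trait ErasedAsyncIter {
    type Item;
    fn next&lt;&#39;me&gt;(&amp;&#39;me mut self) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = Option&lt;Self::Item&gt;&gt; + &#39;me&gt;&gt;;
}

// Shim: every `AsyncIter` is also an `ErasedAsyncIter`.
impl&lt;T: AsyncIter&gt; ErasedAsyncIter for T {
    type Item = T::Item;
    fn next&lt;&#39;me&gt;(&amp;&#39;me mut self) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = Option&lt;T::Item&gt;&gt; + &#39;me&gt;&gt; {
        Box::pin(AsyncIter::next(self))
    }
}

// Hand-rolled "dyn" type: a boxed form of the shadow trait.
struct DynAsyncIter&lt;&#39;data, Item&gt; {
    pointer: Box&lt;dyn ErasedAsyncIter&lt;Item = Item&gt; + &#39;data&gt;,
}

impl&lt;&#39;data, Item&gt; DynAsyncIter&lt;&#39;data, Item&gt; {
    // `new` is a hypothetical constructor name chosen for this sketch.
    fn new&lt;T: AsyncIter&lt;Item = Item&gt; + &#39;data&gt;(iter: T) -&gt; Self {
        DynAsyncIter { pointer: Box::new(iter) }
    }
}

impl&lt;&#39;data, Item&gt; AsyncIter for DynAsyncIter&lt;&#39;data, Item&gt; {
    type Item = Item;
    type Next&lt;&#39;me&gt; = Pin&lt;Box&lt;dyn Future&lt;Output = Option&lt;Item&gt;&gt; + &#39;me&gt;&gt;
    where
        Self: &#39;me;
    fn next(&amp;mut self) -&gt; Self::Next&lt;&#39;_&gt; {
        // Explicit path, so the call clearly dispatches through the erased trait.
        ErasedAsyncIter::next(&amp;mut *self.pointer)
    }
}

// Hypothetical example iterator: yields n, n-1, ..., 1.
struct Countdown(u32);

impl AsyncIter for Countdown {
    type Item = u32;
    type Next&lt;&#39;me&gt; = Ready&lt;Option&lt;u32&gt;&gt;
    where
        Self: &#39;me;
    fn next(&amp;mut self) -&gt; Self::Next&lt;&#39;_&gt; {
        let item = if self.0 == 0 { None } else { Some(self.0) };
        self.0 = self.0.saturating_sub(1);
        ready(item)
    }
}

// Tiny executor for this sketch: polls in a loop with a waker that does nothing.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc&lt;Self&gt;) {}
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&amp;waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&amp;mut cx) {
            return out;
        }
    }
}

// Generic code only sees `AsyncIter`; it accepts static and dyn iterators alike.
fn sum_items(iter: &amp;mut impl AsyncIter&lt;Item = u32&gt;) -&gt; u32 {
    let mut sum = 0;
    while let Some(v) = block_on(AsyncIter::next(&amp;mut *iter)) {
        sum += v;
    }
    sum
}
</code></pre></div><p>Here <code>sum_items</code> can be called with a plain <code>Countdown</code> (static dispatch) or with <code>DynAsyncIter::new(Countdown(3))</code> (virtual dispatch plus one box per call to <code>next</code>). In real async code you would write <code>.await</code> in place of <code>block_on</code>; the calls are written as <code>AsyncIter::next(..)</code> because, with both traits in scope, a bare <code>.next()</code> on a concrete type can be ambiguous.</p>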
<h2 id="what-id-like-to-see">What I&rsquo;d like to see</h2>
<p>This &ldquo;convert to dyn&rdquo; approach isn&rsquo;t really specific to async (as erased-serde shows). I&rsquo;d like to see a decorator that applies it to any trait. I imagine something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Generates the `DynAsyncIter` type shown above:
</span></span></span><span class="line"><span class="cl"><span class="cp">#[derive_dyn(DynAsyncIter)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But this ought to work with any <code>-&gt; impl Trait</code> return type, too, so long as <code>Trait</code> is dyn safe and implemented for <code>Box&lt;T&gt;</code>. So something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Generates a `DynSillyIterTools` type, analogous to the `DynAsyncIter` shown above:
</span></span></span><span class="line"><span class="cl"><span class="cp">#[derive_dyn(DynSillyIterTools)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">SillyIterTools</span>: <span class="nb">Iterator</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Iterate over the iter in pairs of two items.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">pair_up</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="bp">Self</span>::<span class="n">Item</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Item</span><span class="p">)</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>would generate an erased trait that returns a <code>Box&lt;dyn Iterator&lt;(...)&gt;&gt;</code>. Similarly, you could do a trick with taking any <code>impl Foo</code> and passing in a <code>Box&lt;dyn Foo&gt;</code>, so you can support impl Trait in argument position.</p>
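<p>As a rough idea of what such a decorator might expand to, here is a hand-written sketch of the erased trait for <code>SillyIterTools</code>. No <code>derive_dyn</code> macro is involved; the <code>ErasedSillyIterTools</code> name, the blanket impl of <code>SillyIterTools</code> for every iterator, and the <code>collect_pairs</code> helper are all assumptions made for illustration. It also relies on <code>-&gt; impl Trait</code> in traits, which needs a reasonably recent compiler (Rust 1.75 or later).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// The trait a user would write, with an `impl Trait` return type.
trait SillyIterTools: Iterator {
    // Iterate over the iter in pairs of two items.
    fn pair_up(&amp;mut self) -&gt; impl Iterator&lt;Item = (Self::Item, Self::Item)&gt;;
}

// Assumed for this sketch: every iterator gets `pair_up`.
impl&lt;I: Iterator&gt; SillyIterTools for I {
    fn pair_up(&amp;mut self) -&gt; impl Iterator&lt;Item = (I::Item, I::Item)&gt; {
        // Stops as soon as a full pair is no longer available.
        std::iter::from_fn(move || Some((self.next()?, self.next()?)))
    }
}

// What a decorator might generate: the same method, with the return boxed.
trait ErasedSillyIterTools {
    type Item;
    fn pair_up&lt;&#39;me&gt;(
        &amp;&#39;me mut self,
    ) -&gt; Box&lt;dyn Iterator&lt;Item = (Self::Item, Self::Item)&gt; + &#39;me&gt;;
}

// Shim: forward to the static method and box the result.
impl&lt;T: SillyIterTools&gt; ErasedSillyIterTools for T {
    type Item = T::Item;
    fn pair_up&lt;&#39;me&gt;(&amp;&#39;me mut self) -&gt; Box&lt;dyn Iterator&lt;Item = (T::Item, T::Item)&gt; + &#39;me&gt; {
        Box::new(SillyIterTools::pair_up(self))
    }
}

// Caller that only knows the erased trait (virtual dispatch).
fn collect_pairs(iter: &amp;mut dyn ErasedSillyIterTools&lt;Item = i32&gt;) -&gt; Vec&lt;(i32, i32)&gt; {
    ErasedSillyIterTools::pair_up(iter).collect()
}
</code></pre></div><p>Any concrete iterator over <code>i32</code> can then be passed as <code>&amp;mut dyn ErasedSillyIterTools&lt;Item = i32&gt;</code>, at the cost of one allocation per call to <code>pair_up</code>.</p>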
<p>Even without impl trait, <code>derive_dyn</code> would create a more ergonomic dyn to play with.</p>
<p>I don&rsquo;t really see this as a &ldquo;long term solution&rdquo;, but I would be interested to play with it.</p>
<h2 id="comments">Comments?</h2>
<p>I&rsquo;ve created a <a href="https://internals.rust-lang.org/t/blog-series-dyn-async-in-traits/15449">thread on internals</a> if you&rsquo;d like to comment on this post, or others in this series.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Dyn async traits, part 5</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/10/14/dyn-async-traits-part-5/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/10/14/dyn-async-traits-part-5/</id><published>2021-10-14T00:00:00+00:00</published><updated>2021-10-14T13:46:00-04:00</updated><content type="html"><![CDATA[<p>If you’re willing to use nightly, you can already model async functions in traits by using GATs and impl Trait — this is what the <a href="https://github.com/embassy-rs/embassy">Embassy</a> async runtime does, and it’s also what the <a href="https://crates.io/crates/real-async-trait">real-async-trait</a> crate does. One shortcoming, though, is that your trait doesn’t support dynamic dispatch. In the previous posts of this series, I have been exploring some of the reasons for that limitation, and what kind of primitive capabilities need to be exposed in the language to overcome it. My thought was that we could try to stabilize those primitive capabilities with the plan of enabling experimentation. I am still in favor of this plan, but I realized something yesterday: <strong>using procedural macros, you can ALMOST do this experimentation today!</strong> Unfortunately, it doesn&rsquo;t quite work owing to some relatively obscure rules in the Rust type system (perhaps some clever readers will find a workaround; that said, these are rules I have wanted to change for a while).</p>
<p><strong>Just to be crystal clear:</strong> Nothing in this post is intended to describe an “ideal end state” for async functions in traits. I still want to get to the point where one can write <code>async fn</code> in a trait without any further annotation and have the trait be “fully capable” (support both static dispatch and dyn mode while adhering to the tenets of zero-cost abstractions<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>). But there are some significant questions there, and to find the best answers for those questions, we need to enable more exploration, which is the point of this post.</p>
<h3 id="code-is-on-github">Code is on github</h3>
<p>The code covered in this blog post has been prototyped and is <a href="https://github.com/nikomatsakis/ergo-dyn/blob/main/examples/async-iter-manual-desugar.rs">available on github</a>. See the caveat at the end of the post, though!</p>
<h3 id="design-goal">Design goal</h3>
<p>To see what I mean, let’s return to my favorite trait, <code>AsyncIter</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The post is going to lay out how we can transform a trait declaration like the one above into a series of declarations that achieve the following:</p>
<ul>
<li>We can use it as a generic bound (<code>fn foo&lt;T: AsyncIter&gt;()</code>), in which case we get static dispatch, full auto trait support, and all the other goodies that normally come with generic bounds in Rust.</li>
<li>Given a <code>T: AsyncIter</code>, we can coerce it into some form of <code>DynAsyncIter</code> that uses virtual dispatch. In this case, the type doesn’t reveal the specific <code>T</code> or the specific types of the futures.
<ul>
<li>I wrote <code>DynAsyncIter</code>, and not <code>dyn AsyncIter</code> on purpose — we are going to create our own type that acts <em>like</em> a <code>dyn</code> type, but which manages the adaptations needed for async.</li>
<li>For simplicity, let’s assume we want to box the resulting futures. Part of the point of this design though is that it leaves room for us to generate whatever sort of wrapping types we want.</li>
</ul>
</li>
</ul>
<p>You could write the code I’m showing here by hand, but the better route would be to package it up as a kind of decorator (e.g., <code>#[async_trait_v2]</code><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>).</p>
<h3 id="the-basics-trait-with-a-gat">The basics: trait with a GAT</h3>
<p>The first step is to transform the trait to have a GAT and a regular <code>fn</code>, in the way that we’ve seen many times:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">Self</span>: <span class="na">&#39;me</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Next</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="next-define-a-dynasynciter-struct">Next: define a “DynAsyncIter” struct</h3>
<p>The next step is to manage the virtual dispatch (dyn) version of the trait. To do this, we are going to “roll our own” object by creating a struct <code>DynAsyncIter</code>. This struct plays the role of a <code>Box&lt;dyn AsyncIter&gt;</code> trait object. Instances of the struct can be created by calling <code>DynAsyncIter::from</code> with some specific iterator type; the <code>DynAsyncIter</code> type implements the <code>AsyncIter</code> trait, so once you have one you can just call <code>next</code> as usual:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">the_iter</span>: <span class="nc">DynAsyncIter</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">DynAsyncIter</span>::<span class="n">from</span><span class="p">(</span><span class="n">some_iterator</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">process_items</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">the_iter</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">sum_items</span><span class="p">(</span><span class="n">iter</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">AsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iter</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">s</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">s</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="struct-definition">Struct definition</h3>
<p>Let’s look at how this <code>DynAsyncIter</code> struct is defined. Because it models a <code>Box&lt;dyn AsyncIter&gt;</code> trait object, it has one generic parameter for every ordinary associated type declared in the trait (not including the GATs we introduced for async fn return types). The struct itself has two fields: the data pointer (a box, but in raw form) and a reference to a vtable. We don’t know the type of the underlying value, so we’ll use <code>ErasedData</code> for that:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">ErasedData</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">DynAsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">ErasedData</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">vtable</span>: <span class="kp">&amp;</span><span class="nb">'static</span><span class="w"> </span><span class="n">DynAsyncIterVtable</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>For the vtable, we will make a struct that contains a <code>fn</code> for each of the methods in the trait. Unlike the builtin vtables, we will modify the return type of these functions to be a boxed future:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">DynAsyncIterVtable</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">drop_fn</span>: <span class="nc">unsafe</span><span class="w"> </span><span class="k">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">ErasedData</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next_fn</span>: <span class="nc">unsafe</span><span class="w"> </span><span class="k">fn</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">ErasedData</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">'_</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="implementing-the-asynciter-trait">Implementing the AsyncIter trait</h3>
<p>Next, we can implement the <code>AsyncIter</code> trait for the <code>DynAsyncIter</code> type. For each of the new GATs we introduced, we simply use a boxed future type. For the method bodies, we extract the function pointer from the vtable and call it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">DynAsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Next</span><span class="o">&lt;</span><span class="na">'me</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">'me</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Next</span><span class="o">&lt;</span><span class="na">'_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">next_fn</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">vtable</span><span class="p">.</span><span class="n">next_fn</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">next_fn</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">)</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The unsafe keyword here is asserting that the safety conditions of <code>next_fn</code> are met. We’ll cover that in more detail later, but in short those conditions are:</p>
<ul>
<li>The vtable corresponds to some erased type <code>T: AsyncIter</code>…</li>
<li>…and each instance of <code>*mut ErasedData</code> points to a valid <code>Box&lt;T&gt;</code> for that type.</li>
</ul>
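<p>To make the second condition concrete, here is a minimal sketch of the erase-and-recover round trip on its own. The helper names <code>erase</code>, <code>recover</code>, and <code>round_trip</code> are hypothetical and exist only for illustration; the point is just that the cast back to <code>Box&lt;T&gt;</code> is sound only when <code>T</code> is the type that was originally boxed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">type ErasedData = ();

// Hypothetical helper: box a value and forget its type.
fn erase&lt;T&gt;(value: T) -&gt; *mut ErasedData {
    Box::into_raw(Box::new(value)) as *mut ErasedData
}

// Safety: `ptr` must come from `erase::&lt;T&gt;` with this same `T`,
// and must not have been recovered or freed already.
unsafe fn recover&lt;T&gt;(ptr: *mut ErasedData) -&gt; Box&lt;T&gt; {
    unsafe { Box::from_raw(ptr as *mut T) }
}

fn round_trip() -&gt; String {
    let ptr = erase(String::from("hello"));
    // Sound only because the erased type really was `String`.
    let boxed: Box&lt;String&gt; = unsafe { recover::&lt;String&gt;(ptr) };
    *boxed
}
</code></pre></div><p>In the real struct, nothing in the type of <code>data</code> records which <code>T</code> was erased; the vtable is what carries that knowledge, which is why the two must always be created together.</p>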
<h3 id="dropping-the-object">Dropping the object</h3>
<p>Speaking of Drop, we do need to implement that as well. It too will call through the vtable:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Drop</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">DynAsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">drop_fn</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">vtable</span><span class="p">.</span><span class="n">drop_fn</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">drop_fn</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">);</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We need to call through the vtable because we don’t know what kind of data we have, so we can’t know how to drop it correctly.</p>
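<p>As a small, self-contained sketch of why this works, the snippet below drops a value through nothing but an erased pointer and a function pointer. The <code>Noisy</code> type, the <code>DROPS</code> counter, and <code>drop_through_pointer</code> are made up for the demonstration; <code>drop_wrapper</code> here is a simplified stand-in for the wrapper the vtable would point to:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::sync::atomic::{AtomicUsize, Ordering};

type ErasedData = ();

// Counts how many times a `Noisy` destructor has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy {
    _payload: u32,
}

impl Drop for Noisy {
    fn drop(&amp;mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Safety: `this` must be the raw form of a live `Box&lt;T&gt;`,
// and the caller gives up ownership of it.
unsafe fn drop_wrapper&lt;T&gt;(this: *mut ErasedData) {
    drop(unsafe { Box::from_raw(this as *mut T) });
}

fn drop_through_pointer() -&gt; usize {
    let data = Box::into_raw(Box::new(Noisy { _payload: 22 })) as *mut ErasedData;
    // Only this function pointer remembers that the data is a `Noisy`.
    let drop_fn: unsafe fn(*mut ErasedData) = drop_wrapper::&lt;Noisy&gt;;
    let before = DROPS.load(Ordering::SeqCst);
    unsafe { drop_fn(data) };
    DROPS.load(Ordering::SeqCst) - before
}
</code></pre></div><p>Calling <code>drop_through_pointer()</code> runs the <code>Noisy</code> destructor exactly once, even though the call site only ever sees a <code>*mut ErasedData</code>.</p>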
<h3 id="creating-an-instance-of-dynasynciter">Creating an instance of <code>DynAsyncIter</code></h3>
<p>To create one of these <code>DynAsyncIter</code> objects, we can implement the <code>From</code> trait. This allocates a box, coerces it into a raw pointer, and then combines that with the vtable:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">Item</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">From</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">DynAsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">AsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">from</span><span class="p">(</span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">DynAsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">boxed_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">DynAsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">data</span>: <span class="nb">Box</span>::<span class="n">into_raw</span><span class="p">(</span><span class="n">boxed_value</span><span class="p">)</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">vtable</span>: <span class="nc">dyn_async_iter_vtable</span>::<span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(),</span><span class="w"> </span><span class="c1">// we’ll cover this fn later
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="creating-the-vtable-shims">Creating the vtable shims</h3>
<p>Now we come to the most interesting part: how do we create the vtable for one of these objects? Recall that our vtable was a struct like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">DynAsyncIterVtable</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">drop_fn</span>: <span class="nc">unsafe</span><span class="w"> </span><span class="k">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">ErasedData</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next_fn</span>: <span class="nc">unsafe</span><span class="w"> </span><span class="k">fn</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">ErasedData</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">'_</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We are going to need to create the values for each of those fields. In an ordinary <code>dyn</code>, these would be pointers directly to the methods from the <code>impl</code>, but for us they are “wrapper functions” around the core trait functions. The role of these wrappers is to introduce some minor coercions, such as allocating a box for the resulting future, as well as to adapt from the “erased data” to the true type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Safety conditions:
</span></span></span><span class="line"><span class="cl"><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="c1">// The `*mut ErasedData` is actually the raw form of a `Box&lt;T&gt;` 
</span></span></span><span class="line"><span class="cl"><span class="c1">// that is valid for 'a.
</span></span></span><span class="line"><span class="cl"><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next_wrapper</span><span class="o">&lt;</span><span class="na">'a</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">this</span>: <span class="kp">&amp;</span><span class="na">'a</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">ErasedData</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">'a</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">AsyncIter</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">unerased_this</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="p">(</span><span class="n">this</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">future</span>: <span class="nc">T</span>::<span class="n">Next</span><span class="o">&lt;</span><span class="na">'_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&lt;</span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">AsyncIter</span><span class="o">&gt;</span>::<span class="n">next</span><span class="p">(</span><span class="n">unerased_this</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">future</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We’ll also need a “drop” wrapper:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Safety conditions:
</span></span></span><span class="line"><span class="cl"><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="c1">// The `*mut ErasedData` is actually the raw form of a `Box&lt;T&gt;` 
</span></span></span><span class="line"><span class="cl"><span class="c1">// and this function is being given ownership of it.
</span></span></span><span class="line"><span class="cl"><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">drop_wrapper</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">this</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">ErasedData</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">AsyncIter</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">unerased_this</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">from_raw</span><span class="p">(</span><span class="n">this</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">drop</span><span class="p">(</span><span class="n">unerased_this</span><span class="p">);</span><span class="w"> </span><span class="c1">// Execute destructor as normal
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="constructing-the-vtable">Constructing the vtable</h3>
<p>Now that we’ve defined the wrappers, we can construct the vtable itself. Recall that the <code>From</code> impl called a function <code>dyn_async_iter_vtable::&lt;T&gt;</code>. That function looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">dyn_async_iter_vtable</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nb">'static</span><span class="w"> </span><span class="n">DynAsyncIterVtable</span><span class="o">&lt;</span><span class="n">T</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">AsyncIter</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">const</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&amp;</span><span class="n">DynAsyncIterVtable</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">drop_fn</span>: <span class="nc">drop_wrapper</span>::<span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">next_fn</span>: <span class="nc">next_wrapper</span>::<span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This constructs a struct with the two function pointers. Because the struct contains only static data, we are allowed to return a <code>&amp;'static</code> reference to it.</p>
<p>Done!</p>
<h3 id="and-now-the-caveat-and-a-plea-for-help">And now the caveat, and a plea for help</h3>
<p>Unfortunately, this setup doesn&rsquo;t work quite how I described it. There are two problems:</p>
<ul>
<li><code>const</code> functions and expressions still have a lot of limitations, especially around generics like <code>T</code>, and I couldn&rsquo;t get them to work;</li>
<li>Because of the rules introduced by <a href="https://rust-lang.github.io/rfcs/1214-projections-lifetimes-and-wf.html">RFC 1214</a>, the <code>&amp;'static DynAsyncIterVtable&lt;T::Item&gt;</code> type requires that <code>T::Item: 'static</code>, which may not be true here. This condition perhaps shouldn&rsquo;t be necessary, but the compiler currently enforces it.</li>
</ul>
<p>I wound up hacking something terrible that erased the <code>T::Item</code> type into uses and used <code>Box::leak</code> to get a <code>&amp;'static</code> reference, just to prove out the concept. I&rsquo;m almost embarrassed to <a href="https://github.com/nikomatsakis/ergo-dyn/blob/3503770e08177a6d59e202f88cb7227863331685/examples/async-iter-manual-desugar.rs#L107-L118">show the code</a>, but there it is.</p>
<p>Anyway, I know people have done some pretty clever tricks, so I&rsquo;d be curious to know if I&rsquo;m missing something and there <em>is</em> a way to build this vtable on Rust today. Regardless, it seems like extending <code>const</code> and a few other things to support this case is a relatively light lift, if we wanted to do that.</p>
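<p>For readers who want something they can compile, here is a rough end-to-end sketch of the pieces above. It is <em>not</em> the code linked above, and it sidesteps both problems rather than solving them. It makes several assumptions of its own: it requires <code>T: 'static</code> (and hence <code>Item: 'static</code>), it builds the vtable with <code>Box::leak</code> instead of a <code>const</code> block, it uses <code>Pin&lt;Box&lt;dyn Future&gt;&gt;</code> so the boxed future can be awaited directly, it dereferences the raw pointer in <code>next_wrapper</code> instead of casting to <code>Box&lt;T&gt;</code>, and it uses a hypothetical inherent constructor <code>DynAsyncIter::new</code> in place of the <code>From</code> impl. The <code>Counter</code> iterator, <code>block_on</code>, and <code>demo_sum</code> exist only to exercise it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncIter {
    type Item;
    type Next&lt;'me&gt;: Future&lt;Output = Option&lt;Self::Item&gt;&gt; where Self: 'me;
    fn next(&amp;mut self) -&gt; Self::Next&lt;'_&gt;;
}

type ErasedData = ();
// Assumption: a pinned box, so that the erased future can be awaited directly.
type BoxFut&lt;'a, Item&gt; = Pin&lt;Box&lt;dyn Future&lt;Output = Option&lt;Item&gt;&gt; + 'a&gt;&gt;;

struct DynAsyncIterVtable&lt;Item&gt; {
    drop_fn: unsafe fn(*mut ErasedData),
    next_fn: unsafe fn(&amp;mut *mut ErasedData) -&gt; BoxFut&lt;'_, Item&gt;,
}

struct DynAsyncIter&lt;Item: 'static&gt; {
    data: *mut ErasedData,
    vtable: &amp;'static DynAsyncIterVtable&lt;Item&gt;,
}

// Safety: `*this` must be the raw form of a live `Box&lt;T&gt;`.
unsafe fn next_wrapper&lt;'a, T: AsyncIter + 'static&gt;(
    this: &amp;'a mut *mut ErasedData,
) -&gt; BoxFut&lt;'a, T::Item&gt; {
    let unerased: &amp;'a mut T = unsafe { &amp;mut *(*this as *mut T) };
    Box::pin(&lt;T as AsyncIter&gt;::next(unerased))
}

// Safety: `this` must be the raw form of a live `Box&lt;T&gt;`; ownership moves here.
unsafe fn drop_wrapper&lt;T&gt;(this: *mut ErasedData) {
    drop(unsafe { Box::from_raw(this as *mut T) });
}

impl&lt;Item: 'static&gt; DynAsyncIter&lt;Item&gt; {
    // Hypothetical constructor standing in for the `From` impl.
    fn new&lt;T: AsyncIter&lt;Item = Item&gt; + 'static&gt;(value: T) -&gt; Self {
        // Leaks one small vtable per call: acceptable for a demo only.
        let vtable: &amp;'static DynAsyncIterVtable&lt;Item&gt; = Box::leak(Box::new(DynAsyncIterVtable {
            drop_fn: drop_wrapper::&lt;T&gt;,
            next_fn: next_wrapper::&lt;T&gt;,
        }));
        DynAsyncIter { data: Box::into_raw(Box::new(value)) as *mut ErasedData, vtable }
    }
}

impl&lt;Item: 'static&gt; AsyncIter for DynAsyncIter&lt;Item&gt; {
    type Item = Item;
    type Next&lt;'me&gt; = BoxFut&lt;'me, Item&gt; where Self: 'me;
    fn next(&amp;mut self) -&gt; Self::Next&lt;'_&gt; { unsafe { (self.vtable.next_fn)(&amp;mut self.data) } }
}

impl&lt;Item: 'static&gt; Drop for DynAsyncIter&lt;Item&gt; {
    fn drop(&amp;mut self) { unsafe { (self.vtable.drop_fn)(self.data) } }
}

// A trivial iterator yielding 1..=limit, used only to exercise the wrapper.
struct Counter { n: u32, limit: u32 }

impl AsyncIter for Counter {
    type Item = u32;
    type Next&lt;'me&gt; = Ready&lt;Option&lt;u32&gt;&gt; where Self: 'me;
    fn next(&amp;mut self) -&gt; Self::Next&lt;'_&gt; {
        self.n += 1;
        ready((self.n &lt;= self.limit).then_some(self.n))
    }
}

async fn sum_items(iter: &amp;mut impl AsyncIter&lt;Item = u32&gt;) -&gt; u32 {
    let mut s = 0;
    while let Some(v) = iter.next().await {
        s += v;
    }
    s
}

struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc&lt;Self&gt;) {} }

// Minimal busy-polling executor; enough here because nothing ever pends.
fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&amp;waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&amp;mut cx) { return v; }
    }
}

fn demo_sum() -&gt; u32 {
    let mut the_iter = DynAsyncIter::new(Counter { n: 0, limit: 4 });
    block_on(sum_items(&amp;mut the_iter))
}
</code></pre></div><p>Here <code>demo_sum()</code> sums the items 1 through 4 via the erased iterator. The <code>'static</code> bound and the leaked vtable are exactly the compromises described above, so this should be read as a way to experiment with the shape of the design, not as an answer to the question of how to build the vtable properly.</p>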
<h3 id="conclusion">Conclusion</h3>
<p>This blog post presented a way to implement the dyn dispatch ideas I&rsquo;ve been talking about, using only features that currently exist and are generally en route to stabilization. That&rsquo;s exciting to me, because it means that we can start to do measurements and experimentation. For example, I would really like to know the performance impact of transitioning from <code>async-trait</code> to a scheme that uses a combination of static dispatch and boxed dynamic dispatch as described here. I would also like to explore whether there are other ways to wrap futures (e.g., with task-local allocators or other smart pointers) that might perform better. This would help inform what kind of capabilities we ultimately need.</p>
<p>Looking beyond async, I&rsquo;m interested in tinkering with different models for <code>dyn</code> in general. As an obvious example, the &ldquo;always boxed&rdquo; version I implemented here has some runtime cost (an allocation!) and isn&rsquo;t applicable in all environments, but it would be far more ergonomic. Trait objects would be Sized and would transparently work in far more contexts. We can also prototype different kinds of vtable adaptation.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In the words of Bjarne Stroustrup, “What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.”&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Egads, I need a snazzier name than that!&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">CTCFT 2021-10-18 Agenda</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/10/13/ctcft-2021-10-18-agenda/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/10/13/ctcft-2021-10-18-agenda/</id><published>2021-10-13T00:00:00+00:00</published><updated>2021-10-13T17:13:00-04:00</updated><content type="html"><![CDATA[<p><img src="https://raw.githubusercontent.com/rust-ctcft/ctcft/main/img/camprust.png" width="222" style="float:left;"/> The next <a href="https://rust-ctcft.github.io/ctcft/">&ldquo;Cross Team Collaboration Fun Times&rdquo; (CTCFT)</a> meeting will take place next Monday, on 2021-10-18 (<a href="https://everytimezone.com/s/b65371cd">in your time zone</a>)! This post covers the agenda. You&rsquo;ll find the full details (along with a calendar event, zoom details, etc) <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-10-18.html">on the CTCFT website</a>.</p>
<div style="clear:both;"></div>
<h3 id="agenda">Agenda</h3>
<p>The theme for this meeting is exploring ways to empower and organize contributors.</p>
<ul>
<li>(5 min) Opening remarks 👋 (<a href="https://github.com/nikomatsakis">nikomatsakis</a>)</li>
<li>(5 min) CTCFT update (<a href="https://github.com/angelonfira">angelonfira</a>)</li>
<li>(20 min) Sprints and groups implementing the async vision doc (<a href="https://github.com/tmandry">tmandry</a>)</li>
<li>(15 min) rust-analyzer talk (TBD)
<ul>
<li>The <code>rust-analyzer</code> project aims to succeed RLS as the official language server for Rust. We talk about how it differs from RLS, how it is developed, and what to expect in the future.</li>
</ul>
</li>
<li>(10 min) Contributor survey (<a href="https://github.com/yaahc">yaahc</a>)
<ul>
<li>Introducing the contributor survey, its goals and methodology, and soliciting community feedback</li>
</ul>
</li>
<li>(5 min) Closing (<a href="https://github.com/nikomatsakis">nikomatsakis</a>)</li>
</ul>
<h3 id="afterwards-social-hour">Afterwards: Social hour</h3>
<p>After the CTCFT this week, we are going to try an experimental <strong>social hour</strong>. The hour will be coordinated in the #ctcft stream of the rust-lang Zulip. The idea is to create breakout rooms where people can gather to talk, hack together, or just chill.</p>
]]></content></entry><entry><title type="html">Dyn async traits, part 4</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/10/07/dyn-async-traits-part-4/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/10/07/dyn-async-traits-part-4/</id><published>2021-10-07T00:00:00+00:00</published><updated>2021-10-07T12:33:00-04:00</updated><content type="html"><![CDATA[<p>In the previous post, I talked about how we could write our own <code>impl Iterator for dyn Iterator</code> by adding a few primitives. In this post, I want to look at what it would take to extend that to an async iterator trait. As before, I am interested in exploring the “core capabilities” that would be needed to make everything work.</p>
<h2 id="start-somewhere-just-assume-we-want-box">Start somewhere: Just assume we want Box</h2>
<p>In the <a href="https://smallcultfollowing.com/babysteps/blog/2021/09/30/dyn-async-traits-part-1/#conclusion-ideally-we-want-box-when-using-dyn-but-not-otherwise">first post of this series</a>, we talked about how invoking an async fn through a dyn trait should have the return type of that async fn be a <code>Box&lt;dyn Future&gt;</code>, but only when calling it through a dyn type, not all the time.</p>
<p>Actually, that’s a slight simplification: <code>Box&lt;dyn Future&gt;</code> is certainly one type we could use, but there are other types you might want:</p>
<ul>
<li><code>Box&lt;dyn Future + Send&gt;</code>, to indicate that the future is sendable across threads;</li>
<li>Some other wrapper type besides <code>Box</code>.</li>
</ul>
<p>To keep things simple, I’m just going to look at <code>Box&lt;dyn Future&gt;</code> in this post. We’ll come back to some of those extensions later.</p>
<h2 id="background-running-example">Background: Running example</h2>
<p>Let’s start by recalling the <code>AsyncIter</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Remember that when we “desugared” this <code>async fn</code>, we introduced a new (generic) associated type for the future returned by <code>next</code>, called <code>Next</code> here:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;me</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Next</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We were working with a struct <code>SleepyRange</code> that implements <code>AsyncIter</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">SleepyRange</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SleepyRange</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="err">…</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="background-associated-types-in-a-static-vs-dyn-context">Background: Associated types in a static vs dyn context</h2>
<p>Using an associated type is great in a static context, because it means that when you call <code>sleepy_range.next()</code>, we are able to resolve the returned future type precisely. This helps us to allocate exactly as much stack as is needed and so forth.</p>
<p>But in a dynamic context, i.e. if you have <code>some_iter: Box&lt;dyn AsyncIter&gt;</code> and you invoke <code>some_iter.next()</code>, that’s a liability. The whole point of using <code>dyn</code> is that we don’t know exactly what implementation of <code>AsyncIter::next</code> we are invoking, so we can’t know exactly what future type is returned. Really, we just want to get back a <code>Box&lt;dyn Future&lt;Output = Option&lt;u32&gt;&gt;&gt;</code> — or something very similar.</p>
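<p>To make the contrast concrete, here is a minimal sketch on stable Rust. The names <code>next_value</code> and <code>sizes</code> are hypothetical and do not come from the series; the only point is that the size of a concrete future is known statically, while a boxed <code>dyn Future</code> is a two-word fat pointer no matter what it wraps.</p>

```rust
use std::future::Future;
use std::mem::size_of_val;
use std::pin::Pin;

// Hypothetical async fn whose future has to store a 64-byte buffer.
async fn next_value(buf: [u8; 64]) -> Option<u32> {
    Some(u32::from(buf[0]))
}

/// Returns (size of the concrete future, size of the boxed handle).
pub fn sizes() -> (usize, usize) {
    // Static context: the compiler knows the exact future type, so it knows
    // exactly how much space the future needs.
    let concrete = next_value([7; 64]);
    let concrete_size = size_of_val(&concrete);

    // Dynamic context: the future moves to the heap and we hold only a fat
    // pointer (data pointer + vtable pointer) of fixed size.
    let boxed: Pin<Box<dyn Future<Output = Option<u32>>>> = Box::pin(concrete);
    (concrete_size, size_of_val(&boxed))
}
```

<p>Calling <code>sizes()</code> should report a first component of at least 64 bytes and a second component of exactly two pointers. The exact size of the concrete future is a compiler detail, so treat the first number as a lower bound.</p>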
<h2 id="how-could-we-have-a-trait-that-boxes-futures-but-only-when-using-dyn">How could we have a trait that boxes futures, but only when using dyn?</h2>
<p>If we want the trait to only box futures when using <code>dyn</code>, there are two things we need.</p>
<p><strong>First, we need to change the <code>impl AsyncIter for dyn AsyncIter</code>.</strong> Today, the compiler generates an impl that is generic over the value of every associated type. What we want instead is an impl that is generic over the value of the <code>Item</code> type, but which <em>specifies</em> the value of the <code>Next</code> type to be <code>Box&lt;dyn Future&gt;</code>. This way, we are effectively saying that “when you call the <code>next</code> method on a <code>dyn AsyncIter</code>, you always get a <code>Box&lt;dyn Future&gt;</code> back” (but when you call the <code>next</code> method on a specific type, such as a <code>SleepyRange</code>, you would get back a different type: the actual future type, not a boxed version). If we were to write that dyn impl in Rust code, it might look something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Next</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="cm">/* see below */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The body of the <code>next</code> function is code that extracts the function pointer from the vtable and calls it. Something like this, relying on the APIs from RFC 2580 along with the function <code>associated_fn</code> that I sketched in the previous post:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Next</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">RuntimeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">data_pointer</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">vtable</span>: <span class="nc">DynMetadata</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ptr</span>::<span class="n">metadata</span><span class="p">(</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">fn_pointer</span>: <span class="nc">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">associated_fn</span>::<span class="o">&lt;</span><span class="n">AsyncIter</span>::<span class="n">next</span><span class="o">&gt;</span><span class="p">(</span><span class="n">vtable</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">fn_pointer</span><span class="p">(</span><span class="n">data_pointer</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is still the code we want. However, there is a slight wrinkle.</p>
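<p>Setting the wrinkle aside for a moment, the dispatch steps described above (take the data pointer, fetch a function pointer from the vtable, call it) can be approximated on stable Rust with a hand-rolled vtable. This is only a sketch under my own assumptions: <code>Counter</code>, <code>Upto</code>, <code>CounterVtable</code>, <code>DynCounter</code>, and <code>next_erased</code> are hypothetical names, the trait is synchronous to keep things short, and none of this uses the RFC 2580 APIs or <code>associated_fn</code>.</p>

```rust
use std::marker::PhantomData;

// A simplified, non-async iterator-like trait for the demonstration.
trait Counter {
    fn next(&mut self) -> Option<u32>;
}

struct Upto {
    cur: u32,
    end: u32,
}

impl Counter for Upto {
    fn next(&mut self) -> Option<u32> {
        if self.cur < self.end {
            self.cur += 1;
            Some(self.cur - 1)
        } else {
            None
        }
    }
}

// Hypothetical hand-written vtable with a single entry, standing in for the
// one the compiler would generate.
struct CounterVtable {
    next: unsafe fn(*mut ()) -> Option<u32>,
}

// Type-erased entry point: recovers the concrete type, then calls its method.
unsafe fn next_erased<T: Counter>(data: *mut ()) -> Option<u32> {
    // SAFETY: the caller guarantees `data` was derived from a live `&mut T`.
    let this = unsafe { &mut *data.cast::<T>() };
    this.next()
}

// A manual "fat pointer": data pointer plus vtable, tied to the borrow `'a`.
struct DynCounter<'a> {
    data: *mut (),
    vtable: CounterVtable,
    _borrow: PhantomData<&'a mut ()>,
}

impl<'a> DynCounter<'a> {
    fn new<T: Counter>(value: &'a mut T) -> Self {
        DynCounter {
            data: (value as *mut T).cast::<()>(),
            vtable: CounterVtable { next: next_erased::<T> },
            _borrow: PhantomData,
        }
    }
}

// The hand-written analogue of `impl Counter for dyn Counter`.
impl Counter for DynCounter<'_> {
    fn next(&mut self) -> Option<u32> {
        let fn_pointer = self.vtable.next;
        // SAFETY: `data` and `vtable` were built together from one `&'a mut T`.
        unsafe { fn_pointer(self.data) }
    }
}

pub fn drain_dynamically() -> Vec<u32> {
    let mut upto = Upto { cur: 1, end: 4 };
    let mut erased = DynCounter::new(&mut upto);
    std::iter::from_fn(|| erased.next()).collect()
}
```

<p>Here <code>drain_dynamically()</code> never names <code>Upto::next</code> at the call site; every call goes through the erased data pointer and the stored function pointer.</p>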
<h2 id="constructing-the-vtable-async-functions-need-a-shim-to-return-a-box">Constructing the vtable: Async functions need a shim to return a <code>Box</code></h2>
<p>In the <code>next</code> method above, the type of the function pointer that we extracted from the vtable was the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>However, the signature of the function in the impl is different! It doesn’t return a <code>Box</code>, it returns an <code>impl Future</code>! Somehow we have to bridge this gap. What we need is a kind of “shim function”, something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">next_box_shim</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">AsyncIter</span><span class="o">&gt;</span><span class="p">(</span><span class="n">this</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">future</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">AsyncIter</span>::<span class="n">next</span><span class="p">(</span><span class="n">this</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">future</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now the vtable for <code>SleepyRange</code> can store <code>next_box_shim::&lt;SleepyRange&gt;</code> instead of storing <code>&lt;SleepyRange as AsyncIter&gt;::next</code> directly.</p>
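<p>The shim idea can be tried out on stable Rust today, at least roughly. The sketch below makes several assumptions of its own: the shim returns <code>Pin&lt;Box&lt;dyn Future&gt;&gt;</code> rather than a plain <code>Box</code> so that the result can be polled directly, <code>SleepyRange</code> is a stand-in that never actually sleeps, and <code>block_on</code>, <code>NoopWaker</code>, and <code>collect_via_shim</code> are hypothetical helpers written just for this example.</p>

```rust
use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The desugared trait from this post, written with a stable GAT.
trait AsyncIter {
    type Item;
    type Next<'me>: Future<Output = Option<Self::Item>> + 'me
    where
        Self: 'me;
    fn next(&mut self) -> Self::Next<'_>;
}

// Stand-in for `SleepyRange`: yields `cur..end` without sleeping.
struct SleepyRange {
    cur: u32,
    end: u32,
}

impl AsyncIter for SleepyRange {
    type Item = u32;
    type Next<'me> = Ready<Option<u32>> where Self: 'me;
    fn next(&mut self) -> Self::Next<'_> {
        let item = if self.cur < self.end {
            self.cur += 1;
            Some(self.cur - 1)
        } else {
            None
        };
        ready(item)
    }
}

// The boxing shim: calls the statically known `next`, then boxes the future.
fn next_box_shim<T: AsyncIter>(
    this: &mut T,
) -> Pin<Box<dyn Future<Output = Option<T::Item>> + '_>> {
    Box::pin(this.next())
}

// Minimal executor (assumption: just enough to drive the demo futures).
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future + ?Sized>(mut fut: Pin<Box<F>>) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

pub fn collect_via_shim() -> Vec<u32> {
    // What the vtable slot would hold: the shim, not the impl's own `next`.
    let vtable_entry: for<'a> fn(
        &'a mut SleepyRange,
    ) -> Pin<Box<dyn Future<Output = Option<u32>> + 'a>> = next_box_shim::<SleepyRange>;

    let mut range = SleepyRange { cur: 0, end: 3 };
    let mut out = Vec::new();
    while let Some(v) = block_on(vtable_entry(&mut range)) {
        out.push(v);
    }
    out
}
```

<p>The interesting part is the type of <code>vtable_entry</code>: it mentions only <code>dyn Future</code>, so a caller holding that function pointer never needs to know the concrete future type that <code>SleepyRange</code> produces.</p>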
<h2 id="extending-the-associatedfn-trait">Extending the <code>AssociatedFn</code> trait</h2>
<p>In my previous post, I sketched out the idea of an <code>AssociatedFn</code> trait that had an associated type <code>FnPtr</code>. If we wanted to make the construction of this sort of shim automated, we would want to change that from an associated type into its own trait. I’m imagining something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AssociatedFn</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Reify</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span>: <span class="nc">AssociatedFn</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">reify</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span><span class="p">;</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>where <code>A: Reify&lt;F&gt;</code> indicates that the associated function <code>A</code> can be “reified” (made into a function pointer) for a function type <code>F</code>. The compiler could implement this trait for the direct mapping where possible, but also for various kinds of shims and ABI transformations. For example, the <code>AsyncIter::next</code> method might implement <code>Reify&lt;fn(*mut ()) -&gt; Box&lt;dyn Future&lt;..&gt;&gt;&gt;</code> to allow a “boxing shim” to be constructed and so forth.</p>
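<p>To get a feel for how one associated function could be reified at more than one function type, here is a rough emulation on stable Rust. Everything beyond the two traits is an assumption made for illustration: a zero-sized marker type <code>NextOf&lt;T&gt;</code> stands in for "the <code>next</code> method of <code>T</code>", the trait <code>Counter</code> is synchronous, and the impls that the compiler would supply are written by hand.</p>

```rust
use std::marker::PhantomData;

trait Counter {
    fn next(&mut self) -> Option<u32>;
}

struct Upto {
    cur: u32,
    end: u32,
}

impl Counter for Upto {
    fn next(&mut self) -> Option<u32> {
        if self.cur < self.end {
            self.cur += 1;
            Some(self.cur - 1)
        } else {
            None
        }
    }
}

// The two traits as sketched in the post.
trait AssociatedFn {}
trait Reify<F>: AssociatedFn {
    fn reify(self) -> F;
}

// Hypothetical marker naming "the `next` method of `T`".
struct NextOf<T>(PhantomData<T>);
impl<T: Counter> AssociatedFn for NextOf<T> {}

type Direct<T> = fn(&mut T) -> Option<u32>;
type Boxed<T> = fn(&mut T) -> Box<Option<u32>>;

// Direct mapping: the method itself already has the requested signature.
impl<T: Counter> Reify<Direct<T>> for NextOf<T> {
    fn reify(self) -> Direct<T> {
        <T as Counter>::next
    }
}

// Shim mapping: a wrapper function adapts the return type by boxing it.
impl<T: Counter> Reify<Boxed<T>> for NextOf<T> {
    fn reify(self) -> Boxed<T> {
        fn shim<U: Counter>(this: &mut U) -> Box<Option<u32>> {
            Box::new(this.next())
        }
        shim::<T>
    }
}

pub fn reify_both() -> (Option<u32>, Box<Option<u32>>) {
    let mut upto = Upto { cur: 0, end: 1 };
    // The requested function type selects which impl (direct or shim) is used.
    let direct = Reify::<Direct<Upto>>::reify(NextOf::<Upto>(PhantomData));
    let boxed = Reify::<Boxed<Upto>>::reify(NextOf::<Upto>(PhantomData));
    (direct(&mut upto), boxed(&mut upto))
}
```

<p>The caller picks the function type it needs and the trait system picks the matching impl, which is roughly the role I imagine the compiler playing when it fills in a vtable slot.</p>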
<h2 id="other-sorts-of-shims">Other sorts of shims</h2>
<p>There are other sorts of limitations around dyn traits that could be overcome with judicious use of shims and tweaked vtables, at least in some cases. As an example, consider this trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Append</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">append</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">values</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This trait is not traditionally dyn-safe because the <code>append</code> function is generic and requires monomorphization for each kind of iterator — therefore, we don’t know which version to put in the vtable for <code>Append</code>, since we don’t yet know the types of iterators it will be applied to! But what if we just put <em>one</em> version, the case where the iterator type is <code>&amp;mut dyn Iterator&lt;Item = u32&gt;</code>? We could then tweak the <code>impl Append for dyn Append</code> to create this <code>&amp;mut dyn Iterator</code> and call the function from the vtable:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Append</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">Append</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">append</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">values</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">values_dyn</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">values</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">type</span> <span class="nc">RuntimeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">data_pointer</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">vtable</span>: <span class="nc">DynMetadata</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ptr</span>::<span class="n">metadata</span><span class="p">(</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">associated_fn</span>::<span class="o">&lt;</span><span class="n">Append</span>::<span class="n">append</span><span class="o">&gt;</span><span class="p">(</span><span class="n">vtable</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">f</span><span class="p">(</span><span class="n">data_pointer</span><span class="p">,</span><span class="w"> </span><span class="n">values_dyn</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="conclusion">Conclusion</h2>
<p>So where does this leave us? The core building blocks for “dyn async traits” seem to be:</p>
<ul>
<li>The ability to customize the contents of the vtable that gets generated for a trait.
<ul>
<li>For example, async fns need shim functions that box the output.</li>
</ul>
</li>
<li>The ability to customize the dispatch logic (<code>impl Foo for dyn Foo</code>).</li>
<li>The ability to customize associated types like <code>Next</code> to be a <code>Box&lt;dyn&gt;</code>:
<ul>
<li>This requires the ability to extract the vtable, as given by RFC 2580.</li>
<li>It also requires the ability to extract functions from the vtable (not presently supported).</li>
</ul>
</li>
</ul>
<p>I said at the outset that I was going to assume, for the purposes of this post, that we wanted to return a <code>Box&lt;dyn&gt;</code>, and I have.  It seems possible to extend these core capabilities to other sorts of return types (such as other smart pointers), but it’s not entirely trivial; we’d have to define what kinds of shims the compiler can generate.</p>
<p>I haven’t really thought very hard about how we might allow users to specify each of those building blocks, though I sketched out some possibilities. At this point, I’m mostly trying to explore the possibilities of what kinds of capabilities may be useful or necessary to expose.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Dyn async traits, part 3</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/10/06/dyn-async-traits-part-3/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/10/06/dyn-async-traits-part-3/</id><published>2021-10-06T00:00:00+00:00</published><updated>2021-10-06T11:06:00-04:00</updated><content type="html"><![CDATA[<p>In the previous &ldquo;dyn async traits&rdquo; posts, I talked about how we can think about the compiler as synthesizing an impl that performs the dynamic dispatch. In this post, I want to start exploring a theoretical future in which this impl is written manually by the Rust programmer. This is in part a thought exercise, but it’s also a possible ingredient for a future design: if we could give programmers more control over the “impl Trait for dyn Trait” impl, then we could enable a lot of use cases.</p>
<h3 id="example">Example</h3>
<p>For this post, <code>async fn</code> is kind of a distraction. Let’s just work with a simplified <code>Iterator</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As we discussed in the previous post, the compiler today generates an impl that is something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">type</span> <span class="nc">RuntimeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">data_pointer</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">vtable</span>: <span class="nc">DynMetadata</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ptr</span>::<span class="n">metadata</span><span class="p">(</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">fn_pointer</span>: <span class="nc">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">__get_next_fn_pointer__</span><span class="p">(</span><span class="n">vtable</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">fn_pointer</span><span class="p">(</span><span class="n">data_pointer</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This code draws on the APIs from <a href="https://rust-lang.github.io/rfcs/2580-ptr-meta.html">RFC 2580</a>, along with a healthy dash of “pseudo-code”. Let’s see what it does:</p>
<h4 id="extracting-the-data-pointer">Extracting the data pointer</h4>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">RuntimeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">data_pointer</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>Here, <code>self</code> is a wide pointer of type <code>&amp;mut dyn Iterator&lt;Item = I&gt;</code>. The rules for <code>as</code> state that casting a wide pointer to a thin pointer drops the metadata<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, so we can (ab)use that to get the data pointer. Here I just gave the pointer the type <code>*mut RuntimeType</code>, which is an alias for <code>*mut ()</code> — i.e., raw pointer to something. The type alias <code>RuntimeType</code> is meant to signify “whatever type of data we have at runtime”. Using <code>()</code> for this is a hack; the “proper” way to model it would be with an existential type. But since Rust doesn’t have those, and I’m not keen to add them if we don’t have to, we’ll just use this type alias for now.</p>
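<p>The cast described above can be tried on stable Rust. The following is a minimal sketch, not the compiler’s actual code: the helper name <code>data_address</code> is made up for illustration, and it uses <code>dyn Debug</code> rather than <code>dyn Iterator</code> only to keep the example short.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::fmt::Debug;

/// Hypothetical helper: keep only the data half of a (possibly wide) pointer.
fn data_address&lt;T: ?Sized&gt;(value: &amp;mut T) -&gt; *mut () {
    // `&amp;mut T` becomes `*mut T` (still wide when `T` is a `dyn` type);
    // the second cast to the thin `*mut ()` drops the metadata.
    value as *mut T as *mut ()
}

fn main() {
    let mut n = 42_u32;
    // Address of `n`, computed without going through a `dyn` type.
    let expected = &amp;mut n as *mut u32 as *mut ();

    let wide: &amp;mut dyn Debug = &amp;mut n;
    assert_eq!(data_address(wide), expected);

    // A wide pointer is two words (data + vtable); a thin pointer is one.
    assert_eq!(
        std::mem::size_of::&lt;*mut dyn Debug&gt;(),
        2 * std::mem::size_of::&lt;*mut ()&gt;()
    );
}
</code></pre></div><p>Writing the helper generically over <code>T: ?Sized</code> means the same two casts work for thin and wide pointers alike; for a thin pointer there is simply no metadata to drop.</p>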
<h4 id="extracting-the-vtable-or-dynmetadata">Extracting the vtable (or <code>DynMetadata</code>)</h4>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">vtable</span>: <span class="nc">DynMetadata</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ptr</span>::<span class="n">metadata</span><span class="p">(</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>The <a href="https://doc.rust-lang.org/std/ptr/fn.metadata.html"><code>ptr::metadata</code></a> function was added in <a href="https://rust-lang.github.io/rfcs/2580-ptr-meta.html">RFC 2580</a>. Its purpose is to extract the “metadata” from a wide pointer. The type of this metadata depends on the type of wide pointer you have: this is determined by the <a href="https://doc.rust-lang.org/std/ptr/trait.Pointee.html"><code>Pointee</code></a> trait. For <code>dyn</code> types, the metadata is a <a href="https://doc.rust-lang.org/std/ptr/struct.DynMetadata.html"><code>DynMetadata</code></a>, which just means “pointer to the vtable”. In today’s APIs, the <a href="https://doc.rust-lang.org/std/ptr/struct.DynMetadata.html"><code>DynMetadata</code></a> is pretty limited: it lets you extract the size/alignment of the underlying <code>RuntimeType</code>, but it doesn’t give any access to the actual function pointers that are inside.</p>
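<p>As far as I know, <code>ptr::metadata</code> and <code>DynMetadata</code> are still nightly-only, so here is a stable-Rust sketch of the same size/alignment information. It uses <code>size_of_val</code> and <code>align_of_val</code>, which for a <code>dyn</code> value have to consult the vtable because the concrete type is erased. The helper name <code>layout_of</code> and the choice of <code>dyn Debug</code> are just for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::fmt::Debug;
use std::mem::{align_of_val, size_of_val};

/// Hypothetical helper: (size, alignment) of the value behind a `dyn` reference.
/// The concrete type is not known here, so both numbers come from the vtable.
fn layout_of(value: &amp;dyn Debug) -&gt; (usize, usize) {
    (size_of_val(value), align_of_val(value))
}

fn main() {
    assert_eq!(layout_of(&amp;0_u8), (1, 1));
    assert_eq!(layout_of(&amp;[0_u16; 3]), (6, 2));
    // Alignment of `u64` is target-dependent, so compare against `align_of`.
    assert_eq!(layout_of(&amp;0_u64), (8, std::mem::align_of::&lt;u64&gt;()));
}
</code></pre></div><p>Note what is missing: nothing here hands back a function pointer, which is exactly the limitation discussed above.</p>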
<h4 id="extracting-the-function-pointer-from-the-vtable">Extracting the function pointer from the vtable</h4>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">fn_pointer</span>: <span class="nc">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">__get_next_fn_pointer__</span><span class="p">(</span><span class="n">vtable</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>Now we get to the pseudocode. <em>Somehow</em>, we need a way to get the fn pointer out from the vtable. At runtime, the way this works is that each method has an assigned offset within the vtable, and you basically do an array lookup; kind of like <code>vtable.methods()[0]</code>, where <code>methods()</code> returns an array <code>&amp;[fn()]</code> of function pointers. The problem is that there’s a lot of “dynamic typing” going on here: the signature of each one of those methods is going to be different. Moreover, we’d like some freedom to change how vtables are laid out. For example, the ongoing (and awesome!) work on dyn upcasting by <a href="https://github.com/crlf0710">Charles Lew</a> has required modifying our <a href="https://rust-lang.github.io/dyn-upcasting-coercion-initiative/design-discussions/vtable-layout.html">vtable layout</a>, and I expect further modification as we try to support <code>dyn</code> types with multiple traits, like <code>dyn Debug + Display</code>.</p>
<p>So, for now, let’s just leave this as pseudocode. Once we’ve finished walking through the example, I’ll return to this question of how we might model <code>__get_next_fn_pointer__</code> in a forwards compatible way.</p>
<p>One thing worth pointing out: the type of <code>fn_pointer</code> is a <code>fn(*mut RuntimeType) -&gt; Option&lt;I&gt;</code>. There are two interesting things going on here:</p>
<ul>
<li>The argument has type <code>*mut RuntimeType</code>: using the type alias indicates that this function is known to take a single pointer (in fact, it’s a reference, but those have the same layout). This pointer is expected to point to the same runtime data that <code>self</code> points at — we don’t know what it is, but we know that they’re the same. This works because <code>self</code> paired together a pointer to some data of type <code>RuntimeType</code> along with a vtable of functions that expect <code>RuntimeType</code> references.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></li>
<li>The return type is <code>Option&lt;I&gt;</code>, where <code>I</code> is the item type: this is interesting because although we don’t know statically what the <code>Self</code> type is, we <em>do</em> know the <code>Item</code> type. In fact, we will generate a distinct copy of this impl for every kind of item. This allows us to easily pass the return value.</li>
</ul>
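<p>To make the pairing of data pointer and vtable concrete, here is a hand-rolled sketch on stable Rust. Everything in it (<code>IteratorVtable</code>, <code>DynIterator</code>, <code>Countdown</code>, <code>erase</code>) is hypothetical and only models the idea; the real vtable layout is a compiler detail and looks different.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">type RuntimeType = ();

// Hypothetical vtable with a single slot for `next`.
struct IteratorVtable&lt;I&gt; {
    next: fn(*mut RuntimeType) -&gt; Option&lt;I&gt;,
}

// A hand-built "wide pointer": data pointer plus vtable.
struct DynIterator&lt;'a, I&gt; {
    data_pointer: *mut RuntimeType,
    vtable: &amp;'a IteratorVtable&lt;I&gt;,
}

struct Countdown(u32);

impl Countdown {
    fn next(&amp;mut self) -&gt; Option&lt;u32&gt; {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0)
        }
    }
}

// Shim with the self type erased; this is what sits in the vtable slot.
fn countdown_next(data: *mut RuntimeType) -&gt; Option&lt;u32&gt; {
    // SAFETY: this shim is only stored in COUNTDOWN_VTABLE, which `erase`
    // only ever pairs with a pointer to a live `Countdown`.
    let this = unsafe { &amp;mut *(data as *mut Countdown) };
    this.next()
}

static COUNTDOWN_VTABLE: IteratorVtable&lt;u32&gt; = IteratorVtable { next: countdown_next };

fn erase(c: &amp;mut Countdown) -&gt; DynIterator&lt;'_, u32&gt; {
    DynIterator {
        data_pointer: c as *mut Countdown as *mut RuntimeType,
        vtable: &amp;COUNTDOWN_VTABLE,
    }
}

impl&lt;I&gt; DynIterator&lt;'_, I&gt; {
    fn next(&amp;mut self) -&gt; Option&lt;I&gt; {
        // Same shape as the impl above: look up the fn pointer, then call it.
        let fn_pointer: fn(*mut RuntimeType) -&gt; Option&lt;I&gt; = self.vtable.next;
        fn_pointer(self.data_pointer)
    }
}

fn main() {
    let mut c = Countdown(2);
    let mut it = erase(&amp;mut c);
    assert_eq!(it.next(), Some(1));
    assert_eq!(it.next(), Some(0));
    assert_eq!(it.next(), None);
}
</code></pre></div><p>The vtable is generic over the item type <code>I</code> but not over the self type, which mirrors the two observations in the list above.</p>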
<h4 id="calling-the-function">Calling the function</h4>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">fn_pointer</span><span class="p">(</span><span class="n">data_pointer</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>The final line in the code is very simple: we call the function! It returns an <code>Option&lt;I&gt;</code> and we can return that to our caller.</p>
<h3 id="returning-to-the-pseudocode">Returning to the pseudocode</h3>
<p>We relied on one piece of pseudocode in that imaginary impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">fn_pointer</span>: <span class="nc">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">__get_next_fn_pointer__</span><span class="p">(</span><span class="n">vtable</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>So how could we possibly turn <code>__get_next_fn_pointer__</code> from pseudocode into real code? There are two things worth noting:</p>
<ul>
<li>First, the name of this function already encodes the method we want (<code>next</code>). We probably don’t want to generate an infinite family of these “getter” functions.</li>
<li>Second, the signature of the function is specific to the method we want, since it returns a <code>fn</code> type (<code>fn(*mut RuntimeType) -&gt; Option&lt;I&gt;</code>) that encodes the signature for <code>next</code> (with the self type changed, of course). This seems better than just returning a generic signature like <code>fn()</code> that must be cast manually by the user; less opportunity for error.</li>
</ul>
<h3 id="using-zero-sized-fn-types-as-the-basis-for-an-api">Using zero-sized fn types as the basis for an API</h3>
<p>One way to solve these problems would be to build on the trait system. Imagine there were a type for every method, let’s call it <code>A</code>, and that this type implemented a trait like <code>AssociatedFn</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AssociatedFn</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// The type of the associated function, but as a `fn` pointer
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// with the self type erased. This is the type that would be
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// encoded in the vtable.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">FnPointer</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="err">…</span><span class="w"> </span><span class="c1">// maybe other things
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We could then define a generic “get function pointer” function like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">associated_fn</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="p">(</span><span class="n">vtable</span>: <span class="nc">DynMetadata</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">A</span>::<span class="n">FnPointer</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">A</span>: <span class="nc">AssociatedFn</span><span class="w">
</span></span></span></code></pre></div><p>Now instead of <code>__get_next_fn_pointer__</code>, we can write</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">NextMethodType</span><span class="w"> </span><span class="o">=</span><span class="w">  </span><span class="cm">/* type corresponding to the next method */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">fn_pointer</span>: <span class="nc">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">RuntimeType</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="n">associated_fn</span>::<span class="o">&lt;</span><span class="n">NextMethodType</span><span class="o">&gt;</span><span class="p">(</span><span class="n">vtable</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>Ah, but what is this <code>NextMethodType</code>? How do we <em>get</em> the type for the next method? Presumably we’d have to introduce some syntax, like <code>Iterator::next</code>.</p>
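<p>Here is one way the <code>AssociatedFn</code> idea could be modeled on stable Rust today, as a rough sketch under heavy assumptions. The <code>Vtable</code> struct stands in for <code>DynMetadata</code>, the marker types <code>NextMethod</code> and <code>RemainingMethod</code> stand in for the per-method types we don’t have syntax for, and the <code>lookup</code> method is my guess at one of the “maybe other things” the trait would need. None of this is an existing API.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">type RuntimeType = ();

// Hypothetical stand-in for `DynMetadata`: a vtable for a two-method trait.
#[derive(Clone, Copy)]
struct Vtable {
    next: fn(*mut RuntimeType) -&gt; Option&lt;u32&gt;,
    remaining: fn(*mut RuntimeType) -&gt; usize,
}

trait AssociatedFn {
    // The method's signature as a `fn` pointer with the self type erased.
    type FnPointer;

    // Assumption: each method type knows how to find its own vtable slot.
    fn lookup(vtable: Vtable) -&gt; Self::FnPointer;
}

// Zero-sized marker types, one per method.
struct NextMethod;
struct RemainingMethod;

impl AssociatedFn for NextMethod {
    type FnPointer = fn(*mut RuntimeType) -&gt; Option&lt;u32&gt;;
    fn lookup(vtable: Vtable) -&gt; Self::FnPointer {
        vtable.next
    }
}

impl AssociatedFn for RemainingMethod {
    type FnPointer = fn(*mut RuntimeType) -&gt; usize;
    fn lookup(vtable: Vtable) -&gt; Self::FnPointer {
        vtable.remaining
    }
}

// One generic getter instead of a family of `__get_*_fn_pointer__` functions.
fn associated_fn&lt;A: AssociatedFn&gt;(vtable: Vtable) -&gt; A::FnPointer {
    A::lookup(vtable)
}

// Erased shims for a `u32` used as a countdown.
fn counter_next(data: *mut RuntimeType) -&gt; Option&lt;u32&gt; {
    // SAFETY: callers in this sketch only pass pointers to a live `u32`.
    let n = unsafe { &amp;mut *(data as *mut u32) };
    if *n == 0 {
        None
    } else {
        *n -= 1;
        Some(*n)
    }
}

fn counter_remaining(data: *mut RuntimeType) -&gt; usize {
    // SAFETY: as above.
    unsafe { *(data as *mut u32) as usize }
}

fn main() {
    let vtable = Vtable { next: counter_next, remaining: counter_remaining };
    let mut n = 3_u32;
    let data = &amp;mut n as *mut u32 as *mut RuntimeType;

    let next = associated_fn::&lt;NextMethod&gt;(vtable);
    let remaining = associated_fn::&lt;RemainingMethod&gt;(vtable);
    assert_eq!(next(data), Some(2));
    assert_eq!(remaining(data), 2);
}
</code></pre></div><p>The getter returns a correctly typed pointer for each method, so the caller never casts from a generic <code>fn()</code>. The marker types do not tie the vtable to the trait in any checked way, though; that is the part that needs language support.</p>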
<h3 id="related-concept-zero-sized-fn-types">Related concept: zero-sized fn types</h3>
<p>This idea of a type for associated functions is <em>very close</em> (but not identical) to an already existing concept in Rust: zero-sized function types. As you may know, the type of a Rust function is in fact a special zero-sized type that uniquely identifies the function. There is (presently, anyway) no syntax for this type, but you can observe it by printing out the size of values (<a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2018&amp;gist=e30569b1f9e4a36e436b7335627dd1ba">playground</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// The type of `f` is not `fn()`. It is a special, zero-sized type that uniquely
</span></span></span><span class="line"><span class="cl"><span class="c1">// identifies `foo`
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">foo</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">size_of_val</span><span class="p">(</span><span class="o">&amp;</span><span class="n">f</span><span class="p">));</span><span class="w"> </span><span class="c1">// prints 0
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// This type can be coerced to `fn()`, which is a function pointer
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">g</span>: <span class="nc">fn</span><span class="p">()</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">f</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">size_of_val</span><span class="p">(</span><span class="o">&amp;</span><span class="n">g</span><span class="p">));</span><span class="w"> </span><span class="c1">// prints 8 (on a 64-bit target)
</span></span></span></code></pre></div><p>There are also types for functions that appear in impls. For example, you could get an instance of the type that represents the <code>next</code> method on <code>vec::IntoIter&lt;u32&gt;</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&lt;</span><span class="n">vec</span>::<span class="n">IntoIter</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&gt;</span>::<span class="n">next</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">size_of_val</span><span class="p">(</span><span class="o">&amp;</span><span class="n">x</span><span class="p">));</span><span class="w"> </span><span class="c1">// prints 0
</span></span></span></code></pre></div><h3 id="where-the-zero-sized-types-dont-fit">Where the zero-sized types don’t fit</h3>
<p>The existing zero-sized types can’t be used for our “associated function” type for two reasons:</p>
<ul>
<li>You can’t name them! We can fix this by adding syntax.</li>
<li>There is no zero-sized type for a <em>trait function independent of an impl</em>.</li>
</ul>
<p>The latter point is subtle<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. Before, when I talked about getting the type for a function from an impl, you’ll note that I gave a fully qualified function name, which specified the <code>Self</code> type precisely:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&lt;</span><span class="n">vec</span>::<span class="n">IntoIter</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&gt;</span>::<span class="n">next</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//       ^^^^^^^^^^^^^^^^^^ the Self type
</span></span></span></code></pre></div><p>But what we want in our impl is to write code that doesn’t know what the Self type is! So this type that exists in the Rust type system today isn’t quite what we need. But it’s very close.</p>
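<p>The size claims about function item types can be checked with a small self-contained program. This is a sketch: the helper <code>fn_type_sizes</code> is made up for the check, and I use <code>Range&lt;u32&gt;</code> instead of <code>vec::IntoIter&lt;u32&gt;</code> as the <code>Self</code> type only because it is shorter to write.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem::size_of_val;
use std::ops::Range;

fn foo() {}

/// Sizes of: the fn item type of `foo`, a `fn()` pointer to it, and the
/// fn item type of `next` as implemented for `Range&lt;u32&gt;`.
fn fn_type_sizes() -&gt; (usize, usize, usize) {
    let f = foo; // zero-sized type unique to `foo`
    let g: fn() = f; // coerced to an ordinary function pointer
    let x = &lt;Range&lt;u32&gt; as Iterator&gt;::next; // the Self type must be spelled out
    (size_of_val(&amp;f), size_of_val(&amp;g), size_of_val(&amp;x))
}

fn main() {
    let (item, pointer, impl_item) = fn_type_sizes();
    assert_eq!(item, 0);
    assert_eq!(pointer, std::mem::size_of::&lt;usize&gt;());
    assert_eq!(impl_item, 0);
}
</code></pre></div><p>There is no way to write the third line without naming some concrete <code>Self</code> type, which is the gap described above.</p>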
<h3 id="conclusion">Conclusion</h3>
<p>I’m going to leave it here. Obviously, I haven’t presented any kind of final design, but we’ve seen a lot of tantalizing ingredients:</p>
<ul>
<li>Today, the compiler generates an <code>impl Iterator for dyn Iterator</code> that extracts functions from a vtable and invokes them by magic.</li>
<li>But, using the APIs from <a href="https://rust-lang.github.io/rfcs/2580-ptr-meta.html">RFC 2580</a>, you can <em>almost</em> write that impl by hand. What is missing is a way to extract a function pointer from a vtable, and what makes <em>that</em> hard is that we need a way to identify the function we are extracting.</li>
<li>We have zero-sized types that represent functions today, but we don’t have a way to name them, and we don’t have zero-sized types for functions in traits, only in impls.</li>
</ul>
<p>Of course, all of the stuff I wrote here was just about normal functions. We still need to circle back to async functions, which add a few extra wrinkles. Until next time!</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I don’t actually like these rules, which have bitten me a few times. I think we should introduce an accessor function, but I didn’t see one in <a href="https://rust-lang.github.io/rfcs/2580-ptr-meta.html">RFC 2580</a> — maybe I missed it, or it already exists.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>If you used unsafe code to pair up a random pointer with an unrelated vtable, then hilarity would ensue here, as there is no runtime checking that these types line up.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>And, in fact, I didn’t see it until I was writing this blog post!&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Dyn async traits, part 2</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/10/01/dyn-async-traits-part-2/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/10/01/dyn-async-traits-part-2/</id><published>2021-10-01T00:00:00+00:00</published><updated>2021-10-01T11:56:00-04:00</updated><content type="html"><![CDATA[<p>In the <a href="https://smallcultfollowing.com/babysteps/blog/2021/09/30/dyn-async-traits-part-1/">previous post</a>, we uncovered a key challenge for <code>dyn</code> and async traits: the fact that, in Rust today, <code>dyn</code> types have to specify the values for all associated types. This post is going to dive into more background about how dyn traits work today, and in particular it will talk about where that limitation comes from.</p>
<h3 id="today-dyn-traits-implement-the-trait">Today: Dyn traits implement the trait</h3>
<p>In Rust today, assuming you have a “dyn-safe” trait <code>DoTheThing</code>, then the type <code>dyn DoTheThing</code> implements <code>DoTheThing</code>. Consider this trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">DoTheThing</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">do_the_thing</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">DoTheThing</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">String</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">do_the_thing</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And now imagine some generic function that uses the trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">some_generic_fn</span><span class="o">&lt;</span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">DoTheThing</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">t</span><span class="p">.</span><span class="n">do_the_thing</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Naturally, we can call <code>some_generic_fn</code> with a <code>&amp;String</code>, but — because <code>dyn DoTheThing</code> implements <code>DoTheThing</code> — we can also call <code>some_generic_fn</code> with a <code>&amp;dyn DoTheThing</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">some_nongeneric_fn</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">DoTheThing</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">some_generic_fn</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="dyn-safety-a-mini-retrospective">Dyn safety, a mini retrospective</h3>
<p>Early on in Rust, we debated whether <code>dyn DoTheThing</code> ought to implement the trait <code>DoTheThing</code> or not. This was, indeed, the origin of the term “dyn safe” (then called “object safe”). At the time, I argued in favor of the current approach: that is, creating a binary property. Either the trait was dyn safe, in which case <code>dyn DoTheThing</code> implements <code>DoTheThing</code>, or it was not, in which case <code>dyn DoTheThing</code> is not a legal type. I am no longer sure that was the right call.</p>
<p>What I liked at the time was the idea that, in this model, whenever you see a type like <code>dyn DoTheThing</code>, you know that you can use it like any other type that implements <code>DoTheThing</code>.</p>
<p>Unfortunately, in practice, the type <code>dyn DoTheThing</code> is not comparable to a type like <code>String</code>. Notably, <code>dyn</code> types are not sized, so you can’t pass them around by value or work with them like strings. You must instead always pass around some kind of <em>pointer</em> to them, such as a <code>Box&lt;dyn DoTheThing&gt;</code> or a <code>&amp;dyn DoTheThing</code>. This is “unusual” enough that we make you <em>opt-in</em> to it for generic functions, by writing <code>T: ?Sized</code>.</p>
<p>What this means is that, in practice, generic functions don’t accept <code>dyn</code> types “automatically”; you have to design <em>for</em> dyn explicitly. So a lot of the benefit I envisioned didn’t come to pass.</p>
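<p>To make the opt-in concrete, here is a minimal sketch. The trait below is a stand-in for <code>DoTheThing</code> (its <code>String</code> return type is an assumption, chosen so the result is visible), and the names <code>sized_only</code> and <code>maybe_unsized</code> are hypothetical:</p>

```rust
trait DoTheThing {
    fn do_the_thing(&self) -> String;
}

impl DoTheThing for String {
    fn do_the_thing(&self) -> String {
        format!("did the thing with {}", self)
    }
}

// Every type parameter carries an implicit `T: Sized` bound,
// so `T = dyn DoTheThing` is rejected here.
fn sized_only<T: DoTheThing>(x: &T) -> String {
    x.do_the_thing()
}

// Writing `?Sized` drops the implicit bound, so `T` may also be
// an unsized type such as `dyn DoTheThing`.
fn maybe_unsized<T: ?Sized + DoTheThing>(x: &T) -> String {
    x.do_the_thing()
}

fn main() {
    let s = String::from("hello");
    let d: &dyn DoTheThing = &s;

    println!("{}", sized_only(&s)); // T = String
    println!("{}", maybe_unsized(&s)); // T = String
    println!("{}", maybe_unsized(d)); // T = dyn DoTheThing

    // sized_only(d); // does not compile: `dyn DoTheThing` is not `Sized`
}
```

<p>Only the function that was written with <code>?Sized</code> in mind can be called with the <code>dyn</code> type.</p>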
<h3 id="static-versus-dynamic-dispatch-vtables">Static versus dynamic dispatch, vtables</h3>
<p>Let’s talk for a bit about dyn safety and where it comes from. To start, we need to explain the difference between <em>static dispatch</em> and <em>virtual (dyn) dispatch</em>. Simply put, static dispatch means that the compiler knows which function is being called, whereas dyn dispatch means that the compiler doesn’t know. In terms of the CPU itself, there isn’t much difference. With static dispatch, there is a “hard-coded” instruction that says “call the code at this address”<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>; with dynamic dispatch, there is an instruction that says “call the code whose address is in this variable”. The latter can be a bit slower, but it hardly matters in practice, particularly when the CPU’s branch predictor correctly guesses the call target.</p>
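<p>As a rough illustration of the two kinds of call, here is a sketch that uses plain function pointers rather than traits. All of the names are made up for this example, and an optimizing compiler is free to turn the second form back into a direct call when it can see which function is stored:</p>

```rust
fn double(x: u32) -> u32 {
    x * 2
}

fn square(x: u32) -> u32 {
    x * x
}

// Static dispatch: the callee is named in the source, so the compiler
// knows exactly which function this is.
fn call_static(x: u32) -> u32 {
    double(x)
}

// Dynamic dispatch: the callee's address arrives in the variable `f`,
// so the call goes through whatever pointer was passed in.
fn call_dynamic(f: fn(u32) -> u32, x: u32) -> u32 {
    f(x)
}

fn main() {
    println!("{}", call_static(5));

    // Pick the callee at runtime, based on whether any arguments were given.
    let f: fn(u32) -> u32 = if std::env::args().count() > 1 {
        square
    } else {
        double
    };
    println!("{}", call_dynamic(f, 5));
}
```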
<p>When you use a <code>dyn</code> trait, what you actually have is a <em>vtable</em>. You can think of a vtable as being a kind of struct that contains a collection of function pointers, one for each method in the trait. So the vtable type for the <code>DoTheThing</code> trait might look like (in practice, there is a bit of extra data, but this is close enough for our purposes):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">DoTheThingVtable</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_the_thing</span>: <span class="nc">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here the <code>do_the_thing</code> method has a corresponding field. Note that the type of the first argument <em>ought</em> to be <code>&amp;self</code>, but we changed it to <code>*mut ()</code>. This is because the whole idea of the vtable is that you don’t know what the <code>self</code> type is, so we just changed it to “some pointer” (which is all we need to know).</p>
<p>When you create a vtable, you are making an instance of this struct that is tailored to some particular type. In our example, the type <code>String</code> implements <code>DoTheThing</code>, so we might create the vtable for <code>String</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">static</span><span class="w"> </span><span class="n">Vtable_DoTheThing_String</span>: <span class="kp">&amp;</span><span class="nc">DoTheThingVtable</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">DoTheThingVtable</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">do_the_thing</span>: <span class="o">&lt;</span><span class="nb">String</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">DoTheThing</span><span class="o">&gt;</span>::<span class="n">do_the_thing</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="k">fn</span><span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//            Fully qualified reference to `do_the_thing` for strings
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>You may have heard that a <code>&amp;dyn DoTheThing</code> type in Rust is a <em>wide pointer</em>. What that means is that, at runtime, it is actually a pair of <em>two</em> pointers: a data pointer and a vtable pointer for the <code>DoTheThing</code> trait. So <code>&amp;dyn DoTheThing</code> is roughly equivalent to:</p>
<pre tabindex="0"><code>(*mut (), &amp;&#39;static DoTheThingVtable)
</code></pre><p>When you cast a <code>&amp;String</code> to a <code>&amp;dyn DoTheThing</code>, what actually happens at runtime is that the compiler takes the <code>&amp;String</code> pointer, casts it to <code>*mut ()</code>, and pairs it with the appropriate vtable. So, if you have some code like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nb">String</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="s">&#34;Hello, Rustaceans&#34;</span><span class="p">.</span><span class="n">to_string</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">y</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">DoTheThing</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>It winds up “desugared” to something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nb">String</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="s">&#34;Hello, Rustaceans&#34;</span><span class="p">.</span><span class="n">to_string</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">y</span>: <span class="p">(</span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">(),</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">&#39;static</span><span class="w"> </span><span class="n">DoTheThingVtable</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="p">(),</span><span class="w"> </span><span class="n">Vtable_DoTheThing_String</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><h3 id="the-dyn-impl">The dyn impl</h3>
<p>We’ve seen how you create wide pointers and how the compiler represents vtables. We’ve also seen that, in Rust, <code>dyn DoTheThing</code> implements <code>DoTheThing</code>. You might wonder how that works. Conceptually, the compiler generates an impl where each method in the trait is implemented by extracting the function pointer from the vtable and calling it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">DoTheThing</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">DoTheThing</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">do_the_thing</span><span class="p">(</span><span class="bp">self</span>: <span class="kp">&amp;</span><span class="nc">dyn</span><span class="w"> </span><span class="n">DoTheThing</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Remember that `&amp;dyn DoTheThing` is equivalent to
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// a tuple like `(*mut (), &amp;&#39;static DoTheThingVtable)`:
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">data_pointer</span><span class="p">,</span><span class="w"> </span><span class="n">vtable_pointer</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">function_pointer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">vtable_pointer</span><span class="p">.</span><span class="n">do_the_thing</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">function_pointer</span><span class="p">(</span><span class="n">data_pointer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In effect, when we call a generic function like <code>some_generic_fn</code> with <code>T = dyn DoTheThing</code>, we monomorphize that call exactly like any other type. The call to <code>do_the_thing</code> is dispatched against the impl above, and it is <em>that special impl</em> that actually does the dynamic dispatch. Neat.</p>
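<p>The pieces above can be put together into something that actually runs. What follows is a hand-rolled sketch, not what the compiler really emits: the <code>DynDoTheThing</code> struct, the <code>string_do_the_thing</code> shim, and the <code>usize</code> return value are all assumptions made for illustration, and it uses <code>*const ()</code> where the text above uses <code>*mut ()</code>:</p>

```rust
use std::marker::PhantomData;

trait DoTheThing {
    fn do_the_thing(&self) -> usize;
}

impl DoTheThing for String {
    fn do_the_thing(&self) -> usize {
        self.len()
    }
}

// Hand-written stand-in for the vtable: one function pointer per method,
// with the `self` type erased to "some pointer".
struct DoTheThingVtable {
    do_the_thing: unsafe fn(*const ()) -> usize,
}

// Type-erased shim: recover the `&String` from the data pointer and
// forward to the real method.
unsafe fn string_do_the_thing(data: *const ()) -> usize {
    // SAFETY (caller's obligation): `data` must point to a live `String`.
    let this: &String = unsafe { &*(data as *const String) };
    <String as DoTheThing>::do_the_thing(this)
}

// The vtable instance tailored to `String`.
static VTABLE_DO_THE_THING_STRING: DoTheThingVtable = DoTheThingVtable {
    do_the_thing: string_do_the_thing,
};

// Hand-written "wide pointer": a data pointer plus a vtable pointer.
// The marker ties it to the lifetime of the borrowed value.
#[derive(Clone, Copy)]
struct DynDoTheThing<'a> {
    data: *const (),
    vtable: &'static DoTheThingVtable,
    _marker: PhantomData<&'a ()>,
}

impl<'a> DynDoTheThing<'a> {
    // The equivalent of casting `&String` to `&dyn DoTheThing`.
    fn from_string(s: &'a String) -> Self {
        DynDoTheThing {
            data: s as *const String as *const (),
            vtable: &VTABLE_DO_THE_THING_STRING,
            _marker: PhantomData,
        }
    }

    // The equivalent of the synthesized dyn impl: fetch the function
    // pointer from the vtable and call it with the data pointer.
    fn do_the_thing(self) -> usize {
        let function_pointer = self.vtable.do_the_thing;
        // SAFETY: `data` was created from a `&'a String` in `from_string`,
        // and this vtable is only ever paired with `String` data.
        unsafe { function_pointer(self.data) }
    }
}

fn main() {
    let x = String::from("Hello, Rustaceans");
    let y = DynDoTheThing::from_string(&x);
    println!("{}", y.do_the_thing());
}
```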
<h3 id="static-dispatch-permits-monomorphization">Static dispatch permits monomorphization</h3>
<p>Now that we’ve seen how and when vtables are constructed, we can talk about the rules for dyn safety and where they come from. One of the most basic rules is that a trait is only dyn-safe if it contains no generic methods (or, more precisely, if its methods are only generic over lifetimes, not types). The reason for this rule derives directly from how a vtable works: when you construct a vtable, you need to give a single function pointer for each method in the trait (or, perhaps, a finite set of function pointers). The problem with generic methods is that there is no single function pointer for them: you need a different pointer for each type that they’re applied to. Consider this example trait, <code>PrintPrefixed</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">PrintPrefixed</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">prefix</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">String</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">apply</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Display</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">t</span>: <span class="nc">T</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">PrintPrefixed</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">String</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">prefix</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">String</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">apply</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Display</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;{}: {}&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What would a vtable for <code>String as PrintPrefixed</code> look like? Generating a function pointer for <code>prefix</code> is no problem: we can just use <code>&lt;String as PrintPrefixed&gt;::prefix</code>. But what about <code>apply</code>? We would have to include a function pointer for <code>&lt;String as PrintPrefixed&gt;::apply&lt;T&gt;</code>, but we don’t know yet what the <code>T</code> is!</p>
<p>In contrast, with static dispatch, we don’t have to know what <code>T</code> is until the point of call. In that case, we can generate just the copy we need.</p>
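<p>Here is a small sketch of the static-dispatch side. It is a variant of <code>PrintPrefixed</code> in which <code>apply</code> returns the formatted <code>String</code> instead of printing it (an assumption made only so the result can be inspected). Each call site fixes <code>T</code>, and the compiler generates the matching copy of <code>apply</code>:</p>

```rust
use std::fmt::Display;

trait PrintPrefixed {
    fn prefix(&self) -> String;
    // Generic over a type, so no single function pointer can stand for it.
    fn apply<T: Display>(&self, t: T) -> String;
}

impl PrintPrefixed for String {
    fn prefix(&self) -> String {
        self.clone()
    }

    fn apply<T: Display>(&self, t: T) -> String {
        format!("{}: {}", self, t)
    }
}

fn main() {
    let p = String::from("value");

    // Three call sites, three choices of `T`, each known at compile time.
    println!("{}", p.apply(22_u32)); // T = u32
    println!("{}", p.apply("hello")); // T = &str
    println!("{}", p.apply(1.5_f64)); // T = f64

    // let d: &dyn PrintPrefixed = &p; // does not compile: `apply` is generic
}
```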
<h3 id="partial-dyn-impls">Partial dyn impls</h3>
<p>The previous point shows that a trait can have <em>some</em> methods that are dyn-safe and some methods that are not. In current Rust, this makes the entire trait be “not dyn safe”, and this is because there is no way for us to write a complete <code>impl PrintPrefixed for dyn PrintPrefixed</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">PrintPrefixed</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">PrintPrefixed</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">prefix</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">String</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// For `prefix`, no problem:
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">prefix_fn</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="cm">/* get prefix function pointer from vtable */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">prefix_fn</span><span class="p">(</span><span class="err">…</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">apply</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Display</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// For `apply`, we can’t handle all `T` types, what field to fetch?
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">panic!</span><span class="p">(</span><span class="s">&#34;No way to implement apply&#34;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under the alternative design that was considered long ago, we could say that a <code>dyn PrintPrefixed</code> value is always legal, but <code>dyn PrintPrefixed</code> only implements the <code>PrintPrefixed</code> trait if all of its methods (and other items) are dyn safe. Either way, if you had a <code>&amp;dyn PrintPrefixed</code>, you could call <code>prefix</code>. You just wouldn’t be able to use a <code>dyn PrintPrefixed</code> with generic code like <code>fn foo&lt;T: ?Sized + PrintPrefixed&gt;</code>.</p>
<p>(We’ll return to this theme in future blog posts.)</p>
<p>If you’re familiar with the “special case” around trait methods that require <code>where Self: Sized</code>, you might be able to see where it comes from now. If a method has a <code>where Self: Sized</code> requirement, and we have an impl for a type like <code>dyn PrintPrefixed</code>, then we can see that this method could never be called on that type, and so we can omit it from the impl (and vtable) altogether. This is awfully similar to saying that <code>dyn PrintPrefixed</code> is always legal, because it means that there is only a subset of methods that can be used via virtual dispatch. The difference is that <code>dyn PrintPrefixed: PrintPrefixed</code> still holds, because we know that generic code won’t be able to call those “non-dyn-safe” methods, since generic code would have to require that <code>T: ?Sized</code>.</p>
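<p>A short sketch of that special case follows. It again uses a variant of <code>PrintPrefixed</code> whose <code>apply</code> returns a <code>String</code>, and the helper <code>prefix_via_dyn</code> is a hypothetical name. Adding <code>where Self: Sized</code> to the generic method is enough to make the trait usable as a <code>dyn</code> type:</p>

```rust
use std::fmt::Display;

trait PrintPrefixed {
    fn prefix(&self) -> String;

    // `dyn PrintPrefixed` is not `Sized`, so this method can never be
    // called on it and needs no vtable entry.
    fn apply<T: Display>(&self, t: T) -> String
    where
        Self: Sized;
}

impl PrintPrefixed for String {
    fn prefix(&self) -> String {
        self.clone()
    }

    fn apply<T: Display>(&self, t: T) -> String {
        format!("{}: {}", self, t)
    }
}

// Virtual dispatch works for the dyn-safe subset of methods.
fn prefix_via_dyn(x: &dyn PrintPrefixed) -> String {
    x.prefix()
    // x.apply(22) would not compile here: it requires `Self: Sized`.
}

fn main() {
    let p = String::from("value");
    println!("{}", prefix_via_dyn(&p));
    // On the concrete (sized) type, `apply` is still available.
    println!("{}", p.apply(22));
}
```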
<h3 id="associated-types-and-dyn-types">Associated types and dyn types</h3>
<p>We began this saga by talking about associated types and <code>dyn</code> types. In Rust today, a dyn type is required to specify a value for each associated type in the trait. For example, consider a simplified <code>Iterator</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This trait is dyn safe, but if you actually have a <code>dyn</code> in practice, you would have to write something like <code>dyn Iterator&lt;Item = u32&gt;</code>. The <code>impl Iterator for dyn Iterator</code> looks like:</p>
<pre tabindex="0"><code>impl&lt;T&gt; Iterator for dyn Iterator&lt;Item = T&gt; {
    type Item = T;
    
    fn next(&amp;mut self) -&gt; Option&lt;T&gt; {
        let next_fn = /* get next function from vtable */;
        return next_fn(self);
    }
}
</code></pre><p>Now you can see why we require all the associated types to be part of the <code>dyn</code> type — it lets us write a complete impl (i.e., one that includes a value for each of the associated types).</p>
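<p>For comparison, this is how it looks from the user’s side with the real <code>Iterator</code> trait from the standard library. The function <code>sum_all</code> is a made-up example. The point is that the <code>dyn</code> type names its <code>Item</code>, so the return type of <code>next</code> is fully known even though the concrete iterator is not:</p>

```rust
// `Item = u32` is part of the type, so `next` is known to return
// `Option<u32>` for every iterator that can be passed in here.
fn sum_all(iter: &mut dyn Iterator<Item = u32>) -> u32 {
    let mut total = 0;
    // Each `next` call goes through the vtable.
    while let Some(n) = iter.next() {
        total += n;
    }
    total
}

fn main() {
    // Two different concrete iterator types behind the same dyn type.
    let mut range = 1..4_u32;
    let mut evens = vec![2_u32, 4, 6].into_iter();
    println!("{}", sum_all(&mut range));
    println!("{}", sum_all(&mut evens));

    // The same applies to owned trait objects.
    let boxed: Box<dyn Iterator<Item = u32>> = Box::new(0..3_u32);
    let collected: Vec<u32> = boxed.collect();
    println!("{:?}", collected);

    // let bad: Box<dyn Iterator> = Box::new(0..3_u32);
    // does not compile: the associated type `Item` must be specified
}
```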
<h3 id="conclusion">Conclusion</h3>
<p>We covered a lot of background in this post:</p>
<ul>
<li>Static vs dynamic dispatch, vtables</li>
<li>The origin of dyn safety, and the possibility of “partial dyn safety”</li>
<li>The idea of a synthesized <code>impl Trait for dyn Trait</code></li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Modulo dynamic linking.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Dyn async traits, part 1</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/09/30/dyn-async-traits-part-1/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/09/30/dyn-async-traits-part-1/</id><published>2021-09-30T00:00:00+00:00</published><updated>2021-09-30T10:50:00-04:00</updated><content type="html"><![CDATA[<p>Over the last few weeks, <a href="https://github.com/tmandry/">Tyler Mandry</a> and I have been digging hard into what it will take to implement async fn in traits. Per the <a href="https://lang-team.rust-lang.org/initiatives.html">new lang team initiative process</a>, we are collecting our design thoughts in an ever-evolving website, the <a href="https://rust-lang.github.io/async-fundamentals-initiative/">async fundamentals initiative</a>. If you&rsquo;re interested in the area, you should definitely poke around; you may be interested to read about the <a href="https://rust-lang.github.io/async-fundamentals-initiative/roadmap/mvp.html">MVP</a> that we hope to stabilize first, or the (very much WIP) <a href="https://rust-lang.github.io/async-fundamentals-initiative/evaluation.html">evaluation doc</a> which covers some of the challenges we are still working out. I am going to be writing a series of blog posts focusing on one particular thing that we have been talking through: the <a href="https://rust-lang.github.io/async-fundamentals-initiative/evaluation/challenges/dyn_traits.html">problem of <code>dyn</code> and <code>async fn</code></a>. This first post introduces the problem and the general goal that we are shooting for (but don&rsquo;t yet know the best way to reach).</p>
<h3 id="what-were-shooting-for">What we&rsquo;re shooting for</h3>
<p>What we want is simple. Imagine this trait, for &ldquo;async iterators&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We would like you to be able to write a trait like that, and to implement it in the obvious way:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">SleepyRange</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">start</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">stop</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SleepyRange</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
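</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">tokio</span>::<span class="n">time</span>::<span class="n">sleep</span><span class="p">(</span><span class="n">std</span>::<span class="n">time</span>::<span class="n">Duration</span>::<span class="n">from_millis</span><span class="p">(</span><span class="mi">1000</span><span class="p">)).</span><span class="k">await</span><span class="p">;</span><span class="w"> </span><span class="c1">// just to await something :)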
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">start</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">stop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">start</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You should then be able to have a <code>Box&lt;dyn AsyncIter&lt;Item = u32&gt;&gt;</code> and use that in exactly the way you would use a <code>Box&lt;dyn Iterator&lt;Item = u32&gt;&gt;</code> (but with an <code>await</code> after each call to <code>next</code>, of course):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">b</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">b</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><h3 id="desugaring-to-an-associated-type">Desugaring to an associated type</h3>
<p>Consider this running example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, the <code>next</code> method will desugar to a fn that returns <em>some</em> kind of future; you can think of it like a generic associated type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;me</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Next</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The corresponding desugaring for the impl would use <a href="https://rust-lang.github.io/impl-trait-initiative/">type alias impl trait</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">SleepyRange</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">start</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">stop</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Type alias impl trait:
</span></span></span><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">SleepyRangeNext</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;me</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">AsyncIter</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SleepyRange</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">SleepyRangeNext</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">SleepyRangeNext</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">tokio</span>::<span class="n">sleep</span><span class="p">(</span><span class="mi">1000</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">start</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// as above
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This desugaring works quite well for standard generics (or <code>impl Trait</code>). Consider this function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">process</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span>
</span></span><span class="line"><span class="cl"><span class="nc">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">T</span>: <span class="nc">AsyncIter</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">sum</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="k">await</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">sum</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">sum</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">22</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">break</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">sum</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This code will work quite nicely. For example, when you call <code>t.next()</code>, the resulting future will be of type <code>T::Next</code>. After monomorphization, the compiler will be able to resolve <code>&lt;SleepyRange as AsyncIter&gt;::Next</code> to the <code>SleepyRangeNext</code> type, so that the future is known exactly. In fact, crates like <a href="https://github.com/akiles/embassy">embassy</a> already use this desugaring, albeit manually and only on nightly.</p>
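<p>To make this concrete, here is a minimal sketch of the same desugaring that should compile on stable Rust. It is an illustration under stated assumptions rather than what the compiler generates: the hypothetical <code>Counter</code> type stands in for <code>SleepyRange</code> and does not sleep, its future is the nameable <code>std::future::Ready</code> instead of a type alias impl trait, the associated type carries the <code>where Self: 'me</code> clause that stable generic associated types ask for, and <code>block_on</code> is a toy busy-polling executor written only for this example.</p>

```rust
use std::future::{ready, Future, Ready};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-desugared form of `async fn next`, using a generic associated type.
trait AsyncIter {
    type Item;
    type Next<'me>: Future<Output = Option<Self::Item>> + 'me
    where
        Self: 'me;
    fn next(&mut self) -> Self::Next<'_>;
}

/// Hypothetical stand-in for `SleepyRange`; it never actually sleeps.
struct Counter {
    start: u32,
    stop: u32,
}

impl AsyncIter for Counter {
    type Item = u32;
    // A nameable future type, used here in place of type alias impl trait.
    type Next<'me> = Ready<Option<u32>>
    where
        Self: 'me;

    fn next(&mut self) -> Self::Next<'_> {
        if self.start < self.stop {
            let s = self.start;
            self.start = s + 1;
            ready(Some(s))
        } else {
            ready(None)
        }
    }
}

/// Statically dispatched: `T::Next` is known exactly after monomorphization.
async fn process<T>(t: &mut T) -> u32
where
    T: AsyncIter<Item = u32>,
{
    let mut sum = 0;
    while let Some(x) = t.next().await {
        sum += x;
        if sum > 22 {
            break;
        }
    }
    sum
}

/// Waker that does nothing; the executor below simply polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (an assumption of this sketch, not a real runtime).
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Runs `process` over 0..100 and returns the sum at the point it stops.
fn sum_counter() -> u32 {
    let mut counter = Counter { start: 0, stop: 100 };
    block_on(process(&mut counter))
}
```

<p>Because <code>process</code> is generic, each call to <code>t.next()</code> resolves at compile time to the future type chosen by <code>Counter</code>, with no boxing and no virtual dispatch. Calling <code>sum_counter()</code> from <code>main</code> is enough to try it out.</p>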
<h3 id="associated-types-dont-work-for-dyn">Associated types don&rsquo;t work for dyn</h3>
<p>Unfortunately, this desugaring causes problems when you try to use <code>dyn</code> values. Today, when you have <code>dyn AsyncIter</code>, you must specify the values for <em>all</em> associated types defined in <code>AsyncIter</code>. So that means that instead of <code>dyn AsyncIter&lt;Item = u32&gt;</code>, you would have to write something like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">AsyncIter</span><span class="o">&lt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="p">,</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Next</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">SleepyRangeNext</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>This is clearly a non-starter from an ergonomic perspective, but it has an even more pernicious problem. The whole point of a <code>dyn</code> trait is to have a value where we don&rsquo;t know what the underlying type is. But specifying the value of <code>Next&lt;'me&gt;</code> as <code>SleepyRangeNext</code> means that there is <em>exactly one impl</em> that could be in use here. This <code>dyn</code> value <em>must</em> be a <code>SleepyRange</code>, since no other impl has that same future.</p>
<p><strong>Conclusion:</strong> For <code>dyn AsyncIter</code> to work, the future returned by <code>next()</code> must be <em>independent of the actual impl</em>. Furthermore, it must have a fixed size. In other words, it needs to be something like <code>Box&lt;dyn Future&lt;Output = Option&lt;u32&gt;&gt;&gt;</code>.</p>
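<p>One way to see what this conclusion asks for is to write the boxed version by hand. The sketch below is an illustration under assumptions, not the expansion of any particular crate: <code>BoxFuture</code>, <code>Range</code>, <code>Repeat</code>, <code>sum_all</code> and <code>block_on</code> are hypothetical names, and the box is wrapped in <code>Pin</code> so that the future can be awaited. The point is that two unrelated impls return the <em>same</em> future type, which is what lets both sit behind <code>dyn AsyncIter</code>.</p>

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// One future type shared by every impl, so `dyn AsyncIter` can name it.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + 'a>>;

trait AsyncIter {
    type Item;
    fn next(&mut self) -> BoxFuture<'_, Option<Self::Item>>;
}

/// Hypothetical iterator yielding `start..stop`.
struct Range {
    start: u32,
    stop: u32,
}

/// Hypothetical iterator yielding `value` a fixed number of times.
struct Repeat {
    value: u32,
    remaining: u32,
}

impl AsyncIter for Range {
    type Item = u32;
    fn next(&mut self) -> BoxFuture<'_, Option<u32>> {
        Box::pin(async move {
            if self.start < self.stop {
                let s = self.start;
                self.start = s + 1;
                Some(s)
            } else {
                None
            }
        })
    }
}

impl AsyncIter for Repeat {
    type Item = u32;
    fn next(&mut self) -> BoxFuture<'_, Option<u32>> {
        Box::pin(async move {
            if self.remaining > 0 {
                self.remaining -= 1;
                Some(self.value)
            } else {
                None
            }
        })
    }
}

/// Dynamically dispatched: the caller never learns the concrete impl.
async fn sum_all(iters: &mut [Box<dyn AsyncIter<Item = u32>>]) -> u32 {
    let mut sum = 0;
    for it in iters.iter_mut() {
        while let Some(x) = it.next().await {
            sum += x;
        }
    }
    sum
}

/// Waker that does nothing; the executor below simply polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (an assumption of this sketch, not a real runtime).
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Sums a `Range` and a `Repeat` through the same `dyn` interface.
fn sum_mixed() -> u32 {
    let mut iters: Vec<Box<dyn AsyncIter<Item = u32>>> = vec![
        Box::new(Range { start: 0, stop: 4 }),
        Box::new(Repeat { value: 10, remaining: 2 }),
    ];
    block_on(sum_all(&mut iters))
}
```

<p>The price is visible in the impls: every call to <code>next</code> allocates, even when the caller knows the concrete type.</p>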
<h3 id="how-the-async-trait-crate-solves-this-problem">How the <code>async-trait</code> crate solves this problem</h3>
<p>You may have used the <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> crate. It resolves this problem by not using an associated type, but instead desugaring to <code>Box&lt;dyn Future&gt;</code> types:</p>
<pre tabindex="0"><code class="language-rust=" data-lang="rust=">trait AsyncIter {
    type Item;

    fn next(&amp;mut self) -&gt; Box&lt;dyn Future&lt;Output = Option&lt;Self::Item&gt;&gt; + Send + &#39;_&gt;;
}
</code></pre><p>This has a few disadvantages:</p>
<ul>
<li>It forces a <code>Box</code> all the time, even when you are using <code>AsyncIter</code> with static dispatch.</li>
<li>The type as given above says that the resulting future <em>must</em> be <code>Send</code>. For other async fns, the compiler uses auto traits to determine automatically whether the resulting future is <code>Send</code> (in other words, it is <code>Send</code> if it can be; we don&rsquo;t declare up front whether it <em>must</em> be).</li>
</ul>
<h3 id="conclusion-ideally-we-want-box-when-using-dyn-but-not-otherwise">Conclusion: Ideally we want <code>Box</code> when using <code>dyn</code>, but not otherwise</h3>
<p>So far we&rsquo;ve seen:</p>
<ul>
<li>If we desugar async fn to an associated type, it works well for generic cases, because we can resolve the future to precisely the right type.</li>
<li>But it doesn&rsquo;t work well for <code>dyn</code> traits, because the rules of Rust require that we specify the value of the associated type exactly. For <code>dyn</code> traits, we really want the returned future to be something like <code>Box&lt;dyn Future&gt;</code>.
<ul>
<li>Using <code>Box</code> does mean a slight performance penalty relative to static dispatch, because we must allocate the future dynamically.</li>
</ul>
</li>
</ul>
<p>What we would <em>ideally</em> want is to only pay the price of <code>Box</code> when using <code>dyn</code>:</p>
<ul>
<li>When you use <code>AsyncIter</code> in generic types, you get the desugaring shown above, with no boxing and static dispatch.</li>
<li>But when you create a <code>dyn AsyncIter</code>, the future type becomes <code>Box&lt;dyn Future&lt;Output = Option&lt;u32&gt;&gt;&gt;</code>.
<ul>
<li>(And perhaps you can choose another &ldquo;smart pointer&rdquo; type besides <code>Box</code>, but I&rsquo;ll ignore that for now and come back to it later.)</li>
</ul>
</li>
</ul>
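<p>As a rough sketch of that goal, one arrangement that can be written by hand today keeps two traits and bridges them with a blanket impl. Everything here is an assumption made for illustration and not a proposal for how the language should do it: <code>DynAsyncIter</code>, <code>dyn_next</code>, <code>Counter</code>, <code>sum_static</code>, <code>sum_dyn</code> and <code>block_on</code> are invented names, and <code>Counter</code> uses <code>std::future::Ready</code> so that the example compiles on stable Rust.</p>

```rust
use std::future::{ready, Future, Ready};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Static-dispatch trait: the future is an associated type, never boxed.
trait AsyncIter {
    type Item;
    type Next<'me>: Future<Output = Option<Self::Item>> + 'me
    where
        Self: 'me;
    fn next(&mut self) -> Self::Next<'_>;
}

/// Dyn-compatible companion trait: the future is always a boxed `dyn Future`.
trait DynAsyncIter {
    type Item;
    fn dyn_next(&mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + '_>>;
}

/// Bridge: any `AsyncIter` can be used as a `DynAsyncIter` by boxing on demand.
impl<T: AsyncIter> DynAsyncIter for T {
    type Item = <T as AsyncIter>::Item;
    fn dyn_next(
        &mut self,
    ) -> Pin<Box<dyn Future<Output = Option<<T as AsyncIter>::Item>> + '_>> {
        Box::pin(self.next())
    }
}

/// Hypothetical iterator yielding `start..stop` without sleeping.
struct Counter {
    start: u32,
    stop: u32,
}

impl AsyncIter for Counter {
    type Item = u32;
    type Next<'me> = Ready<Option<u32>>
    where
        Self: 'me;

    fn next(&mut self) -> Self::Next<'_> {
        if self.start < self.stop {
            let s = self.start;
            self.start = s + 1;
            ready(Some(s))
        } else {
            ready(None)
        }
    }
}

/// Generic caller: no `Box`, static dispatch.
async fn sum_static<T: AsyncIter<Item = u32>>(t: &mut T) -> u32 {
    let mut sum = 0;
    while let Some(x) = t.next().await {
        sum += x;
    }
    sum
}

/// `dyn` caller: pays for one `Box` per call to `dyn_next`.
async fn sum_dyn(mut t: Box<dyn DynAsyncIter<Item = u32>>) -> u32 {
    let mut sum = 0;
    while let Some(x) = t.dyn_next().await {
        sum += x;
    }
    sum
}

/// Waker that does nothing; the executor below simply polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor (an assumption of this sketch, not a real runtime).
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Sums the same range once through each path.
fn static_and_dyn_sums() -> (u32, u32) {
    let mut a = Counter { start: 0, stop: 5 };
    let b: Box<dyn DynAsyncIter<Item = u32>> = Box::new(Counter { start: 0, stop: 5 });
    (block_on(sum_static(&mut a)), block_on(sum_dyn(b)))
}
```

<p>The generic path and the <code>dyn</code> path produce the same values; only the second one allocates. The obvious cost of this hand-written version is that callers see two traits and two method names, which is exactly the kind of friction a language-level solution would want to hide.</p>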
<p>In upcoming posts, I will dig into some of the ways that we might achieve this.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/series/dyn-async-traits" term="dyn-async-traits" label="Dyn async traits"/></entry><entry><title type="html">Rustacean Principles, continued</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/09/16/rustacean-principles-continued/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/09/16/rustacean-principles-continued/</id><published>2021-09-16T00:00:00+00:00</published><updated>2021-09-16T09:42:00-04:00</updated><content type="html"><![CDATA[<p>RustConf is always a good time for reflecting on the project. For me, the last week has been particularly &ldquo;reflective&rdquo;. Since announcing the <a href="https://rustacean-principles.netlify.app/">Rustacean Principles</a>, I&rsquo;ve been having a number of conversations with members of the community about how they can be improved. I wanted to write a post summarizing some of the feedback I&rsquo;ve gotten.</p>
<h3 id="the-principles-are-a-work-in-progress">The principles are a work-in-progress</h3>
<p>Sparking conversation about the principles was exactly what I was hoping for when I posted the previous blog post. The principles have mostly been the product of <a href="https://github.com/joshtriplett/">Josh</a> and me iterating, and hence reflect our experiences. While the two of us have been involved in quite a few parts of the project, for the document to truly serve its purpose, it needs input from the community as a whole.</p>
<p>Unfortunately, for many people, the way I presented the principles made it seem like I was trying to unveil a <em>fait accompli</em>, rather than seeking input on a work-in-progress. I hope this post makes the intention more clear!</p>
<h3 id="the-principles-as-a-continuation-of-rusts-traditions">The principles as a continuation of Rust&rsquo;s traditions</h3>
<p>Rust has a long tradition of articulating its values. This is why we have a <a href="https://www.rust-lang.org/policies/code-of-conduct">Code of Conduct</a>. This is why we wrote blog posts like <a href="https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html">Fearless Concurrency</a>, <a href="https://blog.rust-lang.org/2014/10/30/Stability.html">Stability as a Deliverable</a> and <a href="https://blog.rust-lang.org/2015/04/24/Rust-Once-Run-Everywhere.html">Rust Once, Run Everywhere</a>. Looking past the &ldquo;engineering side&rdquo; of Rust, aturon&rsquo;s classic blog posts on listening and trust (<a href="http://aturon.github.io/tech/2018/05/25/listening-part-1/">part 1</a>, <a href="http://aturon.github.io/tech/2018/06/02/listening-part-2/">part 2</a>, <a href="http://aturon.github.io/tech/2018/06/18/listening-part-3/">part 3</a>) did a great job of talking about what it is like to be on a Rust team. And who could forget the whole <a href="https://brson.github.io/fireflowers/">&ldquo;fireflowers&rdquo;</a> debate?<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p><strong>My goal with the Rustacean Principles is to help <em>coalesce</em> the existing wisdom found in those classic Rust blog posts into a more concise form.</strong> To that end, I took initial inspiration from how AWS uses <a href="https://aws.amazon.com/blogs/enterprise-strategy/tenets-provide-essential-guidance-on-your-cloud-journey/">tenets</a>, although by this point the principles have evolved into a somewhat different form. I like the way tenets use short, crisp statements that identify important concepts, and I like the way an explicit ordering helps establish which concept should win when two of them conflict. (That said, one of Rust&rsquo;s oldest values is <em>synthesis</em>: we try to find ways to resolve constraints that are in tension by having our cake and eating it too.)</p>
<p>Given all of this backdrop, I was pretty enthused by a suggestion that I heard from <a href="https://github.com/Eh2406">Jacob Finkelman</a>. He suggested adapting the principles to incorporate more of the &ldquo;classic Rust catchphrases&rdquo;, such as the &ldquo;no new rationale&rdquo; rule described in the <a href="http://aturon.github.io/tech/2018/05/25/listening-part-1/">first blog post from aturon&rsquo;s series</a>. A similar idea is to incorporate the lessons from RFCs, both successful and unsuccessful (this is what I was going for in the <a href="https://rustacean-principles.netlify.app/case_studies.html">case studies</a> section, but that clearly needs to be expanded).</p>
<h3 id="the-overall-goal-empowerment">The overall goal: Empowerment</h3>
<p>My original intention was to structure the principles as a cascading series of ideas:</p>
<ul>
<li><strong>Rust&rsquo;s top-level goal:</strong> <em>Empowerment</em>
<ul>
<li><strong>Principles:</strong> Dissecting empowerment into its constituent pieces &ndash; reliable, performant, etc &ndash; and analyzing the importance of those pieces relative to one another.
<ul>
<li><strong>Mechanisms:</strong> Specific rules that we use, like <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable/type_safety.html">type safety</a>, that engender the principles (reliability, performance, etc.). These mechanisms often work in favor of one principle, but can work against others.</li>
</ul>
</li>
</ul>
</li>
</ul>
<p><a href="https://twitter.com/wycats/">wycats</a> suggested that the site could do a better job of clarifying that <em>empowerment</em> is the top-level, overriding goal, and I agree. I&rsquo;m going to try and tweak the site to make it clearer.</p>
<h3 id="a-goal-not-a-minimum-bar">A goal, not a minimum bar</h3>
<p>The principles in <a href="https://rustacean-principles.netlify.app/how_to_rustacean.html">&ldquo;How to Rustacean&rdquo;</a> were meant to be aspirational: a target to be reaching for. We&rsquo;re all human: nobody does everything right all the time. But, as <a href="https://internals.rust-lang.org/t/blog-post-rustacean-principles/15300/2?u=nikomatsakis">Matklad describes</a>, the principles could be understood as setting up a kind of minimum bar &ndash; to be a team member, one has to <a href="https://rustacean-principles.netlify.app/how_to_rustacean/show_up.html">show up</a>, <a href="https://rustacean-principles.netlify.app/how_to_rustacean/follow_through.html">follow through</a>, <a href="https://rustacean-principles.netlify.app/how_to_rustacean/trust_and_delegate.html">trust and delegate</a>, all while <a href="https://rustacean-principles.netlify.app/how_to_rustacean/bring_joy.html">bringing joy</a>? This could be really stressful for people.</p>
<p>The goal for the <a href="https://rustacean-principles.netlify.app/how_to_rustacean.html">&ldquo;How to Rustacean&rdquo;</a> section is to be a way to lift people up by giving them clear guidance for how to succeed; it helps us to answer people when they ask &ldquo;what should I do to get onto the lang/compiler/whatever team&rdquo;. The internals thread had a number of good ideas for how to help it serve this intended purpose without stressing people out, such as <a href="https://internals.rust-lang.org/t/blog-post-rustacean-principles/15300/6?u=nikomatsakis">cuviper&rsquo;s suggestion to use fictional characters like Ferris in examples</a>, <a href="https://internals.rust-lang.org/t/blog-post-rustacean-principles/15300/9?u=nikomatsakis">passcod&rsquo;s suggestion of discussing inclusion</a>, or Matklad&rsquo;s proposal to <a href="https://internals.rust-lang.org/t/blog-post-rustacean-principles/15300/2?u=nikomatsakis">add something to the effect of &ldquo;You don&rsquo;t have to be perfect&rdquo;</a> to the list. Iteration needed!</p>
<h3 id="scope-of-the-principles">Scope of the principles</h3>
<p>Some people have wondered why the principles are framed in a rather general way, one that applies to all of Rust, instead of being specific to the lang team. It&rsquo;s a fair question! In fact, they didn&rsquo;t start this way. They started their life as a rather narrow set of <a href="https://github.com/rust-lang/wg-async-foundations/blob/a109db290e99bcc9c1705e477694c2301ec7f658/src/vision/tenets.md">&ldquo;design tenets for async&rdquo;</a> that appeared in the <a href="https://rust-lang.github.io/wg-async-foundations/vision.html">async vision doc</a>. But as those evolved, I found that they were starting to sound like design goals for Rust as a whole, not specifically for async.</p>
<p>Trying to describe Rust as a &ldquo;coherent whole&rdquo; makes a lot of sense to me. After all, the experience of using Rust is shaped by all of its facets: the language, the libraries, the tooling, the community, even its internal infrastructure (which contributes to that feeling of reliability by ensuring that the releases are available and high quality). Every part has its own role to play, but they are all working towards the same goal of empowering Rust&rsquo;s users.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>There is an interesting question about the long-term trajectory for this work. In my mind, the principles remain something of an experiment. Presuming that they prove to be useful, I think that they would make a nice RFC.</p>
<h3 id="what-about-easy">What about &ldquo;easy&rdquo;?</h3>
<p>One final bit of feedback I heard from <a href="https://github.com/carllerche/">Carl Lerche</a> is surprise that the principles don&rsquo;t include the word &ldquo;easy&rdquo;. This is not an accident. I felt that &ldquo;easy to use&rdquo; was too subjective to be actionable, and that the goals of <a href="https://rustacean-principles.netlify.app/how_rust_empowers/productive.html">productive</a> and <a href="https://rustacean-principles.netlify.app/how_rust_empowers/supportive.html">supportive</a> were more precise. However, I do think that for people to feel empowered, it&rsquo;s important for them not to feel mentally overloaded, and Rust can definitely have the problem of carrying a high mental load sometimes.</p>
<p>I&rsquo;m not sure of the best way to tweak the <a href="https://rustacean-principles.netlify.app/how_rust_empowers.html">&ldquo;Rust empowers by being&hellip;&rdquo;</a> section to reflect this, but the answer may lie with the <a href="https://en.wikipedia.org/wiki/Cognitive_dimensions_of_notations">Cognitive Dimensions of Notations</a>. I was introduced to these by <a href="https://twitter.com/Felienne">Felienne Hermans</a>&rsquo;s excellent book <a href="https://www.manning.com/books/the-programmers-brain">The Programmer&rsquo;s Brain</a>; I quite enjoyed <a href="https://www.sciencedirect.com/science/article/abs/pii/S1045926X96900099?via%3Dihub">this journal article</a> as well.</p>
<p>The idea of the <a href="https://en.wikipedia.org/wiki/Cognitive_dimensions_of_notations">CDN</a> is to try and elaborate on the <em>ways</em> that tools can be easier or harder to use for a particular task. For example, Rust would likely do well on the &ldquo;error prone&rdquo; dimension, in that when you make changes, the compiler generally helps ensure they are correct. But Rust does tend to have a high &ldquo;viscosity&rdquo;, because making local changes tends to be difficult: adding a lifetime, for example, can require updating data structures all over the code in an annoying cascade.</p>
<p>It&rsquo;s important though to keep in mind that the <a href="https://en.wikipedia.org/wiki/Cognitive_dimensions_of_notations">CDN</a> will vary from task to task. There are many kinds of changes one can make in Rust with very low viscosity, such as adding a new dependency. On the other hand, there are also cases where Rust can be error prone, such as <a href="https://rust-lang.github.io/wg-async-foundations/vision/submitted_stories/status_quo/alan_started_trusting_the_rust_compiler_but_then_async.html">mixing async runtimes</a>.</p>
<h3 id="conclusion">Conclusion</h3>
<p>In retrospect, I wish I had introduced the concept of the Rustacean Principles in a different way. But the subsequent conversations have been really great, and I&rsquo;m pretty excited by all the ideas on how to improve them. I want to encourage folks again to come over to the <a href="https://internals.rust-lang.org/t/blog-post-rustacean-principles/15300">internals thread</a> with their thoughts and suggestions.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Love that web page, <a href="https://github.com/brson">brson</a>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>One interesting question: I do think that some tools may vary the prioritization of different aspects of Rust. For example, a tool for formal verification is obviously aimed at users that <em>particularly</em> value reliability, but other tools may have different audiences. I&rsquo;m not yet sure of the best way to capture that; it may well be that each tool can have its own take on the way that it particularly empowers.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/aic" term="aic" label="AiC"/></entry><entry><title type="html">CTCFT 2021-09-20 Agenda</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/09/15/ctcft-2021-09-20-agenda/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/09/15/ctcft-2021-09-20-agenda/</id><published>2021-09-15T00:00:00+00:00</published><updated>2021-09-15T09:45:00-04:00</updated><content type="html"><![CDATA[<p><img src="https://raw.githubusercontent.com/rust-ctcft/ctcft/main/img/camprust.png" width="222" style="float:left;"/> The next <a href="https://rust-ctcft.github.io/ctcft/">&ldquo;Cross Team Collaboration Fun Times&rdquo; (CTCFT)</a> meeting will take place next Monday, on 2021-09-20 (<a href="https://everytimezone.com/s/6f28d1ba">in your time zone</a>)! This post covers the agenda. You&rsquo;ll find the full details (along with a calendar event, zoom details, etc) <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-09-20.html">on the CTCFT website</a>.</p>
<div style="clear:both;"></div>
<h3 id="agenda">Agenda</h3>
<ul>
<li>Announcements</li>
<li>Interest group panel discussion</li>
</ul>
<p>We&rsquo;re going to try something a bit different this time! The agenda is going to focus on Rust interest groups and domain working groups, those brave explorers who are trying to put Rust to use on all kinds of interesting domains. Rather than having fixed presentations, we&rsquo;re going to have a panel discussion with representatives from a number of Rust interest groups and domain groups, led by <a href="https://github.com/AngelOnFira">AngelOnFira</a>. The idea is to open a channel for communication about how to have more active communication and feedback between interest groups and the Rust teams (in both directions).</p>
<h3 id="afterwards-social-hour">Afterwards: Social hour</h3>
<p>After the CTCFT this week, we are going to try an experimental <strong>social hour</strong>. The hour will be coordinated in the #ctcft stream of the rust-lang Zulip. The idea is to create breakout rooms where people can gather to talk, hack together, or just chill.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/ctcft" term="ctcft" label="CTCFT"/></entry><entry><title type="html">Rustacean Principles</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/09/08/rustacean-principles/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/09/08/rustacean-principles/</id><published>2021-09-08T00:00:00+00:00</published><updated>2021-09-08T00:00:00+00:00</updated><content type="html"><![CDATA[<p>As the <a href="https://www.rust-lang.org/">web site</a> says, Rust is a <em>language empowering everyone to build reliable and efficient software</em>. I think it&rsquo;s precisely this feeling of <em>empowerment</em> that people love about Rust. As <a href="https://github.com/wycats/">wycats</a> put it recently to me, Rust makes it &ldquo;feel like things are possible that otherwise feel out of reach&rdquo;. But what exactly makes Rust feel that way? If we can describe it, then we can use that description to help us improve Rust, and to guide us as we design extensions to Rust.</p>
<p>Besides the language itself, Rust is also an open-source community, one that prides itself on its ability to do collaborative design. But what do we do that makes us able to work well together? If we can describe <em>that</em>, then we can use those descriptions to help ourselves improve, and to instruct new people on how to better work within the community.</p>
<p>This blog post describes a project I and others have been working on called <a href="https://rustacean-principles.netlify.app/">the Rustacean principles</a>. This project is an attempt to enumerate the (heretofore implicit) principles that govern both Rust&rsquo;s design and the way our community operates. <strong>The principles are still in draft form</strong>; for the time being, they live in the <a href="https://github.com/nikomatsakis/rustacean-principles">nikomatsakis/rustacean-principles</a> repository.</p>
<!-- more -->
<h3 id="how-the-principles-got-started">How the principles got started</h3>
<p>The Rustacean Principles were suggested by <a href="https://foundation.rust-lang.org/posts/2021-04-15-introducing-shane-miller/">Shane</a> during a discussion about how we can grow the Rust organization while keeping it true to itself. Shane pointed out that, at AWS, mechanisms like <a href="https://aws.amazon.com/blogs/enterprise-strategy/tenets-provide-essential-guidance-on-your-cloud-journey/">tenets</a> and the <a href="https://www.amazon.jobs/en/principles">leadership principles</a> are used to communicate and preserve shared values.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> The goal at AWS, as in the Rust org, is to have teams that operate independently but which still wind up &ldquo;itching in the same direction&rdquo;, as <a href="https://github.com/aturon/">aturon</a> <a href="https://youtu.be/J9OFQm8Qf1I?t=1312">so memorably put it</a>.</p>
<p>Since that initial conversation, the principles have undergone quite some iteration. The initial effort, which I <a href="https://youtu.be/ksSuXNmGZNA?t=2001">presented</a> at the <a href="https://rust-ctcft.github.io/ctcft/">CTCFT</a> on <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-06-21.html">2021-06-21</a>, was quite closely modeled on AWS tenets. After a number of in-depth conversations with both <a href="https://github.com/joshtriplett/">joshtriplett</a> and <a href="https://github.com/aturon/">aturon</a>, though, I wound up evolving the structure quite a bit to what you see today. I expect them to continue evolving, particularly the section on what it means to be a team member, which has received less attention.</p>
<h3 id="rust-empowers-by-being">Rust empowers by being&hellip;</h3>
<p>The <a href="https://rustacean-principles.netlify.app/">principles</a> are broken into two main sections. The first describes Rust&rsquo;s particular way of empowering people. This description comes in the form of a list of <em>properties</em> that we are shooting for:</p>
<ul>
<li><a href="https://rustacean-principles.netlify.app/how_rust_empowers.html">Rust empowers by being&hellip;</a>
<ul>
<li>⚙️ <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable.html">Reliable</a>: &ldquo;if it compiles, it works&rdquo;</li>
<li>🐎 <a href="https://rustacean-principles.netlify.app/how_rust_empowers/performant.html">Performant</a>: &ldquo;idiomatic code runs efficiently&rdquo;</li>
<li>🥰 <a href="https://rustacean-principles.netlify.app/how_rust_empowers/supportive.html">Supportive</a>: &ldquo;the language, tools, and community are here to help&rdquo;</li>
<li>🧩 <a href="https://rustacean-principles.netlify.app/how_rust_empowers/productive.html">Productive</a>: &ldquo;a little effort does a lot of work&rdquo;</li>
<li>🔧 <a href="https://rustacean-principles.netlify.app/how_rust_empowers/transparent.html">Transparent</a>: &ldquo;you can predict and control low-level details&rdquo;</li>
<li>🤸 <a href="https://rustacean-principles.netlify.app/how_rust_empowers/versatile.html">Versatile</a>: &ldquo;you can do anything with Rust&rdquo;</li>
</ul>
</li>
</ul>
<p>These properties are frequently in tension with one another. Our challenge as designers is to find ways to satisfy all of these properties at once. In some cases, though, we may be forced to decide between slightly penalizing one goal or another. In that case, we tend to give the edge to those goals that come earlier in the list over those that come later. Still, while the ordering matters, it&rsquo;s important to emphasize that for Rust to be successful we need to achieve <strong>all of these feelings at once</strong>.</p>
<p>Each of the properties has a page that describes it in more detail. The page also describes some specific <strong>mechanisms</strong> that we use to achieve this property. These mechanisms take the form of more concrete rules that we apply to Rust&rsquo;s design. For example, the page for <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable.html">reliability</a> discusses <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable/type_safety.html">type safety</a>, <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable/consider_all_cases.html">consider all cases</a>, and several other mechanisms. The discussion gives concrete examples of the tradeoffs at play and some of the techniques we have used to mitigate them.</p>
<p>One thing: these principles are meant to describe more than just the language. For instance, one example of Rust being <a href="https://rustacean-principles.netlify.app/how_rust_empowers/supportive.html">supportive</a> is the great <a href="https://rustacean-principles.netlify.app/how_rust_empowers/supportive/polished.html">error messages</a>, and Cargo&rsquo;s lock files and dependency system are geared towards making Rust feel <a href="https://rustacean-principles.netlify.app/how_rust_empowers/reliable.html">reliable</a>.</p>
<h3 id="how-to-rustacean">How to Rustacean</h3>
<p>Rust has been an open source project since its inception, and over time we have evolved and refined the way that we operate. One key concept for Rust is the <a href="https://www.rust-lang.org/governance">governance teams</a>, whose members are responsible for decisions regarding Rust&rsquo;s design and maintenance. We definitely have a notion of what it means &ldquo;to Rustacean&rdquo; &ndash; there are specific behaviors that we are looking for. But it has historically been really challenging to define them, and in turn to help people to achieve them (or to recognize when we ourselves are falling short!). The next section of this site, <a href="https://rustacean-principles.netlify.app/how_to_rustacean.html">How to Rustacean</a>, is a first attempt at drafting just such a list. You can think of it like a companion to the <a href="https://www.rust-lang.org/policies/code-of-conduct">Code of Conduct</a>: whereas the <a href="https://www.rust-lang.org/policies/code-of-conduct">CoC</a> describes the bare minimum expected of any Rust participant, the <a href="https://rustacean-principles.netlify.app/how_to_rustacean.html">How to Rustacean</a> section describes what it means to excel.</p>
<ul>
<li><a href="https://rustacean-principles.netlify.app/how_to_rustacean.html">How to Rustacean</a>
<ul>
<li>💖 <a href="https://rustacean-principles.netlify.app/how_to_rustacean/be_kind.html">Be kind and considerate</a></li>
<li>✨ <a href="https://rustacean-principles.netlify.app/how_to_rustacean/bring_joy.html">Bring joy to the user</a></li>
<li>👋 <a href="https://rustacean-principles.netlify.app/how_to_rustacean/show_up.html">Show up</a></li>
<li>🔭 <a href="https://rustacean-principles.netlify.app/how_to_rustacean/recognize_others.html">Recognize others&rsquo; knowledge</a></li>
<li>🔁 <a href="https://rustacean-principles.netlify.app/how_to_rustacean/start_somewhere.html">Start somewhere</a></li>
<li>✅ <a href="https://rustacean-principles.netlify.app/how_to_rustacean/follow_through.html">Follow through</a></li>
<li>🤝 <a href="https://rustacean-principles.netlify.app/how_to_rustacean/pay_it_forward.html">Pay it forward</a></li>
<li>🎁 <a href="https://rustacean-principles.netlify.app/how_to_rustacean/trust_and_delegate.html">Trust and delegate</a></li>
</ul>
</li>
</ul>
<p>This section of the site has undergone less iteration than the &ldquo;Rust empowerment&rdquo; section. The idea is that each of these principles has a dedicated page that elaborates on the principle and gives examples of it in action. The example of <a href="https://rustacean-principles.netlify.app/how_to_rustacean/show_up/raising_an_objection.html">Raising an objection about a design</a> (from <a href="https://rustacean-principles.netlify.app/how_to_rustacean/show_up.html">Show up</a>) is the most developed and a good one to look at to get the idea. One interesting bit is the &ldquo;goldilocks&rdquo; structure<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, which indicates what it means to &ldquo;show up&rdquo; too little but also what it means to &ldquo;show up&rdquo; <em>too much</em>.</p>
<h3 id="how-the-principles-can-be-used">How the principles can be used</h3>
<p>For the principles to be a success, they need to be more than words on a website. I would like to see them become something that we actively reference all the time as we go about our work in the Rust org.</p>
<p>As an example, we were recently wrestling with a minor point about the semantics of closures in Rust 2021. The details aren&rsquo;t that important (<a href="https://github.com/rust-lang/project-rfc-2229/blob/master/design-doc-closure-capture-drop-copy-structs.md">you can read them here, if you like</a>), but the decision ultimately came down to a question of whether to adapt the rules so that they are smarter, but more complex. I think it would have been quite useful to refer to these principles in that discussion: ultimately, I think we chose to (slightly) favor <a href="https://rustacean-principles.netlify.app/how_rust_empowers/productive.html">productivity</a> at the expense of <a href="https://rustacean-principles.netlify.app/how_rust_empowers/transparent.html">transparency</a>, which aligns well with the ordering on the site. Further, as I noted in <a href="https://github.com/rust-lang/project-rfc-2229/blob/master/design-doc-closure-capture-drop-copy-structs.md#nikos-conclusion">my conclusion</a>, I would personally like to see some form of <a href="https://zulip-archive.rust-lang.org/stream/213817-t-lang/topic/capture.20clauses.html">explicit capture clause</a> for closures, which would give users a way to ensure total <a href="https://rustacean-principles.netlify.app/how_rust_empowers/transparent.html">transparency</a> in those cases where it is most important.</p>
<p>The <a href="https://rustacean-principles.netlify.app/how_to_rustacean.html">How to Rustacean</a> section can be used in a number of ways. One would be cheering on examples of someone doing a great job: <a href="https://github.com/m-ou-se/">Mara</a>&rsquo;s <a href="https://github.com/rust-lang/rust/issues/88623">issue celebrating all the contributions to the 2021 Edition</a> is a great instance of <a href="https://rustacean-principles.netlify.app/how_to_rustacean/pay_it_forward.html">paying it forward</a>, for example, and I would love it if we had a precise vocabulary for calling that out.</p>
<p>Another time these principles can be used is when looking for new candidates for team membership. When considering a candidate, we can look to see whether we can give concrete examples of times they have exhibited these qualities. We can also use the principles to give feedback to people about where they need to improve. I&rsquo;d like to be able to tell people who are interested in joining a Rust team, &ldquo;Well, I&rsquo;ve noticed you do a great job of <a href="https://rustacean-principles.netlify.app/how_to_rustacean/show_up.html">showing up</a>, but your designs tend to get mired in complexity. I think you should work on <a href="https://rustacean-principles.netlify.app/how_to_rustacean/start_somewhere.html">starting somewhere</a>.&rdquo;</p>
<p>&ldquo;Hard conversations&rdquo; where you tell someone what they can do better are something that managers do (or try to do&hellip;) in companies, but which often get sidestepped or avoided in an open source context. I don&rsquo;t claim to be an expert, but I&rsquo;ve found that having structure can help to take away the &ldquo;sting&rdquo; and make it easier for people to hear and learn from the feedback.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<h3 id="what-comes-next">What comes next</h3>
<p>I think at this point the principles have evolved enough that it makes sense to get more widespread feedback. I&rsquo;m interested in hearing from people who are active in the Rust community about whether they reflect what you love about Rust (and, if not, what might be changed). I also plan to try and use them to guide both design discussions and questions of team membership, and I encourage others in the Rust teams to do the same. If we find that they are useful, then I&rsquo;d like to see them turned into an RFC and ultimately living on forge or somewhere more central.</p>
<h3 id="questions">Questions?</h3>
<p>I&rsquo;ve opened an <a href="https://internals.rust-lang.org/t/blog-post-rustacean-principles/15300">internals thread</a> for discussion.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>One of the first things that our team did at Amazon was to draft <a href="https://aws.amazon.com/blogs/opensource/how-our-aws-rust-team-will-contribute-to-rusts-future-successes/">its own tenets</a>; the discussion helped us to clarify what we were setting out to do and how we planned to do it.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Hat tip to <a href="https://twitter.com/marcjbrooker">Marc Brooker</a>, who suggested the &ldquo;Goldilocks&rdquo; structure, based on how the <a href="https://www.amazon.jobs/en/principles">Leadership Principles</a> are presented in the AWS wiki.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Speaking of which, one glance at my queue of assigned PRs makes it clear that I need to work on my <a href="https://rustacean-principles.netlify.app/how_to_rustacean/follow_through.html">follow through</a>.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/aic" term="aic" label="AiC"/></entry><entry><title type="html">Next CTCFT Meeting: 2021-09-20</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/08/30/next-ctcft-meeting-2021-09-20/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/08/30/next-ctcft-meeting-2021-09-20/</id><published>2021-08-30T00:00:00+00:00</published><updated>2021-08-30T15:30:00-04:00</updated><content type="html"><![CDATA[<p>Hold the date! The next <a href="https://rust-ctcft.github.io/ctcft/">Cross Team Collaboration Fun Times</a> meeting will be <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-09-20.html">2021-09-20</a>. We&rsquo;ll be using the &ldquo;Asia-friendly&rdquo; time slot of <a href="https://everytimezone.com/s/6f28d1ba">21:00 EST</a>.</p>
<h3 id="what-will-the-talks-be-about">What will the talks be about?</h3>
<p>A detailed agenda will be announced in a few weeks. The current thinking, however, is to center the agenda on Rust interest groups and domain working groups, those brave explorers who are trying to put Rust to use on all kinds of interesting domains, such as <a href="https://gamedev.rs/">game development</a>, <a href="https://github.com/The-DevX-Initiative/RCIG_Coordination_Repo">cryptography</a>, <a href="https://github.com/rust-ml/wg">machine learning</a>, <a href="https://github.com/rust-formal-methods/">formal verification</a>, and <a href="https://github.com/rust-embedded/wg">embedded development</a>. If you run an interest group and I didn&rsquo;t list your group here, perhaps you want to get in touch! We&rsquo;ll be talking about how these groups operate and how we can do a better job of connecting interest groups with the Rust org.</p>
<h3 id="will-there-be-a-social-hour">Will there be a social hour?</h3>
<p>Absolutely! The social hour has been an increasingly popular feature of the CTCFT meeting. It will take place after the meeting (<a href="https://everytimezone.com/s/c9e1dd2f">22:00 EST</a>).</p>
<h3 id="how-can-i-get-this-on-my-calendar">How can I get this on my calendar?</h3>
<p>The CTCFT meetings are announced on this <a href="https://calendar.google.com/calendar/u/0?cid=NnU1cnJ0Y2U2bHJ0djA3cGZpM2RhbWdqdXNAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ">Google calendar</a>.</p>
<h3 id="wait-what-about-august">Wait, what about August?</h3>
<p>Perceptive readers will note that there was no CTCFT meeting in August. That&rsquo;s because I and many others were on vacation. =)</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/ctcft" term="ctcft" label="CTCFT"/></entry><entry><title type="html">CTCFT 2021-07-19 Agenda</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/07/12/ctcft-2021-07-19-agenda/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/07/12/ctcft-2021-07-19-agenda/</id><published>2021-07-12T00:00:00+00:00</published><updated>2021-07-12T00:00:00+00:00</updated><content type="html"><![CDATA[<p><img src="https://raw.githubusercontent.com/rust-ctcft/ctcft/main/img/camprust.png" width="222" style="float:left;"/> The next <a href="https://rust-ctcft.github.io/ctcft/">&ldquo;Cross Team Collaboration Fun Times&rdquo; (CTCFT)</a> meeting will take place one week from today, on 2021-07-19 (<a href="https://everytimezone.com/s/0b504718">in your time zone</a>)! What follows are the abstracts for the talks we have planned. You&rsquo;ll find the full details (along with a calendar event, zoom details, etc) <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-07-19.html">on the CTCFT website</a>.</p>
<div style="clear:both;"></div>
<h3 id="mentoring">Mentoring</h3>
<p><em>Presented by: <a href="https://github.com/doc-jones/">doc-jones</a></em></p>
<p>The Rust project has a number of mechanisms for getting people involved in the project, but most are oriented around 1:1 engagement. Doc has been investigating some of the ways that other projects engage contributors, such as Python&rsquo;s <a href="https://www.mentored-sprints.dev/">mentored sprints</a>. She will discuss how some of those projects run things and share some ideas about how that might be applied in the Rust project.</p>
<h3 id="lang-team-initiative-process">Lang team initiative process</h3>
<p><em>Presented by: <a href="https://github.com/joshtriplett/">joshtriplett</a></em></p>
<p>The lang team recently established a new process we call <a href="https://github.com/rust-lang/lang-team/pull/105"><em>initiatives</em></a>. This is a refinement of the RFC process to include more explicit staging. Josh will talk about the new process, what motivated it, and how we&rsquo;re trying to build more sustainable processes.</p>
<h3 id="driving-discussions-via-postmortem-analysis">Driving discussions via postmortem analysis</h3>
<p><em>Presented by: TBD</em></p>
<p>Innovation means taking risks, and risky behavior sometimes leads to process failures. An example of a recent process failure was the Rust 1.52.0 release and the <a href="https://blog.rust-lang.org/2021/05/10/Rust-1.52.1.html">1.52.1 patch release</a> that followed a few days later. Every failure presents an opportunity to learn from our mistakes and correct our processes going forward. In response to the 1.52.0 event, the compiler team recently went through a <a href="https://hackmd.io/DhKzaRUgTVGSmhW8Mj0c8A">&ldquo;course correction&rdquo; postmortem process</a> inspired by the &ldquo;Correction of Error&rdquo; reviews that pnkfelix has observed at Amazon. This talk describes the structure of a formal postmortem, and discusses how other Rust teams might deploy similar postmortem activities for themselves.</p>
<h3 id="afterwards-social-hour">Afterwards: Social hour</h3>
<p>After the CTCFT this week, we are going to try an experimental <strong>social hour</strong>. The hour will be coordinated in the #ctcft stream of the rust-lang Zulip. The idea is to create breakout rooms where people can gather to talk, hack together, or just chill.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/ctcft" term="ctcft" label="CTCFT"/></entry><entry><title type="html">CTCFT Social Hour</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/06/18/ctcft-social-hour/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/06/18/ctcft-social-hour/</id><published>2021-06-18T00:00:00+00:00</published><updated>2021-06-18T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hey everyone! At the <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-06-21.html">CTCFT meeting this Monday (2021-06-21)</a>, we&rsquo;re going to try a &ldquo;social hour&rdquo;. The idea is really simple: for the hour after the meeting, we will create breakout rooms in Zoom with different themes. You can join any breakout room you like and hang out.</p>
<p>The themes for the breakout rooms will be based on suggestions. If you have an idea for a room you&rsquo;d like to try, you can post it in a <a href="https://rust-lang.zulipchat.com/#narrow/stream/286036-ctcft/topic/social.20hour.202021-06-21">dedicated topic on the #ctcft Zulip stream</a>. Or, if you see somebody else has posted an idea that you like, then add a 👍 emoji. We&rsquo;ll create the final breakout list based on what we see there.</p>
<p>The breakout rooms can be as casual or focused as you like. For example, we will have some default rooms for hanging out &ndash; please make suggestions for icebreaker topics on Zulip! We also plan to have some rooms where people are chatting while doing Rust work: for example, <a href="https://zulip-archive.rust-lang.org/286036ctcft/69346socialhour20210621.html#243077876">yaahc suggested</a> a room for folks who want to write mentoring instructions.</p>
<p>Also: a reminder that there is a <a href="https://calendar.google.com/calendar/u/0/embed?src=7n0vvoqfe0kbnk6i04uiu52t30@group.calendar.google.com&amp;ctz=America/New_York">CTCFT Calendar</a> that you can subscribe to in order to be reminded of future meetings. If you like, I can add you to the invite, just ask on Zulip or Discord.</p>
<p>See you there!</p>]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/ctcft" term="ctcft" label="CTCFT"/></entry><entry><title type="html">CTCFT 2021-06-21 Agenda</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/06/14/ctcft-2021-06-21-agenda/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/06/14/ctcft-2021-06-21-agenda/</id><published>2021-06-14T00:00:00+00:00</published><updated>2021-06-14T00:00:00+00:00</updated><content type="html"><![CDATA[<p><img src="https://raw.githubusercontent.com/rust-ctcft/ctcft/main/img/camprust.png" width="222" style="float:left;"/> The second <a href="https://rust-ctcft.github.io/ctcft/">&ldquo;Cross Team Collaboration Fun Times&rdquo; (CTCFT)</a> meeting will take place one week from today, on 2021-06-21 (<a href="https://everytimezone.com/s/5f09e412">in your time zone</a>)! This post describes the main agenda items for the meeting; you&rsquo;ll find the full details (along with a calendar event, zoom details, etc) <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-06-21.html">on the CTCFT website</a>.</p>
<div style="clear:both;"></div>
<h3 id="afterwards-social-hour">Afterwards: Social hour</h3>
<p>After the CTCFT this week, we are going to try an experimental <strong>social hour</strong>. The hour will be coordinated in the #ctcft stream of the rust-lang Zulip. The idea is to create breakout rooms where people can gather to talk, hack together, or just chill.</p>
<h3 id="turbowish-and-tokio-console">Turbowish and Tokio console</h3>
<p><em>Presented by: <a href="https://github.com/pnkfelix/">pnkfelix</a> and <a href="https://github.com/hawkw/">Eliza (hawkw)</a></em></p>
<p>Rust programs are known for being performant and correct &ndash; but what about when that&rsquo;s not true? Unfortunately, the state of the art for Rust tooling today can often be a bit difficult. This is particularly true for Async Rust, where users need insights into the state of the async runtime so that they can resolve deadlocks and tune performance. This talk discusses what top-notch debugging and tooling for Rust might look like. One particularly exciting project in this area is <a href="https://github.com/tokio-rs/console">tokio-console</a>, which lets users visualize the state of projects built on the tokio library.</p>
<h3 id="guiding-principles-for-rust">Guiding principles for Rust</h3>
<p><em>Presented by: <a href="https://github.com/nikomatsakis/">nikomatsakis</a></em></p>
<p>As Rust grows, we need to ensure that it retains a coherent design. Establishing a set of &ldquo;guiding principles&rdquo; is one mechanism for doing that. Each principle captures a goal that Rust aims to achieve, such as ensuring correctness, or efficiency. The principles give us a shared vocabulary to use when discussing designs, and they are ordered so as to give guidance in resolving tradeoffs. This talk will walk through a draft set of guiding principles for Rust that <a href="https://github.com/nikomatsakis/">nikomatsakis</a> has been working on, along with examples of how those principles are enacted through Rust&rsquo;s language, library, and tooling.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/ctcft" term="ctcft" label="CTCFT"/></entry><entry><title type="html">Edition: the song</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/05/26/edition-the-song/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/05/26/edition-the-song/</id><published>2021-05-26T00:00:00+00:00</published><updated>2021-05-26T00:00:00+00:00</updated><content type="html"><![CDATA[<p>You may have heard that <a href="https://blog.rust-lang.org/2021/05/11/edition-2021.html">the Rust 2021 Edition is coming</a>. Along with my daughter Daphne, I have recorded a little song in honor of the occasion! The full lyrics are below &ndash; if you feel inspired, please make your own version!<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> Enjoy!</p>
<h3 id="video">Video</h3>
<p>Watch the movie embedded here, or <a href="https://youtu.be/q0aNduqb2Ro">watch it on YouTube</a>:</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
      <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/q0aNduqb2Ro?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"></iframe>
    </div>

<h3 id="lyrics">Lyrics</h3>
<p>(Spoken)<br/>
Breaking changes where no code breaks.<br/>
Sounds impossible, no?<br/>
But in the Rust language, you might say that we like to do impossible things.<br/>
It isn&rsquo;t easy.<br/>
You may ask, how do we manage such a thing?<br/>
That I can tell you in one word&hellip; Edition!<br/></p>
<p>(Chorus)<br/>
Edition, edition&hellip; edition!</p>
<p>(Lang)<br/>
Who day and night<br/>
Is searching for a change<br/>
Whatever they can do<br/>
So Rust&rsquo;s easier for you<br/>
Who sometimes finds<br/>
They have to tweak the rules<br/>
And change a thing or two in Rust?</p>
<p>(All)<br/>
The lang team, the lang team&hellip; edition!<br/>
The lang team, the lang team&hellip; edition!</p>
<p>(Libs)<br/>
Who designs the traits that we use each day?<br/>
All the time, in every way?<br/>
Who updates the prelude so that we can call<br/>
The methods that we want no sweat</p>
<p>(All)<br/>
The libs team, the libs team&hellip; edition!<br/>
The libs team, the libs team&hellip; edition!</p>
<p>(Users)<br/>
Three years ago I changed my code<br/>
to Rust twenty eighteen<br/>
Some dependencies did not<br/>
But they&hellip; kept working.</p>
<p>(All)<br/>
The users, the users&hellip; edition!<br/>
The users, the users&hellip; edition!</p>
<p>(Tooling)<br/>
And who does all this work<br/>
To patch and tweak and fix<br/>
Migrating all our code<br/>
Each edition to the next</p>
<p>(All)<br/>
The tooling, the tooling&hellip; edition!<br/>
The tooling, the tooling&hellip; edition!</p>
<p>(Spoken)<br/>
And here in Rust, we&rsquo;ve always had our little slogans.<br/>
For instance, abstraction&hellip; without overhead.<br/>
Concurrency&hellip; without data races.<br/>
Stability&hellip; without stagnation.<br/>
Hack&hellip; without fear.<br/>
But we couldn&rsquo;t do all of those things&hellip;<br/>
not without&hellip;<br/>
Edition!</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>OMG, that would be amazing. I&rsquo;ll update the post with any such links I find.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">CTCFTFTW</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/05/14/ctcftftw/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/05/14/ctcftftw/</id><published>2021-05-14T00:00:00+00:00</published><updated>2021-05-14T00:00:00+00:00</updated><content type="html"><![CDATA[<p><a href="https://rust-ctcft.github.io/ctcft/meetings/2021-05-17.html">This Monday</a> I am starting something new: a monthly meeting called the <a href="https://rust-ctcft.github.io/ctcft/">&ldquo;Cross Team Collaboration Fun Times&rdquo; (CTCFT)</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. Check out our nifty logo<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>:</p>
<p><img src="https://raw.githubusercontent.com/rust-ctcft/ctcft/main/img/camprust.png" alt="Logo"></p>
<p>The meeting is a mechanism to help keep the members of the Rust teams in sync and in touch with one another. The idea is to focus on topics of broad interest (more than two teams):</p>
<ul>
<li>Status updates on far-reaching projects that could affect multiple teams;</li>
<li>Experience reports about people trying new things (sometimes succeeding, sometimes not);</li>
<li>&ldquo;Rough draft&rdquo; proposals that are ready to be brought before a wider audience.</li>
</ul>
<p>The meeting will focus on things that could either offer insights that might affect the work you&rsquo;re doing, or where the presenter would like to pose questions to the Rust teams and get feedback.</p>
<p>I announced the meeting some time back to <code>all@rust-lang.org</code>, but I wanted to make a broader announcement as well. <strong>This meeting is open for anyone to come and observe.</strong> This is by design. Even though the meeting is primarily meant as a forum for the members of the Rust teams, it can be hard to define the borders of a community like ours. I&rsquo;m hoping we&rsquo;ll get people who work on major Rust libraries in the ecosystem, for example, or who work on the various Rust teams that have come into being.</p>
<p>The first meeting is scheduled for <a href="https://everytimezone.com/s/675bc61f">2021-05-17 at 15:00 Eastern</a> and you will find the <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-05-17.html">agenda</a> on the <a href="https://rust-ctcft.github.io/ctcft/">CTCFT website</a>, along with links to the <a href="https://hackmd.io/@rust-ctcft">slides</a> (still a work-in-progress as of this writing!). There is also a twitter account <a href="https://twitter.com/rustctcft">@RustCTCFT</a> and a <a href="https://calendar.google.com/calendar/embed?src=7n0vvoqfe0kbnk6i04uiu52t30%40group.calendar.google.com&amp;ctz=America%2FNew_York">Google calendar</a> that you can subscribe to.</p>
<p>I realize the limitations of a synchronous meeting. Due to the reality of time zones and a volunteer project, for example, we&rsquo;ll never be able to get all of Rust&rsquo;s global community to attend at once. <a href="https://rust-ctcft.github.io/ctcft/faq.html#what-can-we-do-to-make-this-accessible-to-people-around-the-globe">I&rsquo;ve designed the meeting to work well even if you can&rsquo;t attend</a>: the goal is to have a place to <em>start</em> conversations, not to <em>finish</em> them. Agendas are announced well in advance and the meetings are recorded. We&rsquo;re also rotating times &ndash; the <a href="https://rust-ctcft.github.io/ctcft/meetings/2021-06-21.html">next meeting on 2021-06-21 takes place at 21:00 Eastern time</a>, for example.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>Hope to see you there!</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In keeping with Rust&rsquo;s long-standing tradition of ridiculous acronyms.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Thanks to <a href="https://twitter.com/xfactor521">@Xfactor521</a>! 🙏&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>The agenda is still TBD. I&rsquo;ll tweet when we get it lined up. We&rsquo;re not announcing <em>that</em> far in advance! 😂&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/ctcft" term="ctcft" label="CTCFT"/></entry><entry><title type="html">[AiC] Vision Docs!</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/05/01/aic-vision-docs/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/05/01/aic-vision-docs/</id><published>2021-05-01T00:00:00+00:00</published><updated>2021-05-01T00:00:00+00:00</updated><content type="html"><![CDATA[<p>The <a href="https://rust-lang.github.io/wg-async-foundations/vision.html">Async Vision Doc</a> effort has been going now for <a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">about 6 weeks</a>. It&rsquo;s been a fun ride, and I&rsquo;ve learned a lot. It seems like a good time to take a step back and start talking a bit about the vision doc structure and the process. In this post, I&rsquo;m going to focus on the role that I see vision docs playing in Rust&rsquo;s planning and decision making, particularly as compared to RFCs.</p>
<h3 id="vision-docs-frame-rfcs">Vision docs frame RFCs</h3>
<p>If you look at a description of the design process for a new Rust feature, it usually starts with &ldquo;write an RFC&rdquo;. After all, before we start work on something, we begin with an RFC that both motivates and details the idea. We then proceed to implementation and stabilization.</p>
<p>But the RFC process isn&rsquo;t really the beginning. The process really begins with identifying some sort of problem<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> &ndash; something that doesn&rsquo;t work, or which doesn&rsquo;t work as well as it could. The next step is imagining what you would like it to be like, and then thinking about how you could make that future into reality.</p>
<p>We&rsquo;ve always done this sort of &ldquo;framing&rdquo; when we work on RFCs. In fact, RFCs are often just one small piece of a larger picture. Think about something like <code>impl Trait</code>, which began with an intentionally conservative step (<a href="https://github.com/rust-lang/rfcs/pull/1522">RFC #1522</a>) and has been gradually extended. Async Rust started the same way; in that case, though, even the first RFC was split into two, which together described a complete first step (<a href="https://github.com/rust-lang/rfcs/pull/2394">RFC #2394</a> and <a href="https://github.com/rust-lang/rfcs/pull/2592">RFC #2592</a>).</p>
<p>The role of a vision doc is to take that implicit framing and make it explicit. Vision docs capture both the problem and the end-state that we hope to reach, and they describe the first steps we plan to take towards that end-state.</p>
<h3 id="the-shiny-future-of-vision-docs">The &ldquo;shiny future&rdquo; of vision docs</h3>
<p>There are many efforts within the Rust project that could benefit from vision docs. Think of long-running efforts like const generics or <a href="https://smallcultfollowing.com/babysteps/blog/2020/04/09/libraryification/">library-ification</a>. There is a future we are trying to make real, but it doesn&rsquo;t really exist in written form.</p>
<p>I can say that when the lang team is asked to approve an RFC relating to some incremental change in a long-running effort, it&rsquo;s very difficult for me to do. I need to be able to put that RFC into context. What is the latest plan we are working towards? How does this RFC take us closer? Sometimes there are parts of that plan that I have doubts about &ndash; does this RFC lock us in, or does it keep our options open? Having a vision doc that I could return to and evolve over time would be a tremendous boon.</p>
<p>I&rsquo;m also excited about the potential for &lsquo;interlocking&rsquo; vision docs. While working on the Async Vision Doc, for example, I&rsquo;ve found myself wanting to write examples that describe error handling. It&rsquo;d be really cool if I could pop over to the <a href="https://github.com/rust-lang/project-error-handling">Error Handling Project Group</a><sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>, take a look at their vision doc, and then make use of what I see there in my own examples. It might even help me to identify a conflict before it happens.</p>
<h3 id="start-with-the-status-quo">Start with the &ldquo;status quo&rdquo;</h3>
<p>A key part of the vision doc is that it starts by documenting the <a href="https://rust-lang.github.io/wg-async-foundations/vision/status_quo.html">&ldquo;status quo&rdquo;</a>. It&rsquo;s all too easy to take the &ldquo;status quo&rdquo; for granted &ndash; to assume that everybody understands how things play out today.</p>
<p>When we started writing &ldquo;status quo&rdquo; stories, it was really hard to focus on the &ldquo;status quo&rdquo;. It&rsquo;s really tempting to jump straight to ideas for how to fix things. It took discipline to force ourselves to just focus on describing and understanding the current state.</p>
<p>I&rsquo;m really glad we did though. If you haven&rsquo;t done so already, take a moment to browse through the <a href="https://rust-lang.github.io/wg-async-foundations/vision/status_quo.html">status quo</a> section of the doc (you may find the <a href="https://rust-lang.github.io/wg-async-foundations/vision/status_quo.html#metanarrative">metanarrative</a> helpful to get an overview<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>). Reading those stories has given me a much deeper understanding of how Async is working in practice, both at a technical level and in terms of its impact on people. This is true even when presenting highly technical context. Consider stories like <a href="https://rust-lang.github.io/wg-async-foundations/vision/status_quo/barbara_builds_an_async_executor.html">Barbara builds an async executor</a> or <a href="https://rust-lang.github.io/wg-async-foundations/vision/status_quo/barbara_carefully_dismisses_embedded_future.html">Barbara carefully dismisses embedded future</a>. For me, stories like this have more resonance than just seeing a list of the technical obstacles one must overcome. They also help us talk about the various &ldquo;dead-ends&rdquo; that might otherwise get forgotten.</p>
<p>Those kinds of dead-ends are especially important for people new to Rust, of course, who are likely to just give up and learn something else if the going gets too rough. In working on Rust, we&rsquo;ve always found that focusing on accessibility and the needs of new users is a great way to identify things that &ndash; once fixed &ndash; wind up helping everyone. It&rsquo;s interesting to think how long we put off doing NLL. After all, <a href="https://github.com/metajack">metajack</a> filed <a href="https://github.com/rust-lang/rust/issues/6393">#6393</a> in 2013, and I remember people raising it with me earlier. But those of us who were experienced in Rust knew the workarounds, so it never seemed pressing, and hence NLL got put off until 2018.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> But now it&rsquo;s clearly one of the most impactful changes we&rsquo;ve made to Rust for users at all levels.</p>
<h3 id="brainstorming-the-shiny-future">Brainstorming the &ldquo;shiny future&rdquo;</h3>
<p>A few weeks back, we <a href="https://blog.rust-lang.org/2021/04/14/async-vision-doc-shiny-future.html">started writing &ldquo;shiny future&rdquo; stories</a> (in addition to &ldquo;status quo&rdquo;). The &ldquo;shiny future&rdquo; stories are the point where we try to imagine what Rust could be like in a few years.</p>
<p>Ironically, although in the beginning the &ldquo;shiny future&rdquo; was all we could think about, getting a lot of &ldquo;shiny future&rdquo; stories up and posted has been rather difficult. It turns out to be hard to figure out what the future should look like!<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<p>Writing &ldquo;shiny future&rdquo; stories sounds a bit like an RFC, but it&rsquo;s actually quite different:</p>
<ul>
<li>The focus is on the end user experience, not the details of how it works.</li>
<li>We want to think a bit past what we know how to do. The goal is to &ldquo;shake off&rdquo; the limits of incremental improvement and look for ways to really improve things in a big way.</li>
<li>We&rsquo;re not making commitments. This is a brainstorming session, so it&rsquo;s fine to have multiple contradictory shiny futures.</li>
</ul>
<p>In a way, it&rsquo;s like writing <em>just</em> the &ldquo;guide section&rdquo; of an RFC, except that it&rsquo;s not written as a manual but in narrative form.</p>
<h3 id="collaborative-writing-sessions">Collaborative writing sessions</h3>
<p>To try and make the writing process more fun, we started running <a href="https://smallcultfollowing.com/babysteps/blog/2021/04/26/async-vision-doc-writing-sessions-vii/">collaborative Vision Doc Writing Sessions</a>. We were focused purely on status quo stories at the time. The idea was simple &ndash; find people who had used Rust and get them to talk about their experiences. At the end of the session, we would have a &ldquo;nearly complete&rdquo; outline of a story that we could hand off to someone to finish.<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></p>
<p>The sessions work particularly well when you are telling the story of people who were actually in the session. Then you can simply ask them questions to find out what happened. How did you start? What happened next? How did you feel then? Did you try anything else in between? If you&rsquo;re working from blog posts, you sometimes have to take guesses and try to imagine what might have happened.<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup></p>
<p>One thing to watch out for: I&rsquo;ve noticed people tend to jump steps when they narrate. They&rsquo;ll say something like &ldquo;so then I decided to use <code>FuturesUnordered</code>&rdquo;, but it&rsquo;s interesting to find out how they made that decision. How did they learn about <code>FuturesUnordered</code>? Those details will be important later, because if you develop some superior alternative, you have to be sure people will find it.</p>
<h3 id="shifting-to-the-shiny-future">Shifting to the &ldquo;shiny future&rdquo;</h3>
<p>Applying the &ldquo;collaborative writing session&rdquo; idea to the shiny future has been more difficult. If you get a bunch of people in one session, they may not agree on what the future should be like.</p>
<p>Part of the trick is that, with shiny future, you often want to go for breadth rather than depth. It&rsquo;s not just about writing one story, it&rsquo;s about exploring the design space. That leads to a different style of writing session, but you wind up with a scattershot set of ideas, not with a &lsquo;nearly complete&rsquo; story, and it&rsquo;s hard to hand those off.</p>
<p>I&rsquo;ve got a few ideas of things I would like to try when it comes to future writing sessions. One of them is that I would like to work directly with various luminaries from the Async Rust world to make sure their point-of-view is represented in the doc.</p>
<p>Another idea is to try and encourage more &ldquo;end-to-end&rdquo; stories that weave together the &ldquo;most important&rdquo; substories and give a sense of prioritization. After all, we know that <a href="https://rust-lang.github.io/wg-async-foundations/vision/status_quo/barbara_battles_buffered_streams.html">there are subtle footguns in the model as is</a> and we also know that <a href="https://rust-lang.github.io/wg-async-foundations/vision/status_quo/alan_has_an_event_loop.html">integrating into external event loops is tricky</a>. Ideally, we&rsquo;d fix both. But which is a bigger obstacle to Async Rust users? In fact, I imagine that there is no single answer. The answer will depend on what people are doing with Async Rust.</p>
<h3 id="after-brainstorming-consolidating-the-doc-and-building-a-roadmap">After brainstorming: Consolidating the doc and building a roadmap</h3>
<p>The brainstorming period is scheduled to end mid-May. At that point comes the next phase, which is when we try to sort out all the contradictory shiny future stories into one coherent picture. I envision this process being led by the async working group leads (tmandry and I), but it&rsquo;s going to require a lot of consensus building as well.</p>
<p>In addition to building up the shiny future, part of this process will be deciding on a concrete roadmap. The roadmap will describe the specific first steps we will take towards this shiny future. The roadmap items will correspond to particular designs and work items. And here, with those specific work items, is where we get to RFCs: when those work items call for new stdlib APIs or extensions to the language, we will write RFCs that specify them. But those RFCs will be able to reference the vision doc to explain their motivation in more depth.</p>
<h3 id="living-document-adjusting-the-shiny-future-as-we-go">Living document: adjusting the &ldquo;shiny future&rdquo; as we go</h3>
<p>There is one thing I want to emphasize: <strong>the &ldquo;shiny future&rdquo; stories we write today will be wrong</strong>. As we work on those first steps that appear in the roadmap, we are going to learn things. We&rsquo;re going to realize that the experience we wanted to build is not possible &ndash; or perhaps that it&rsquo;s not even desirable! That&rsquo;s fine. We&rsquo;ll adjust the vision doc periodically as we go. We&rsquo;ll figure out the process for that when the time comes, but I imagine it may be a similar &ndash; but foreshortened &ndash; version of the one we have used to draft the initial version.</p>
<h3 id="conclusion">Conclusion</h3>
<p>Ack! It&rsquo;s probably pretty obvious that I&rsquo;m excited about the potential for vision docs. I&rsquo;ve got a lot of things I want to say about them, but this post is getting pretty long. There are a lot of interesting questions to poke at, most of which I don&rsquo;t know the answers to yet. Some of the things on my mind: what are the best roles for the characters and should we tweak how they are defined<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup>? Can we come up with good heuristics for which character to use for which story? How are the &ldquo;consolidation&rdquo; and &ldquo;iteration / living document&rdquo; phases going to work? When is the appropriate time to write a vision doc &ndash; right away, or should you wait until you&rsquo;ve done enough work to have a clearer picture of what the future looks like? Are there lighterweight versions of the process? We&rsquo;re going to figure these things out as we go, and I will write some follow-up posts talking about them.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Not problem, opportunity!&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>And &ndash; heck &ndash; we&rsquo;re still working towards <a href="https://github.com/rust-lang/polonius/">Polonius</a>!&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Not my actual reason. I don&rsquo;t know my actual reason, it just seems right.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Shout out to the error handling group, they&rsquo;re doing great stuff!&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Did I mention we have <strong>34 stories</strong> so far (and more in open PRs)? So cool. Keep &rsquo;em coming!&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>To be fair, it was also because designing and implementing NLL was really, really hard.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Who knew?&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>Big, big shout-out to <a href="https://rust-lang.github.io/wg-async-foundations/acknowledgments.html#-participating-in-an-writing-session">all those folks who have participated</a>, and especially those <a href="https://rust-lang.github.io/wg-async-foundations/acknowledgments.html#-directly-contributing">brave souls who authored stories</a>.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>One thing that&rsquo;s great, though, is that after you post the story, you can <a href="https://github.com/rust-lang/wg-async-foundations/pull/172#issuecomment-826156660">ping people</a> and ask them if you got it right. =)&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>I feel pretty strongly that four characters is the right number (<a href="https://en.wikipedia.org/wiki/File:Fantastic_Four_2015_poster.jpg#/media/File:Fantastic_Four_2015_poster.jpg">it worked for Marvel</a>, it will work for us!)<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, but I&rsquo;m not sure if we got their setup right in other respects.&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Async Vision Doc Writing Sessions VII</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/04/26/async-vision-doc-writing-sessions-vii/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/04/26/async-vision-doc-writing-sessions-vii/</id><published>2021-04-26T00:00:00+00:00</published><updated>2021-04-26T00:00:00+00:00</updated><content type="html"><![CDATA[<p>My week is very scheduled, so I am not able to host any public drafting sessions
this week &ndash; however, <a href="https://github.com/rylev/">Ryan Levick</a> will be hosting two sessions!</p>
<table>
  <thead>
      <tr>
          <th>When</th>
          <th>Who</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://everytimezone.com/s/2e8907b6">Wed at 07:00 ET</a></td>
          <td>Ryan</td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/6815593b">Fri at 07:00 ET</a></td>
          <td>Ryan</td>
      </tr>
  </tbody>
</table>
<p>If you&rsquo;re available and those stories sound like something that interests you, please join him! Just ping me or Ryan on Discord or Zulip and we&rsquo;ll send you the Zoom link. If you&rsquo;ve already joined a previous session, the link is the same as before.</p>
<h3 id="extending-the-schedule-by-two-weeks">Extending the schedule by two weeks</h3>
<p>We had previously set 2021-04-30 as the end-date, but I proposed in a recent PR to <a href="https://github.com/rust-lang/wg-async-foundations/pull/173">extend that end date to 2021-05-14</a>. We&rsquo;ve been learning how this whole vision doc thing works as we go, and it seems clear we&rsquo;re going to want more time to finish off status quo stories and write shiny future stories before we feel we&rsquo;ve really explored the design space.</p>
<h3 id="the-visionwhat">The vision&hellip;what?</h3>
<p>Never heard of the async vision doc? It&rsquo;s a new thing we&rsquo;re trying as part of the Async Foundations Working Group:</p>
<blockquote>
<p>We are launching a collaborative effort to build a shared <a href="https://rust-lang.github.io/wg-async-foundations/vision.html#-the-vision">vision document</a> for Async Rust. <strong>Our goal is to engage the entire community in a collective act of the imagination:</strong> how can we make the end-to-end experience of using Async I/O not only a pragmatic choice, but a <em>joyful</em> one?</p>
</blockquote>
<p><a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">Read the full blog post for more.</a></p>
]]></content></entry><entry><title type="html">Async Vision Doc Writing Sessions VI</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/04/19/async-vision-doc-writing-sessions-vi/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/04/19/async-vision-doc-writing-sessions-vi/</id><published>2021-04-19T00:00:00+00:00</published><updated>2021-04-19T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Ryan Levick and I are going to be hosting more Async Vision Doc Writing Sessions this week. We&rsquo;re not organized enough to have assigned topics yet, so I&rsquo;m just going to post the dates/times and we&rsquo;ll be tweeting about the particular topics as we go.</p>
<table>
  <thead>
      <tr>
          <th>When</th>
          <th>Who</th>
          <th></th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://everytimezone.com/s/a0fb71ea">Wed at 07:00 ET</a></td>
          <td>Ryan</td>
          <td></td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/7b83105a">Wed at 15:00 ET</a></td>
          <td>Niko</td>
          <td></td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/6787fee2">Fri at 07:00 ET</a></td>
          <td>Ryan</td>
          <td></td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/3068c5cd">Fri at 14:00 ET</a></td>
          <td>Niko</td>
          <td></td>
      </tr>
  </tbody>
</table>
<p>If you&rsquo;ve joined before, we&rsquo;ll be re-using the same Zoom link. If you haven&rsquo;t joined, then send a private message to one of us and we&rsquo;ll share the link. Hope to see you there!</p>
]]></content></entry><entry><title type="html">Async Vision Doc Writing Sessions V</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/04/12/async-vision-doc-writing-sessions-v/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/04/12/async-vision-doc-writing-sessions-v/</id><published>2021-04-12T00:00:00+00:00</published><updated>2021-04-12T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This is an exciting week for the vision doc! As of this week, we are starting to
draft &ldquo;shiny future&rdquo; stories, and we would like your help! (We are also still
working on status quo stories, so there is no need to stop working on those.)
There will be a blog post coming out on the main Rust blog soon with all the
details, but you can go to the <a href="https://rust-lang.github.io/wg-async-foundations/vision/how_to_vision/shiny_future.html">&ldquo;How to vision: Shiny future&rdquo;</a> page now.</p>
<p>This week, Ryan Levick and I are going to be hosting four Async
Vision Doc Writing Sessions. Here is the schedule:</p>
<table>
  <thead>
      <tr>
          <th>When</th>
          <th>Who</th>
          <th>Topic</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://everytimezone.com/s/a0929910">Wed at 07:00 ET</a></td>
          <td>Ryan</td>
          <td>TBD</td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/c8bf3225">Wed at 15:00 ET</a></td>
          <td>Niko</td>
          <td>Shiny future &ndash; <a href="https://rust-lang.github.io/wg-async-foundations/vision/status_quo/niklaus_simulates_hydrodynamics.html">Niklaus simulates hydrodynamics</a></td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/a919950c">Fri at 07:00 ET</a></td>
          <td>Ryan</td>
          <td>TBD</td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/7db78195">Fri at 14:00 ET</a></td>
          <td>Niko</td>
          <td>Shiny future &ndash; <a href="https://github.com/rust-lang/wg-async-foundations/issues/45">Portability across runtimes</a></td>
      </tr>
  </tbody>
</table>
<p>The idea for shiny future is to start by looking at the existing stories we
have and to imagine how they might go differently. To be quite honest,
I am not entirely sure how this is going to work, but we&rsquo;ll figure it out together.
It&rsquo;s going to be fun. =) Come join!</p>
]]></content></entry><entry><title type="html">Async Vision Doc Writing Sessions IV</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/04/07/async-vision-doc-writing-sessions-iv/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/04/07/async-vision-doc-writing-sessions-iv/</id><published>2021-04-07T00:00:00+00:00</published><updated>2021-04-07T00:00:00+00:00</updated><content type="html"><![CDATA[<p>My week is very scheduled, so I am not able to host any public drafting sessions
this week &ndash; however, <a href="https://github.com/rylev/">Ryan Levick</a> will be hosting two sessions!</p>
<table>
  <thead>
      <tr>
          <th>When</th>
          <th>Who</th>
          <th>Topic</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://everytimezone.com/s/e2dce418">Thu at 07:00 ET</a></td>
          <td>Ryan</td>
          <td>The need for Async Traits</td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/7900bcf1">Fri at 07:00 ET</a></td>
          <td>Ryan</td>
          <td>Challenges from cancellation</td>
      </tr>
  </tbody>
</table>
<p>If you&rsquo;re available and those stories sound like something that interests you, please join him! Just ping me or Ryan on Discord or Zulip and we&rsquo;ll send you the Zoom link. If you&rsquo;ve already joined a previous session, the link is the same as before.</p>
<h3 id="sneak-peek-next-week">Sneak peek: Next week</h3>
<p>Next week, we will be holding more vision doc writing sessions. We are now going to expand the scope to go beyond &ldquo;status quo&rdquo; stories and cover &ldquo;shiny future&rdquo; stories as well. Keep your eyes peeled for a post on the Rust blog and further updates!</p>
<h3 id="the-visionwhat">The vision&hellip;what?</h3>
<p>Never heard of the async vision doc? It&rsquo;s a new thing we&rsquo;re trying as part of the Async Foundations Working Group:</p>
<blockquote>
<p>We are launching a collaborative effort to build a shared <a href="https://rust-lang.github.io/wg-async-foundations/vision.html#-the-vision">vision document</a> for Async Rust. <strong>Our goal is to engage the entire community in a collective act of the imagination:</strong> how can we make the end-to-end experience of using Async I/O not only a pragmatic choice, but a <em>joyful</em> one?</p>
</blockquote>
<p><a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">Read the full blog post for more.</a></p>
]]></content></entry><entry><title type="html">My "shiny future"</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/04/02/my-shiny-future/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/04/02/my-shiny-future/</id><published>2021-04-02T00:00:00+00:00</published><updated>2021-04-02T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve been working on the Rust project for just about ten years. The language has evolved radically in that time, and so has the project governance. When I first started, for example, we communicated primarily over the <a href="https://mail.mozilla.org/pipermail/rust-dev/">rust-dev</a> mailing list and the #rust IRC channel. I distinctly remember coming into the Mozilla offices<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> one day and <a href="https://github.com/brson">brson</a> excitedly telling me, &ldquo;There were almost a dozen people on the #rust IRC channel last night! Just chatting! About Rust!&rdquo; It&rsquo;s funny to think about that now, given the scale Rust is operating at today.</p>
<h2 id="scaling-the-project-governance">Scaling the project governance</h2>
<p>Scaling the governance of the project to keep up with its growing popularity has been a constant theme. The first step was when we created a core team (initially <a href="https://github.com/pcwalton">pcwalton</a>, <a href="https://github.com/brson">brson</a>, and I) to make decisions. We needed some kind of clear decision makers, but we didn&rsquo;t want to set up a single person as &ldquo;BDFL&rdquo;. We also wanted a mechanism that would allow us to include non-Mozilla employees as equals.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>Having a core team helped us move faster for a time, but we soon found that the range of RFCs being considered was too much for one team. We needed a way to expand the set of decision makers to include focused expertise from each area. To address these problems, <a href="https://github.com/aturon">aturon</a> and I created <a href="https://rust-lang.github.io/rfcs/1068-rust-governance.html">RFC 1068</a>, which expanded from a single &ldquo;core team&rdquo; into many Rust teams, each focused on accepting RFCs and managing a particular area.</p>
<p>As written, <a href="https://rust-lang.github.io/rfcs/1068-rust-governance.html">RFC 1068</a> described a central technical role for the core team<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, but it quickly became clear that this wasn&rsquo;t necessary. In fact, it was a kind of hindrance, since it introduced unnecessary bottlenecks. In practice, the Rust teams operated quite independently from one another. This independence enabled us to move rapidly on improving Rust; the RFC process &ndash; <a href="https://mail.mozilla.org/pipermail/rust-dev/2014-March/008973.html">which we had introduced in 2014</a><sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> &ndash; provided the &ldquo;checks and balances&rdquo; that kept teams on track.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> As the project grew further, new teams like the <a href="https://internals.rust-lang.org/t/announcing-the-release-team/6561">release team</a> were created to address dedicated needs.</p>
<p>The teams were scaling well, but there was still a bottleneck: most people who contributed to Rust were still doing so as volunteers, which ultimately limits the amount of time people can put in. This was a hard nut to crack<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>, but we&rsquo;ve finally seen progress this year, as more and more companies have been employing people to contribute to Rust. Many of them are forming entire teams for that purpose &ndash; including AWS, where I am working now. And of course I would be remiss not to mention the <a href="https://foundation.rust-lang.org/posts/2021-02-08-hello-world/">launch of the Rust Foundation</a> itself, which gives Rust a legal entity of its own and creates a forum where companies can pool resources to help Rust grow.</p>
<h2 id="my-own-role">My own role</h2>
<p>My own trajectory through Rust governance has kind of mirrored the growth of the project. I was an initial member of the core team, as I said, and after we landed <a href="https://rust-lang.github.io/rfcs/1068-rust-governance.html">RFC 1068</a> I became the lead of the compiler and language design teams. I&rsquo;ve been wearing these three hats until very recently.</p>
<p>In December, I decided to <a href="https://smallcultfollowing.com/babysteps/blog/2020/12/11/rotating-the-compiler-team-leads/">step back as lead of the compiler team</a>. I had a number of reasons for doing so, but the most important is that I want to ensure that the Rust project continues to scale and grow. For that to happen, we need to transition from one individual doing all kinds of roles to people focusing on those places where they can have the most impact.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<p><strong>Today I am announcing that I am stepping back from the Rust core team.</strong> I plan to focus all of my energies on my roles as lead of the language design team and tech lead of the <a href="https://aws.amazon.com/blogs/opensource/how-our-aws-rust-team-will-contribute-to-rusts-future-successes/">AWS Rust Platform team</a>.</p>
<h2 id="where-we-go-from-here">Where we go from here</h2>
<p>So now we come to my <a href="https://rust-lang.github.io/wg-async-foundations/vision/shiny_future.html">&ldquo;shiny future&rdquo;</a>. My goal, as ever, is to continue to help Rust pursue its vision of being an accessible systems language. Accessible to me means that we offer strong safety guarantees coupled with a focus on ergonomics and usability; it also means that we build a welcoming, inclusive, and thoughtful community. To that end, I expect to be doing more product initiatives like the <a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">async vision doc</a> to help Rust build a coherent vision for its future; I also expect to continue working on ways to <a href="https://github.com/rust-lang/lang-team/blob/master/design-meeting-minutes/2021-03-24-lang-team-organization.md">scale the lang team</a>, improve the RFC process, and help the teams function well.</p>
<p>I am so excited about all that we the Rust community have built. Rust has become a language that people not only use but that they love using. We&rsquo;ve innovated not only in the design of the language but in the design and approach we&rsquo;ve taken to our community. <a href="https://nikomatsakis.github.io/rust-latam-2019/#101">&ldquo;In case you haven&rsquo;t noticed&hellip;we&rsquo;re doing the impossible here people!&rdquo;</a> So here&rsquo;s to the next ten years!</p>
<hr>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Offices! Remember those? Actually, I&rsquo;ve been working remotely since 2013, so to be honest I barely do.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I think the first non-Mozilla member of the core team was <a href="https://huonw.github.io/">Huon Wilson</a>, but I can&rsquo;t find any announcements about it. I did find this <a href="https://internals.rust-lang.org/t/rust-team-alumni/3784">very nicely worded post by Brian Anderson</a> about Huon&rsquo;s <em>departure</em> though. &ldquo;They live on in our hearts, and in our IRC channels.&rdquo; Brilliant.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>If you read <a href="https://rust-lang.github.io/rfcs/1068-rust-governance.html">RFC 1068</a>, for example, you&rsquo;ll see some language about the core team deciding what features to stabilize. I don&rsquo;t think this happened even once: it was immediately clear that the teams were better positioned to make this decision.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>The email makes this sound like a minor tweak to the process. Don&rsquo;t be fooled. It&rsquo;s true that people had always written &ldquo;RFCs&rdquo; to the mailing list. But they weren&rsquo;t mandatory, and there was no real process around &ldquo;accepting&rdquo; or &ldquo;rejecting&rdquo; them. The RFC process was a pretty radical change, more radical I think than we ourselves even realized. The best part of it was that it was not optional for anyone, including core developers.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Better still, the RFC mechanism invites public feedback. This is important because no single team of people can really have expertise in the full range of considerations needed to design a language like Rust.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>If you look back at my Rust roadmap posts, you&rsquo;ll see that this has been a theme in <a href="https://smallcultfollowing.com/babysteps/blog/2018/01/09/rust2018/">every</a> <a href="https://smallcultfollowing.com/babysteps/blog/2019/01/07/rust-in-2019-focus-on-sustainability/">single</a> <a href="https://smallcultfollowing.com/babysteps/blog/2019/12/02/rust-2020/#many-are-stronger-than-one">one</a>.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>I kind of love <a href="https://nikomatsakis.github.io/rust-latam-2019/#109">these three slides</a> from my Rust LATAM 2019 talk, which expressed the same basic idea, but from a different perspective.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Async Vision Doc Writing Sessions III</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/03/29/async-vision-doc-writing-sessions-iii/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/03/29/async-vision-doc-writing-sessions-iii/</id><published>2021-03-29T00:00:00+00:00</published><updated>2021-03-29T00:00:00+00:00</updated><content type="html"><![CDATA[<p><a href="https://github.com/rylev/">Ryan Levick</a> and I are hosting a number of public drafting sessions scheduled this week.
Some of them are scheduled early to cover a wider range of time zones.</p>
<table>
  <thead>
      <tr>
          <th>When</th>
          <th>Who</th>
          <th>Topic</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><a href="https://everytimezone.com/s/c9869917">Tue at 14:30 ET</a></td>
          <td>Niko</td>
          <td><a href="https://github.com/rust-lang/wg-async-foundations/issues/67">wrapping C++ async APIs in Rust futures</a> and other tales of interop</td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/43be0ecf">Wed at 10:00 ET</a></td>
          <td>Niko</td>
          <td><a href="https://github.com/rust-lang/wg-async-foundations/issues/95">picking an HTTP library</a> and similar stories</td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/f775361b">Wed at 15:00 ET</a></td>
          <td>Niko</td>
          <td><a href="https://github.com/rust-lang/wg-async-foundations/issues/107">structured concurrency and parallel data processing</a></td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/e2dce418">Thu at 07:00 ET</a></td>
          <td>Ryan</td>
          <td><a href="https://github.com/rust-lang/wg-async-foundations/issues/76">debugging</a> and getting <a href="https://github.com/rust-lang/wg-async-foundations/issues/75">insights into running services</a></td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/7900bcf1">Fri at 07:00 ET</a></td>
          <td>Ryan</td>
          <td><a href="https://github.com/rust-lang/wg-async-foundations/issues/105">lack of polished common implementations of basic async helpers</a></td>
      </tr>
      <tr>
          <td><a href="https://everytimezone.com/s/92ce4ece">Fri at 14:30 ET</a></td>
          <td>Niko</td>
          <td><a href="https://github.com/rust-lang/wg-async-foundations/issues/54">bridging sync and async</a></td>
      </tr>
  </tbody>
</table>
<p>If you&rsquo;re available and those stories sound like something that interests you, please join us!
We&rsquo;re particularly interested in having people join who have had related experiences, as the goal here is to capture the details from people who&rsquo;ve been there.</p>
<p>In some cases, it may be helpful if you&rsquo;ve had similar experiences but in other ecosystems:</p>
<ul>
<li>For example, people who&rsquo;ve used Kotlin&rsquo;s coroutines would be most welcome at the Wed session discussing structured concurrency.</li>
<li>Similarly, folks who have used debuggers for other sorts of async systems (such as node.js or C#) would probably have useful info to share at Ryan&rsquo;s Thursday session.</li>
</ul>
<p>If you would like to join, ping me or Ryan on Discord or Zulip and we&rsquo;ll send you the Zoom link. If you&rsquo;ve already joined a previous session, the link is the same as before.</p>
<h3 id="the-visionwhat">The vision&hellip;what?</h3>
<p>Never heard of the async vision doc? It&rsquo;s a new thing we&rsquo;re trying as part of the Async Foundations Working Group:</p>
<blockquote>
<p>We are launching a collaborative effort to build a shared <a href="https://rust-lang.github.io/wg-async-foundations/vision.html#-the-vision">vision document</a> for Async Rust. <strong>Our goal is to engage the entire community in a collective act of the imagination:</strong> how can we make the end-to-end experience of using Async I/O not only a pragmatic choice, but a <em>joyful</em> one?</p>
</blockquote>
<p><a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">Read the full blog post for more.</a></p>
]]></content></entry><entry><title type="html">Async Vision Doc Writing Sessions II</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/03/25/async-vision-doc-writing-sessions-ii/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/03/25/async-vision-doc-writing-sessions-ii/</id><published>2021-03-25T00:00:00+00:00</published><updated>2021-03-25T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I&rsquo;m scheduling two more public drafting sessions for tomorrow, Match 26th:</p>
<ul>
<li>On March 26th at 10am ET (<a href="https://everytimezone.com/s/66582106">click to see in your local timezone</a>), we will be working on writing a story about the challenges of writing a library that can be reused across many runtimes (<a href="https://github.com/rust-lang/wg-async-foundations/issues/45">rust-lang/wg-async-foundations#45</a>);</li>
<li>On March 26th at 2pm ET (<a href="https://everytimezone.com/s/206264ec">click to see in your local timezone</a>), we will be working on writing a story about the difficulty of debugging and interpreting async stack traces (<a href="https://github.com/rust-lang/wg-async-foundations/issues/69">rust-lang/wg-async-foundations#69</a>).</li>
</ul>
<p>If you&rsquo;re available and have interest in one of those issues, please join us!
Just ping me on Discord or Zulip and I&rsquo;ll send you the Zoom link.</p>
<p>I also plan to schedule more sessions next week, so stay tuned!</p>
<h3 id="the-visionwhat">The vision&hellip;what?</h3>
<p>Never heard of the async vision doc? It&rsquo;s a new thing we&rsquo;re trying as part of the Async Foundations Working Group:</p>
<blockquote>
<p>We are launching a collaborative effort to build a shared <a href="https://rust-lang.github.io/wg-async-foundations/vision.html#-the-vision">vision document</a> for Async Rust. <strong>Our goal is to engage the entire community in a collective act of the imagination:</strong> how can we make the end-to-end experience of using Async I/O not only a pragmatic choice, but a <em>joyful</em> one?</p>
</blockquote>
<p><a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">Read the full blog post for more.</a></p>
]]></content></entry><entry><title type="html">Async Vision Doc Writing Sessions</title><link href="https://smallcultfollowing.com/babysteps/blog/2021/03/22/async-vision-doc-writing-sessions/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2021/03/22/async-vision-doc-writing-sessions/</id><published>2021-03-22T00:00:00+00:00</published><updated>2021-03-22T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hey folks! As part of the <a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">Async Vision Doc</a> effort,
I&rsquo;m planning on holding two public drafting sessions tomorrow, March 23rd:</p>
<ul>
<li>March 23rd at noon ET (<a href="https://everytimezone.com/s/4d25dc1a">click to see in your local timezone</a>)</li>
<li>March 23rd at 5pm ET (<a href="https://everytimezone.com/s/3efcf390">click to see in your local timezone</a>)</li>
</ul>
<p>During these sessions, we&rsquo;ll be looking over the <a href="https://github.com/rust-lang/wg-async-foundations/issues?q=is%3Aopen+is%3Aissue+label%3Astatus-quo-story-ideas">status quo issues</a>
and writing a story or two! If you&rsquo;d like to join, ping me on Discord
or Zulip and I&rsquo;ll send you the Zoom link.</p>
<h3 id="the-visionwhat">The vision&hellip;what?</h3>
<p>Never heard of the async vision doc? It&rsquo;s a new thing we&rsquo;re trying as part of the Async Foundations Working Group:</p>
<blockquote>
<p>We are launching a collaborative effort to build a shared <a href="https://rust-lang.github.io/wg-async-foundations/vision.html#-the-vision">vision document</a> for Async Rust. <strong>Our goal is to engage the entire community in a collective act of the imagination:</strong> how can we make the end-to-end experience of using Async I/O not only a pragmatic choice, but a <em>joyful</em> one?</p>
</blockquote>
<p><a href="https://blog.rust-lang.org/2021/03/18/async-vision-doc.html">Read the full blog post for more.</a></p>
]]></content></entry><entry><title type="html">The more things change...</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/12/30/the-more-things-change/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/12/30/the-more-things-change/</id><published>2020-12-30T00:00:00+00:00</published><updated>2020-12-30T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve got an announcement to make. <strong>As of Jan 4th, I&rsquo;m starting at Amazon as the tech lead of their new Rust team.</strong> Working at Mozilla has been a great experience, but I&rsquo;m pretty excited about this change. It&rsquo;s a chance to help shape what I hope to be an exciting new phase for Rust, where we grow from a project with a single primary sponsor (Mozilla) to an industry standard, supported by a wide array of companies. It&rsquo;s also a chance to work with some pretty awesome people &ndash; both familiar faces from the Rust community<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> and some new folks. Finally, I&rsquo;m hoping it will be an opportunity for me to refocus my attention to some long-standing projects that I really want to see through.</p>
<h3 id="new-rust-teams-are-an-opportunity-but-we-have-to-do-it-right">New Rust teams are an opportunity, but we have to do it right</h3>
<p>The goal for Rust has always been to create a language that will be used and supported by companies throughout the industry. With the <a href="https://blog.rust-lang.org/2020/12/14/Next-steps-for-the-foundation-conversation.html">imminent launch</a> of the Rust Foundation as well as the formation of new Rust teams at <a href="https://aws.amazon.com/blogs/opensource/why-aws-loves-rust-and-how-wed-like-to-help/">Amazon</a>, <a href="https://twitter.com/ryan_levick/status/1171830191804551168">Microsoft</a>, and <a href="https://twitter.com/nadavrot/status/1319003839018614784?lang=en">Facebook</a>, we are seeing that dream come to fruition. I&rsquo;m very excited about this. This is a goal I&rsquo;ve been working towards for years, and it was a <a href="https://smallcultfollowing.com/babysteps/blog/2019/12/02/rust-2020/#shifting-the-focus-from-adoption-to-investment">particular focus of mine for 2020</a>.</p>
<p>That said, I&rsquo;ve talked to a number of people in the Rust community who feel nervous about this change. After all, we&rsquo;ve worked hard to build an open source organization that values curiosity, broad collaboration, and uplifting others. As more companies form Rust teams, there&rsquo;s a chance that some of that could be lost, even if everyone has the best of intentions. While we all want to see more people paid to work on Rust, that can also result in &ldquo;part time&rdquo; contributors feeling edged out.</p>
<h3 id="working-to-support-rust-and-its-community">Working to support Rust and its community</h3>
<p>One reason that I am excited to be joining the team at Amazon is that our scope is very simple: <strong>help make Rust the best it can be</strong>.</p>
<p>In my view, &ldquo;making Rust the best it can be&rdquo; means not only doing good work, but doing that work <strong>in concert with the rest of the Rust community</strong>. That means sharing in the &ldquo;maintenance work&rdquo; of open source: reviews, bug fixes, tracking down regressions, organizing meetings, that sort of thing. But it also means expanding and nurturing the Rust teams we&rsquo;re a part of. It&rsquo;s good to fix a bug. It&rsquo;s better to find a newcomer and mentor them to fix it, or to extend the <a href="https://rustc-dev-guide.rust-lang.org/">rustc-dev-guide</a> so that it covers the code that had the bug.</p>
<p>The ultimate goal should be free and open collaboration. We&rsquo;ll know the Amazon team setup is working well if it doesn&rsquo;t really matter if the people we&rsquo;re collaborating with work at Amazon or not.</p>
<h3 id="on-pluralism-and-the-rust-organization">On pluralism and the Rust organization</h3>
<p>I want to zoom out a bit to the broader picture. As I said in the intro, we are entering a new phase for Rust, one where there are multiple active Rust teams at different companies, all working as part of the greater Rust community to build and support Rust. This is something to celebrate. I think it will go a long way towards making Rust development more sustainable for everyone.</p>
<p>Even as we celebrate, it&rsquo;s worth recognizing that in many ways this exciting future is already here. Supporting Rust doesn&rsquo;t require forming a full-time Rust team. The Google <a href="https://fuchsia.dev/">Fuchsia team</a>, for example, has always made a point of not only using Rust but actively contributing to the community. <a href="https://ferrous-systems.com/">Ferrous Systems</a> has a number of folks who work within the Rust teams. In truth, there are a lot of employers who give their employees time to work on Rust &ndash; way too many to list, even if I knew all their names. Then we have companies like <a href="https://twitter.com/repi/status/1294987596146384897">Embark</a> and others that actively fund work on their dependencies (shout-out to <a href="https://crates.io/crates/cargo-fund">cargo-fund</a>, an awesome tool developed by the equally awesome <a href="https://github.com/acfoltzer">acfoltzer</a>, who &ndash; as it happens &ndash; works at Fastly, another company that has been an active supporter of Rust).</p>
<p>This kind of collaboration is exactly what we envisioned when we set up things like the Rust teams and the RFC process. The ultimate goal is to have a &ldquo;rich stew&rdquo; of people with different interests and backgrounds all contributing to Rust, helping to ensure that Rust works well for systems programming everywhere. In order to do that successfully, you need both a structure like the Rust org and an &ldquo;open source whenever&rdquo;<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> setup that accommodates people with different amounts of availability, since the people you&rsquo;re trying to reach are not all available full time. I think we have room for improvement here &ndash; this is what my <a href="https://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/">Adventures in Consensus</a> series is all about &ndash; but ain&rsquo;t that always the truth?</p>
<p>The trick of course is that in order to achieve &ldquo;open source whenever&rdquo;, you need full-time people to help pull it all together. This in many ways has been the limiting factor for Rust thus far, and it is precisely what these new Rust teams &ndash; with <a href="https://github.com/rust-lang/foundation-faq-2020/blob/main/FAQ.md#q-scope">support from the new Rust Foundation as well</a> &ndash; can and will change. We have a lot to look forward to!</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I&rsquo;ll let them make their own announcements.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Hat tip to Jessica Lord, whose post <a href="http://jlord.us/blog/osos-talk.html">&ldquo;Privilege, Community and Open Source&rdquo;</a> is one I still re-read regularly.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Looking back on 2020</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/12/18/looking-back-on-2020/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/12/18/looking-back-on-2020/</id><published>2020-12-18T00:00:00+00:00</published><updated>2020-12-18T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I wanted to write a post that looks back over 2020 from a personal perspective. My goal here is to look at the various initiatives that I&rsquo;ve been involved in and try to get a sense for how they went, what worked and what didn&rsquo;t, and also what that means for next year. This post is a backdrop for a #niko2021 post that I plan to post sometime before 2021 actually starts, talking about what I expect to be doing in 2021.</p>
<p>I want to emphasize the &lsquo;personal&rsquo; bit. <strong>This is not meant as a general retrospective of what has happened in the Rust universe.</strong> I also don&rsquo;t mean to claim credit for all (or most) of the ideas on this list. Some of them are things I was at best tangentially involved in, but which I think are inspiring, and would inform events of next year.</p>
<h3 id="the-backdrop-total-hellscape">The backdrop: total hellscape</h3>
<p>It goes without saying that it was quite a year. It&rsquo;s impossible to ignore the pandemic, the killings of George Floyd, Breonna Taylor, and Ahmaud Arbery, China&rsquo;s actions in Hong Kong, massive financial disruption, what can only be described as an attempt to steal the US election, and all the other things that are going on around us. Many of the <a href="https://blog.rust-lang.org/2020/08/18/laying-the-foundation-for-rusts-future.html">biggest events in Rust</a> were shaped by this global backdrop. If nothing else, it added to a general ambient stress level that made 2020 a very difficult year for me personally. Not to provide free advertising for anyone, but <a href="https://www.youtube.com/watch?v=qmb5ENInqVk">this match.com commercial really did capture it</a>. Here&rsquo;s to a better 2021. 🥂</p>
<h3 id="still-a-lot-of-good-stuff-happened">Still, a lot of good stuff happened</h3>
<p>Despite all of that, I am pretty proud of a number of developments around Rust that I have been involved in. I think we&rsquo;ve done a number of important things, and we have a number of really promising initiatives in flight as well that I think will come to fruition in 2021. I&rsquo;d like to talk about some of those.</p>
<p>Once I started compiling a list I realized there&rsquo;s an awful lot, so here is a kind of TL;DR where you can click for more details:</p>
<ul>
<li>Process and governance
<ul>
<li><a href="#mcp">The Major Change Process helped compiler team spend more time on design</a></li>
<li><a href="#pp">Lang Team Project Proposals show promise, but are a WIP</a></li>
<li><a href="#bb">The Lang Team&rsquo;s Backlog Bonanza was great, and should continue</a></li>
<li><a href="#fc">The Foundation Conversation was an interesting model I think we can apply elsewhere</a></li>
<li><a href="#f">The Foundation is very exciting</a></li>
</ul>
</li>
<li>Technical work
<ul>
<li><a href="#rfc2229">The group working on RFC 2229 (&ldquo;disjoint closure captures&rdquo;) is awesome</a></li>
<li><a href="#mvp">The MVP for const generics is great, and we should do more</a></li>
<li><a href="#sprint">Sprints for Polonius are a great model, we need more sprints</a></li>
<li><a href="#chalk">Chalk and designs for a shared type library</a></li>
<li><a href="#ffi-unwind">Progress on ffi-unwind</a></li>
<li><a href="#never">Progress on never type stabilization</a></li>
<li><a href="#async">Progress on Async Rust</a></li>
</ul>
</li>
</ul>
<p><a name="mcp"></a></p>
<h3 id="the-major-change-process-helped-compiler-team-spend-more-time-on-design">The Major Change Process helped compiler team spend more time on design</h3>
<p>One of the things I am most happy with is the compiler team&rsquo;s <a href="https://forge.rust-lang.org/compiler/mcp.html">Major Change Process</a>. For those not familiar with it, the idea is simple: if you would like to make a Major Change to the compiler (defined loosely as &ldquo;something that would change documentation in the <a href="https://github.com/rust-lang/rustc-dev-guide">rustc-dev-guide</a>&rdquo;), then you first open an issue (called a Major Change Proposal, or MCP) on the compiler-team repository. In that issue, you describe roughly the idea. This also automatically opens a Zulip thread in <a href="https://zulip-archive.rust-lang.org/233931tcompilermajorchanges/index.html">#t-compiler/major changes</a> for discussion. If somebody on the compiler team likes the idea, they &ldquo;second&rdquo; the proposal. This automatically starts off a Final Comment Period of 10 days. At the end of that, the MCP is approved.</p>
<p>The goal of MCPs is two-fold. The first, and most important, goal is to encourage more design discussion. It would sometimes happen that we had large PRs opened with little or no indication of the greater design that they were shooting for, which made them really hard to review. We can now tell the authors of such PRs &ldquo;please write an MCP describing the design you have in mind here&rdquo;. The second goal is to give us a lightweight way to make decisions. It would sometimes happen that PRs would kind of get stuck without a clear &ldquo;decision&rdquo; having been made.</p>
<p>The MCP process is not without its problems. We recently did a <a href="https://rust-lang.github.io/compiler-team/minutes/design-meeting/2020-09-18-mcp-retrospective/">retrospective</a> and while I think the first goal (&ldquo;design feedback&rdquo;) has been a big success, the second goal (&ldquo;clearer decisions&rdquo;) is a mixed bag. We&rsquo;ve definitely had problems where MCPs were approved but people didn&rsquo;t feel their objections had been heard. I think we&rsquo;ll wind up tweaking the process to better account for that.</p>
<p><a name="pp"></a></p>
<h3 id="lang-team-project-proposals-show-promise-but-are-a-wip">Lang Team Project Proposals show promise, but are a WIP</h3>
<p>In the lang team, we have been experimenting with a change to our process that we call <a href="https://blog.rust-lang.org/inside-rust/2020/10/16/Backlog-Bonanza.html">&ldquo;project proposals&rdquo;</a>. The idea is that, before writing an RFC, you can write a more lightweight proposal to take the temperature of the lang team. We will take a look and decide what we think, which might be one of a few things:</p>
<ul>
<li><strong>Suggest implementing:</strong> The idea is good and it is small enough that we think you can just go straight to implementation.</li>
<li><strong>Needs an RFC:</strong> The idea is good but it ought to have an RFC. We&rsquo;ll assign a liaison to work with you towards fleshing it out.</li>
<li><strong>Close:</strong> We don&rsquo;t feel this idea is a good fit right now.</li>
</ul>
<p>I had a lot of goals in mind for project proposals. First, to help us avoid RFC limbo and <a href="http://smallcultfollowing.com/babysteps/blog/2019/07/10/aic-unbounded-queues-and-lang-design/">unbounded queues</a>. I want to get to the point where the only open RFCs on the repository are ones that are generally backed by the lang team, so that the team is able to keep up with the traffic on them and keep the process moving. But I want to do this without cutting off the potential for people to bring up interesting ideas that weren&rsquo;t on the team radar.</p>
<p>Another goal is to <strong>support RFC authors better</strong>. One bit of feedback I&rsquo;ve received numerous times over the years is that people are intimidated by the prospect of authoring RFCs, or consider it too much of a hassle. The idea of assigning a liaison is that they can help with the RFC and give guidance, while also keeping the broader team in the loop.</p>
<p>Finally, I hope that liaisons can serve as part of a clearer <a href="https://blog.rust-lang.org/inside-rust/2020/07/09/lang-team-path-to-membership.html"><strong>path to lang-team membership</strong></a>. The idea is that serving as the liaison for a project can be a way for us to see how people would be as a member of the lang-team and possibly recruit new members.</p>
<p>I would say that the &ldquo;project&rdquo; system has been a mixed success. We&rsquo;ve had a number of successful project groups, but we&rsquo;ve also had some that are slow to start. We&rsquo;ve not done a great job of recruiting fresh liaisons and I think the role could use more definition. Finally, we need to have much clearer messaging, and a more finalized &ldquo;decision&rdquo; around the RFC process &ndash; I&rsquo;m also concerned if the RFC process starts to diverge too much between teams. I think it&rsquo;s quite confusing for people right now to know how they&rsquo;re supposed to &ldquo;pitch&rdquo; an idea (and people are often unclear which team is the best fit for an idea).</p>
<p>Josh and I have been iterating on a more complete &ldquo;staged RFC&rdquo; proposal that aims to address a number of those points (it&rsquo;s a refinement and iteration on the <a href="http://smallcultfollowing.com/babysteps/blog/2018/06/20/proposal-for-a-staged-rfc-process/">older staged RFC idea</a> that I wrote about years ago). This is one of the things I&rsquo;d really like to focus on next year, along with improving and defining the lang team liaison process.</p>
<p><a name="bb"></a></p>
<h3 id="the-lang-teams-backlog-bonanza-was-great-and-should-continue">The Lang Team&rsquo;s Backlog Bonanza was great, and should continue</h3>
<p>This year the lang team did a series of sync meetings that we called the <a href="https://blog.rust-lang.org/inside-rust/2020/10/16/Backlog-Bonanza.html">&ldquo;Backlog Bonanza&rdquo;</a>, where we went through every pending RFC and tried to figure out what to do with it. This was great not only because we were able to give feedback on every open RFC and (mostly) determine what to do with it<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, but also as a &lsquo;team bonding&rsquo; exercise (at least I thought so). It helped us to sharpen our sense of what kinds of things we think are important.</p>
<p>Next year I hope to extend the Backlog Bonanza towards triaging open tracking issues and features. I&rsquo;d like this to fit in with the work towards tracking projects. Ideally we&rsquo;d get to the point where you can very easily tell &ldquo;what are the projects that are likely to be stabilized soon&rdquo;, &ldquo;what are the projects that could use my help&rdquo;, and &ldquo;what are the projects that are stalled out&rdquo; (along with other similar questions).</p>
<p><a name="fc"></a></p>
<h3 id="the-foundation-conversation-was-an-interesting-model-i-think-we-can-apply-elsewhere">The Foundation Conversation was an interesting model I think we can apply elsewhere</h3>
<p>One of the things that&rsquo;s been on my mind this year is that we need to be looking for new ways to get &ldquo;beyond the comment thread&rdquo; when it comes to engaging with Rust users and getting design feedback. Comment threads are flexible and sometimes fantastic but prone to all kinds of problems, particularly on controversial or complex topics. Last year I wrote about <a href="http://smallcultfollowing.com/babysteps/blog/2019/04/22/aic-collaborative-summary-documents/">Collaborative Summary Documents</a> as an alternative to comment threads. This year we tried out the <a href="https://blog.rust-lang.org/2020/12/07/the-foundation-conversation.html">Foundation Conversation</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, and I thought it worked out quite well. I particularly enjoyed the Github Q&amp;A aspect of it.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> It seemed like a good way to take questions and share information.</p>
<p>The way we ran it, for future reference, was as follows:</p>
<ul>
<li>Open a github repo for a period of time to take questions.</li>
<li>We had a zoom call going with the team all present.</li>
<li>When new issues were opened, we would briefly discuss and assign someone to write a response. After some period of time, we&rsquo;d review the response and suggest edits (or someone else might take over). This repeated until consensus was reached.</li>
<li>At the end of the day, we collected the answers into a <a href="https://github.com/rust-lang/foundation-faq-2020/blob/main/FAQ.md">FAQ</a>.</li>
</ul>
<p>I feel like this might be an interesting model to use or adapt for other purposes. It might have been a nice way to take feedback on async-await syntax, for example, or other extremely controversial topics. In these cases there is often a lot of context that the team has acquired but it is difficult to &ldquo;share it&rdquo;.</p>
<p>(One thing I&rsquo;ve always wanted to do is to collect feedback via google forms or e-mails. We would then read and think about the feedback, maybe contact the authors, and produce a new design in response; we would also publish the feedback we got and our thoughts.)</p>
<p><a name="f"></a></p>
<h3 id="the-foundation-is-very-exciting">The Foundation is very exciting</h3>
<p>A large part of my life this year has been spent learning and working towards the creation of a Rust Foundation, and I&rsquo;m very excited that it&rsquo;s finally <a href="https://blog.rust-lang.org/2020/12/14/Next-steps-for-the-foundation-conversation.html">taking shape</a>. I think that the Foundation&rsquo;s mission of <strong>empowering Rust maintainers to joyfully do their best work</strong> is tremendously important, and I think it will provide a venue for us to do things on Rust that would be hard to do otherwise. If you want to learn more about it, check out the Foundation <a href="https://github.com/rust-lang/foundation-faq-2020/blob/main/FAQ.md">FAQ</a> or our <a href="https://twitter.com/rustlang/status/1336807743974481920">live</a> <a href="https://twitter.com/rustlang/status/1337505108599386112">broadcasts</a>.</p>
<p>While I&rsquo;m on the topic, I want to say that I think Mozilla deserves a lot of credit here. It&rsquo;s not every company that would embark on a project like Rust, much less launch it out into an independent foundation. Huzzah!</p>
<p><a name="rfc2229"></a></p>
<h3 id="the-group-working-on-rfc-2229-disjoint-closure-captures-is-awesome">The group working on RFC 2229 (&ldquo;disjoint closure captures&rdquo;) is awesome</h3>
<p><a href="https://rust-lang.github.io/rfcs/2229-capture-disjoint-fields.html">RFC 2229</a> proposed a change to how closure capture works. Consider a closure like <code>|| some_func(&amp;a.b.c)</code>. Today, that closure will capture the entire variable <code>a</code>. Under <a href="https://rust-lang.github.io/rfcs/2229-capture-disjoint-fields.html">RFC 2229</a>, it would capture <code>a.b.c</code>, which can avoid a number of unnecessary borrow checker conflicts.</p>
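<p>To make the difference concrete, here is a minimal sketch of the workaround people typically reach for today. The types <code>Outer</code> and <code>Inner</code> and the helper <code>len_while_counting</code> are hypothetical names of my own, not something from the RFC. Borrowing the field by hand before creating the closure gives roughly the capture that RFC 2229 would have the compiler infer automatically:</p>

```rust
// Hypothetical types for illustration; the names are not from the RFC.
struct Inner {
    c: String,
}

struct Outer {
    b: Inner,
    counter: u32,
}

fn some_func(s: &str) -> usize {
    s.len()
}

// Borrow only the field `a.b.c` by hand, then let the closure capture that
// reference instead of the whole variable `a`.
fn len_while_counting(a: &mut Outer) -> usize {
    let c = &a.b.c; // shared borrow of a single field
    let f = || some_func(c); // the closure captures `c`, not all of `a`
    a.counter += 1; // allowed: `a.counter` is disjoint from `a.b.c`
    f()
}

fn main() {
    let mut a = Outer {
        b: Inner { c: String::from("hello") },
        counter: 0,
    };
    let n = len_while_counting(&mut a);
    println!("len = {}, counter = {}", n, a.counter);
}
```

<p>Writing <code>|| some_func(&amp;a.b.c)</code> directly in place of the manual borrow is the case the RFC targets: with whole-variable capture the closure borrows all of <code>a</code>, and the update to <code>a.counter</code> conflicts with it.</p>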
<p><a href="https://rust-lang.github.io/rfcs/2229-capture-disjoint-fields.html">RFC 2229</a> was approved in 2018 but implementation was stalled while we worked on NLL and other details. Recently though an <a href="https://www.rust-lang.org/governance/teams/compiler#wg-rfc-2229">excellent group of folks</a> decided to take on the implementation work. Over the past year, I&rsquo;ve been working with them on the design and implementation, and we&rsquo;ve been making steady progress. The feature is now at the point where it &ldquo;basically works&rdquo; and we are working on migration (enabling this feature will require a Rust edition, as it would otherwise change the semantics of existing programs). A particular shout out to <a href="https://github.com/arora-aman">arora-aman</a>, who has been the &ldquo;point person&rdquo; for the group, helping to collect questions, relay answers, and generally keep things organized.</p>
<p>Given the great progress we&rsquo;ve been making, I am quite hopeful that we&rsquo;ll see this feature land as part of a 2021 Rust Edition. The only caveat is that doing the implementation work has raised some questions about the best behavior for <code>move</code> closures and the like, so we may need to do a bit more design iteration before we are fully satisfied.</p>
<p><a name="mvp"></a></p>
<h3 id="the-mvp-for-const-generics-is-great-and-we-should-do-more">The MVP for const generics is great, and we should do more</h3>
<p>Const generics has been one of those &lsquo;long awaited&rsquo; features whose fate often felt very uncertain. In July, boats <a href="https://without.boats/blog/shipping-const-generics/">proposed</a> a kind of &ldquo;MVP&rdquo; for const generics &ndash; a simple subset that enables a number of important use cases and sidesteps some of the areas where the implementation work isn&rsquo;t done yet. We now have a <a href="https://github.com/rust-lang/rust/pull/79135">stabilization PR for that subset in FCP</a>, thanks to a lot of tireless work by <a href="https://github.com/lcnr/">lcnr</a>, <a href="https://github.com/varkor">varkor</a>, and others.</p>
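<p>For a flavor of what that subset permits, here is a small sketch, assuming a toolchain on which the MVP has been stabilized. The function <code>sum_all</code> and the type <code>Buffer</code> are hypothetical examples of my own, not code from the stabilization PR. Both are generic over an integer constant:</p>

```rust
// A function generic over the array length `N`, an integer const parameter.
fn sum_all<const N: usize>(xs: [i32; N]) -> i32 {
    let mut total = 0;
    for x in xs.iter() {
        total += *x;
    }
    total
}

// A hypothetical fixed-size buffer whose capacity is part of its type.
struct Buffer<const CAP: usize> {
    data: [u8; CAP],
}

impl<const CAP: usize> Buffer<CAP> {
    fn new() -> Self {
        Buffer { data: [0; CAP] }
    }

    fn capacity(&self) -> usize {
        self.data.len()
    }
}

fn main() {
    // `N` is inferred from the array literal; `CAP` is given explicitly.
    println!("{}", sum_all([1, 2, 3]));
    println!("{}", Buffer::<16>::new().capacity());
}
```

<p>The const parameters here are plain integers used directly as array lengths, which is about as simple as const generics get.</p>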
<p>I&rsquo;m very excited about this for two reasons. First, I think the MVP will be really useful to library authors. Second, I think this &ldquo;MVP&rdquo; strategy is one that we should be deploying more often. For example, oli, matthewjasper and I recently outlined a kind of &ldquo;MVP&rdquo; for &ldquo;named impl trait&rdquo;, though we have yet to describe or fully propose it. =)</p>
<p>This idea of pushing an MVP to conclusion is something we&rsquo;ve done a number of times in Rust in the past, but it&rsquo;s one of those strategies that is easy to forget about when you&rsquo;re in the thick of trying to work through some problem. I&rsquo;m hopeful that in 2021 we can make progress on some of our longer-running initiatives in this way.</p>
<p><a name="sprint"></a></p>
<h3 id="sprints-for-polonius-are-a-great-model-we-need-more-sprints">Sprints for Polonius are a great model, we need more sprints</h3>
<p>Polonius is another project that has been making slow progress, mostly because other things keep taking higher priority. This year we tried a new approach to working on it, which was to schedule a &ldquo;sprint week&rdquo;. The idea was that the entire group would reserve time in their schedules and spend about 4 hours a day over the course of one week to <em>just focus on polonius</em> (some people spent more). For projects like polonius, this kind of concentrated attention is really useful, because there is a lot of context you have to build up in your head in order to make progress.</p>
<p>In a <a href="https://zulip-archive.rust-lang.org/238009tcompilermeetings/99285steeringmeeting20201204PerformanceGoalsfor2020.html">recent compiler team meeting</a>, we discussed the idea of using these &ldquo;sprints&rdquo; more generally. For example, we considered having a bi-monthly compiler team sprint, where we would encourage the team (and new contributors!) to clear space in their schedules to help push progress on a particular goal.</p>
<p>I&rsquo;ve heard from many part-time contributors that this kind of sprint approach can be really useful, as it&rsquo;s easier to get support for a &ldquo;week of concentrated work&rdquo; than for a &ldquo;steady drip&rdquo; of tasks. (In the latter case, it&rsquo;s easy for those tasks to always be pre-empted by higher-priority work items.) It also can create a nice sense of community.</p>
<p><a name="chalk"></a></p>
<h3 id="chalk-and-designs-for-a-shared-type-library">Chalk and designs for a shared type library</h3>
<p>Speaking of community, the Chalk project continues to advance, although with the work on the Foundation I at least have not been able to pay as much attention as I would like. Chalk&rsquo;s integration with rustc has made great progress, and it&rsquo;s still being used by rust-analyzer as the main trait engine. Lately our focus has been the shared type library that I <a href="https://rust-lang.github.io/compiler-team/minutes/design-meeting/2020-03-12-shared-library-for-types/">first proposed in March</a>. A huge shoutout to <a href="https://github.com/jackh726/">jackh726</a>, who has not only been writing a lot of great PRs, but also doing a lot of the organizational work. I expect this to be a continued area of focus in 2021.</p>
<p><a name="ffi-unwind"></a></p>
<h3 id="progress-on-ffi-unwind">Progress on ffi-unwind</h3>
<p>Unwinding across FFI boundaries has been a persistent annoying pain point for years. We generally wanted it to be UB, but there are some use cases that demand it. Plus, understanding unwinding is really complex and involves lots of grungy platform details. This is a perfect recipe for inaction. This year the ffi-unwind project group finally took the time to dive into the options and make a proposal, resulting in <a href="https://github.com/rust-lang/rfcs/pull/2945">RFC 2945</a> (which now has a <a href="https://github.com/rust-lang/rust/pull/76570">pending implementation PR</a>). Hat tip to <a href="https://github.com/Amanieu">Amanieu</a>, <a href="https://github.com/BatmanAoD">BatmanAoD</a>, and <a href="https://github.com/katie-martin-fastly">katie-martin-fastly</a> for their work on this.</p>
<p><a name="never"></a></p>
<h3 id="progress-on-never-type">Progress on never-type</h3>
<p>Stabilizing the never type (<code>!</code>) is another of those long-standing endeavors that keeps getting blocked by one problem or another. Over the last few months I spent some time working with <a href="https://github.com/blitzerr/">blitzerr</a> to create a <a href="https://github.com/rust-lang/rust/issues/66173">lint for tainted fallback</a>. We succeeded in writing the lint, but found it opened up some new issues, which gave rise to a fresh idea for how to approach fallback which I implemented in <a href="https://github.com/rust-lang/rust/pull/79366">#79366</a>. I haven&rsquo;t had time to revisit this since we did a crater run to assess impact, but I&rsquo;m hopeful that we&rsquo;ll be able to finally stabilize the never type in 2021.</p>
<p><a name="async"></a></p>
<h3 id="progress-on-async-rust">Progress on Async Rust</h3>
<p>tmandry has been leading the &ldquo;async foundations working group&rdquo; for some time. The group has been slowly expanding its focus from polish and fixing bugs towards new RFCs and efforts:</p>
<ul>
<li>nellshamrell opened an RFC <a href="https://github.com/rust-lang/rfcs/pull/2996">stabilizing the <code>Stream</code> trait</a>, currently in &ldquo;pre-FCP&rdquo;, and yoshuawuyts opened a <a href="https://github.com/rust-lang/rust/pull/79023">PR with an unstable implementation</a></li>
<li>blgBV and LucioFranco opened an <a href="https://github.com/rust-lang/rfcs/pull/3014">RFC for a &ldquo;must not await&rdquo; lint</a> to help catch values that are live across an await, but should not be</li>
<li>while this is not an &ldquo;async&rdquo;-specific effort, sfackler landed an RFC for <a href="https://github.com/rust-lang/rfcs/pull/2930">reading into uninitialized buffers</a>, which potentially unlocks progress on <code>AsyncRead</code>, as <a href="https://smallcultfollowing.com/babysteps/blog/2020/01/20/async-interview-5-steven-fackler/">he and I discussed in our async interview</a></li>
<li>continued smaller stabilizations of useful bits of functionality, like <a href="https://github.com/rust-lang/rust/pull/74328"><code>core::future::ready</code></a></li>
</ul>
<p>In general, I thought the <a href="https://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/">Async Interviews</a> were a good experience, and I&rsquo;d like to do more things like that as a way to dig into technical questions. (I actually have one interview that I never got around to publishing &ndash; oops. I should do that!)</p>
<h3 id="conclusion-and-some-personal-thoughts">Conclusion and some personal thoughts</h3>
<p>Well, the end of 2020 is coming up quick. We did it. I want to wish all of you a happy end of the year, and encourage everyone to relax and take it easy on yourselves. Despite all odds, I think it&rsquo;s been a pretty good year for Rust. People who know me know that I have a hard time feeling &ldquo;satisfied&rdquo;<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. I don&rsquo;t like to count chickens, and I tend to think things will go wrong<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>. Well, as of this year, even <strong>I</strong> can plainly see that &ldquo;Rust has made it&rdquo;. Every day I am learning about new uses for Rust. This isn&rsquo;t to say we&rsquo;re done, there&rsquo;s still plenty to do, but I think we can really take pride in having achieved what initially seemed impossible: launching a new systems programming language into widespread use.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In some cases, we still need to complete the follow-up work, I think, of actually closing and commenting on those RFCs.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Hat tip to <a href="https://twitter.com/ag_dubs">Ashley Williams</a> for proposing this communication plan.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Well, that and the <a href="https://twitter.com/nikomatsakis/status/1337715789852532736">crude digital editing</a>.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Working on it.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>The major exception is when I am preparing my To Do list. In that case, I seem to think that nothing unexpected ever happens and there are 72 hours in the day.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Rotating the compiler team leads</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/12/11/rotating-the-compiler-team-leads/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/12/11/rotating-the-compiler-team-leads/</id><published>2020-12-11T00:00:00+00:00</published><updated>2020-12-11T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Since we created the Rust teams, I have been serving as lead of two teams: the <a href="https://www.rust-lang.org/governance/teams/compiler">compiler team</a> and the <a href="https://www.rust-lang.org/governance/teams/lang">language design team</a> (I&rsquo;ve also been a member of the <a href="https://www.rust-lang.org/governance/teams/core">core team</a>, which has no lead). For those less familiar with Rust&rsquo;s governance, the compiler team is focused on the maintenance and implementation of the compiler itself (and, more recently, the standard library). The language design team is focused on the design aspects. Over that time, all the Rust teams have grown and evolved, with the compiler team in particular being home to a number of really strong members.</p>
<p>Last October, <a href="https://blog.rust-lang.org/inside-rust/2019/10/24/pnkfelix-compiler-team-co-lead.html">I announced that pnkfelix was joining me as compiler team co-lead</a>. Today, I am stepping back from my role as compiler team co-lead altogether. After taking nominations from the compiler team, pnkfelix and I are proud to announce that <strong><a href="https://github.com/wesleywiser">wesleywiser</a> will replace me as compiler team co-lead</strong>. If you don&rsquo;t know Wesley, there&rsquo;ll be an announcement on <a href="https://blog.rust-lang.org/inside-rust/">Inside Rust</a> where you can learn a bit more about what he has done, but let me just say I am pleased as punch that he agreed to serve as co-lead. He&rsquo;s going to do a great job.</p>
<h3 id="youre-not-getting-rid-of-me-this-easily">You&rsquo;re not getting rid of me this easily</h3>
<p>Stepping back as compiler team co-lead does not mean I plan to step away from the compiler. In fact, quite the opposite. I&rsquo;m still quite enthusiastic about pushing forward on ongoing implementation efforts like the work to implement <a href="https://github.com/rust-lang/rust/issues/53488">RFC 2229</a>, or the development of <a href="https://github.com/rust-lang/chalk">chalk</a> and <a href="https://github.com/rust-lang/polonius">polonius</a>. Indeed, I am hopeful that stepping back as co-lead will create more time for these efforts, as well as time to focus on leadership of the language design team.</p>
<h3 id="rotation-is-key">Rotation is key</h3>
<p>I see these changes to compiler team co-leads as fitting into a larger trend, one that I believe is going to be increasingly important in Rust: <strong>rotation of leadership</strong>. To me, the &ldquo;corest of the core&rdquo; value of the Rust project is the importance of <a href="https://github.com/rust-lang/foundation-faq-2020/blob/main/FAQ.md#q-sharing-experience">&ldquo;learning from others&rdquo;</a> &ndash; or as I put it in <a href="https://nikomatsakis.github.io/rust-latam-2019/#94">my rust-latam talk from 2019</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, &ldquo;a commitment to a CoC and a culture that emphasizes curiosity and deep research&rdquo;. <strong>Part of learning from others has to be actively seeking out fresh leadership and promoting them into positions of authority.</strong></p>
<h3 id="but-rotation-has-a-cost-too">But rotation has a cost too</h3>
<p>Another core value of Rust is <a href="https://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/">recognizing the inevitability of tradeoffs</a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. Rotating leadership is no exception: there is a lot of value in having the same people lead for a long time, as they accumulate all kinds of context and skills. But it also means that you are missing out on the fresh energy and ideas that other people can bring to the problem. I feel confident that Felix and Wesley will help to shape the compiler team in ways that I never would&rsquo;ve thought to do.</p>
<h3 id="rotation-with-intention">Rotation with intention</h3>
<p>The tradeoff between experience and enthusiasm makes it all the more important, in my opinion, to rotate leadership intentionally. I am reminded of <a href="http://edunham.net/2018/05/15/team.html">Emily Dunham&rsquo;s classic post on leaving a team</a><sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, and how it was aimed at normalizing the idea of &ldquo;retirement&rdquo; from a team as something you could actively choose to do, rather than just waiting until you are too burned out to continue.</p>
<p>Wesley, Felix, and I have discussed the idea of &ldquo;staggered terms&rdquo; as co-leads. The idea is that you serve as co-lead for two years, but we select one new co-lead per year, with the oldest co-lead stepping back. This way, at every point you have a mix of a new co-lead and someone who has already done it for one year and has some experience.</p>
<h3 id="lang-and-compiler-need-separate-leadership">Lang and compiler need separate leadership</h3>
<p>Beyond rotation, another reason I would like to step back from being co-lead of the compiler team is that I don&rsquo;t really think it makes sense to have one person lead two teams. It&rsquo;s too much work to do both jobs well, for one thing, but I also think it works to the detriment of the teams. I think the compiler and lang team will work better if they each have their own, separate &ldquo;advocates&rdquo;.</p>
<p>I&rsquo;m actually very curious to work with pnkfelix and Wesley to talk about how the teams ought to coordinate, since I&rsquo;ve always felt we could do a better job. I would like us to be actively coordinating how we are going to manage the implementation work at the same time as we do the design, to help avoid <a href="https://smallcultfollowing.com/babysteps/blog/2019/07/10/aic-unbounded-queues-and-lang-design/">unbounded queues</a>. I would also like us to be doing a better job getting feedback from the implementation and experimentation stage into the lang team.</p>
<p>You might think having me be the lead of both teams would enable coordination, but I think it can have the opposite effect. Having separate leads for compiler and lang means that those leads must actively communicate and avoids the problem of one person just holding things in their head without realizing other people don&rsquo;t share that context.</p>
<h3 id="idea-deliberate-team-structures-that-enable-rotation">Idea: Deliberate team structures that enable rotation</h3>
<p>In terms of the compiler team structure, I think there is room for us to introduce &ldquo;rotation&rdquo; as a concept in other ways as well. Recently, I&rsquo;ve been <a href="https://zulip-archive.rust-lang.org/185694tcompilerwgmeta/79956compilerteamofficers.html">kicking around an idea for &ldquo;compiler team officers&rdquo;</a><sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>, which would introduce a number of defined roles, each of which is set up with staggered terms to allow for structured handoff. I don&rsquo;t think the current proposal is quite right, but I think it&rsquo;s going in an intriguing direction.</p>
<p>This proposal is trying to address the fact that a successful open source organization needs <a href="https://smallcultfollowing.com/babysteps/blog/2019/04/15/more-than-coders/">more than coders</a>, but all too often we fail to recognize and honor that work. Having fixed terms is important because when someone <em>is</em> willing to do that work, they can easily wind up getting stuck being the only one doing it, and they do that until they burn out. The proposal also aims to enable more &ldquo;part-time&rdquo; leadership within the compiler team, by making &ldquo;finer grained&rdquo; duties that don&rsquo;t require as much time to complete.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Oh-so-subtle plug: I really quite liked that talk.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Though not always the tradeoffs you expect. <a href="https://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/">Read the post.</a>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>If you haven&rsquo;t read it, stop reading now and <a href="http://edunham.net/2018/05/15/team.html">go do so</a>. Then come back. Or don&rsquo;t. <a href="http://edunham.net/2018/05/15/team.html">Just read it already.</a>&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I am not sure that &lsquo;officer&rsquo; is the right word here, but I&rsquo;m not sure what the best replacement is. I want something that conveys respect and responsibility.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Async Interview #8: Stjepan Glavina</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/07/09/async-interview-8-stjepan-glavina/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/07/09/async-interview-8-stjepan-glavina/</id><published>2020-07-09T00:00:00+00:00</published><updated>2020-07-09T00:00:00+00:00</updated><content type="html">&lt;p>(removed)&lt;/p>
</content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Async interviews: my take thus far</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/04/30/async-interviews-my-take-thus-far/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/04/30/async-interviews-my-take-thus-far/</id><published>2020-04-30T00:00:00+00:00</published><updated>2020-04-30T00:00:00+00:00</updated><content type="html"><![CDATA[<p>The point of the <a href="https://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/">async interview</a> series, in the end, was to help
figure out what we should be doing next when it comes to Async I/O. I
thought it would be good then to step back and, rather than
interviewing someone else, give my opinion on some of the immediate
next steps, and a bit about the medium to longer term. I&rsquo;m also going
to talk a bit about what I see as some of the practical challenges.</p>
<h3 id="focus-for-the-immediate-term-interoperability-and-polish">Focus for the immediate term: interoperability and polish</h3>
<p>At the highest level, I think we should be focusing on two things in
the &ldquo;short to medium&rdquo; term: <strong>enabling interoperability</strong> and
<strong>polish</strong>.</p>
<p>By <strong>interoperability</strong>, I mean the ability to write libraries and
frameworks that can be used with many different executors/runtimes.
Adding the <code>Future</code> trait was a big step in this direction, but
there&rsquo;s plenty more to go.</p>
<p>My dream is that eventually people are able to write portable async
apps, frameworks, and libraries that can be moved easily between async
executors. We won&rsquo;t get there right away, but we can get closer.</p>
<p>By <strong>polish</strong>, I mean &ldquo;small things that go a long way to improving
quality of life for users&rdquo;. These are the kinds of things that are
easy to overlook, because no individual item is a big milestone.</p>
<h3 id="polish-in-the-compiler-diagnostics-lints-smarter-analyses">Polish in the compiler: diagnostics, lints, smarter analyses</h3>
<p>Most of the focus of <a href="https://github.com/rust-lang/wg-async-foundations">wg-async-foundations</a> recently has been on
polish work on the compiler, and we&rsquo;ve made quite a lot of
progress. Diagnostics have <a href="https://github.com/rust-lang/rust/pull/64895">notably</a> <a href="https://github.com/rust-lang/rust/pull/65345">improved</a>, and we&rsquo;ve been
working on <a href="https://github.com/rust-lang/rust/pull/70906">inserting</a> <a href="https://github.com/rust-lang/rust/pull/68212">helpful</a> <a href="https://github.com/rust-lang/rust/pull/71174">suggestions</a>, <a href="https://github.com/rust-lang/rust/pull/68884">fixing compiler
bugs</a>, and <a href="https://github.com/rust-lang/rust/pull/69837">improving efficiency</a>. One thing I&rsquo;m especially excited
about is that we <a href="https://github.com/rust-lang/rust/pull/69033">no longer rely on thread-local storage in the <code>async fn</code> transformation</a>, which means that async-await is now
compatible with <code>#[no_std]</code> environments and hence embedded
development.</p>
<p>I want to give a 👏 &ldquo;shout-out&rdquo; 👏 to 👏 <a href="https://github.com/tmandry">tmandry</a> 👏 for leading this
polish effort, and to point out that if you&rsquo;re interested in
contributing to the compiler, this is a great place to start! Here are
some <a href="https://github.com/rust-lang/wg-async-foundations#getting-involved">tips for how to get involved</a>.</p>
<p>I think it&rsquo;s also a good idea to be looking a bit more broadly.  On
Zulip, for example, <a href="https://zulip-archive.rust-lang.org/187312wgasyncfoundations/81944meeting20200428.html#195598667">LucioFranco suggested</a> that we could add a
lint to warn about things that should not be live across yields (e.g.,
lock guards), and I think that&rsquo;s a great idea (there is a <a href="https://github.com/rust-lang/rust-clippy/issues/4226">clippy
lint</a> already, though it&rsquo;s specific to <code>MutexGuard</code>; maybe this should
just be promoted to the compiler and generalized).</p>
<p>Another, more challenging area is improving the precision of the
async-await transformation and analysis. Right now, for example, the
compiler &ldquo;overapproximates&rdquo; what values are live across a yield, which
sometimes yields spurious errors about whether a future needs to be
<code>Send</code> or not. Fixing this is, um, &ldquo;non-trivial&rdquo;, but it would be a
major quality of life improvement.</p>
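<p>To make the <code>Send</code> issue concrete, here is a minimal sketch of the usual workaround: confine the non-<code>Send</code> value to an inner block that ends before the <code>.await</code>. The names <code>require_send</code>, <code>step</code>, and <code>sum_then_step</code> are made up for illustration, and whether the unscoped variant is rejected depends on how precise the analysis is in the compiler version you are using.</p>

```rust
use std::future::Future;
use std::rc::Rc;

/// Compiles only if `fut` is `Send`; the check happens at compile time.
fn require_send<F: Future + Send>(fut: F) -> F {
    fut
}

/// Hypothetical async step that completes immediately.
async fn step() -> u32 {
    1
}

/// `Rc` is not `Send`, so it is confined to an inner block that ends
/// before the `.await`. No `Rc` is then stored in the future while it
/// is suspended, and the future as a whole can be `Send`.
async fn sum_then_step() -> u32 {
    let total = {
        let shared = Rc::new(41);
        *shared
    };
    total + step().await
}

/// Fails to compile if the analysis decides an `Rc` is live across the yield.
fn make_send_future() -> impl Future<Output = u32> + Send {
    require_send(sum_then_step())
}
```

<p>With a fully precise analysis, the inner block would not be needed whenever the value is provably dead before the yield.</p>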
<h3 id="polish-in-the-standard-library-adding-utilities">Polish in the standard library: adding utilities</h3>
<p>When it comes to polish, I think we can extend that focus beyond the
compiler, to the standard library and the language. I&rsquo;d like to see
the stdlib include building blocks like async-aware mutexes and
channels, for example, as well as smaller utilities like
<a href="http://smallcultfollowing.com/babysteps/blog/2020/03/10/async-interview-7-withoutboats/#block_on-in-the-std-library"><code>task::block_on</code></a>. YoshuaWuyts recently proposed adding some simple
constructors, like <a href="https://github.com/rust-lang/rust/pull/70834"><code>future::{pending, ready}</code></a> which I think could
fit in this category. A key constraint here is that these should be
libraries and APIs that are portable across all executors and
runtimes.</p>
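<p>As a rough illustration of how small such a utility can be, here is a sketch of a <code>block_on</code> written against nothing but the standard library. It is not a proposed stdlib implementation: the <code>ThreadWaker</code> type is an assumption of this example, and a real version would need more care around efficiency and nested calls.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Polls `fut` on the current thread, parking between polls until the
/// waker fires. It depends on no particular executor or runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut: Pin<Box<F>> = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Spurious wake-ups are harmless: we simply poll again.
            Poll::Pending => thread::park(),
        }
    }
}
```

<p>Calling <code>block_on(std::future::ready(7))</code> returns <code>7</code> without any runtime being installed, which is the kind of portability the constraint above asks for.</p>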
<h3 id="polish-in-the-language-async-main-async-drop">Polish in the language: async main, async drop</h3>
<p>Polish extends to the language, as well. The idea here is to find
small, contained changes that fix specific pain points or limitations.
Adding <code>async fn main</code>, as <a href="http://smallcultfollowing.com/babysteps/blog/2020/03/10/async-interview-7-withoutboats/#async-fn-main">boats proposed</a>, might be such an example
(and I rather like the idea of <code>#[test]</code> that <a href="https://users.rust-lang.org/t/async-interviews/35167/17?u=nikomatsakis">XAMPRocky proposed on
internals</a>).</p>
<p>Another change I think makes sense is to support <a href="http://smallcultfollowing.com/babysteps/blog/2020/03/10/async-interview-7-withoutboats/#next-step-async-destructors">async destructors</a>,
and I would go further and find some solution to the concerns
about RAII and async that <a href="http://smallcultfollowing.com/babysteps/blog/2020/02/11/async-interview-6-eliza-weisman/#raii-and-async-fn-doesnt-always-play-well">Eliza Weisman raised</a>. In particular, I
think we need some kind of (optional) callback for values that reside
on a stack frame that is being suspended.</p>
<h3 id="supporting-interoperability-the-stream-trait">Supporting interoperability: the stream trait</h3>
<p>Let me talk a bit about what we can do to support interoperability.
The first step, I think, is to do <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/23/async-interview-3-carl-lerche/#what-should-we-do-next-stabilize-stream">as Carl Lerche proposed</a> and add
the <a href="https://docs.rs/futures/0.3.4/futures/stream/trait.Stream.html"><code>Stream</code></a> trait into the standard library. Ideally, it would be
added in <em>exactly</em> the form that it takes in futures 0.3.4, so that we
can release a (minor) version of futures that simply re-exports the
stream trait from the stdlib.</p>
<p>Adding stream enables interoperability in the same way that adding
<code>Future</code> did: one can now define libraries that produce streams, or
which operate on streams, in a completely neutral fashion.</p>
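<p>Here is a sketch of what &ldquo;neutral&rdquo; means in practice. The trait below is a local stand-in with the same shape as the <code>Stream</code> trait in futures 0.3; <code>Counter</code> and <code>drain_ready</code> are hypothetical names used only for this example.</p>

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Local stand-in with the same shape as the futures 0.3 `Stream` trait.
trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Producer side: a stream that is always ready and yields `next..limit`.
struct Counter {
    next: u32,
    limit: u32,
}

impl Stream for Counter {
    type Item = u32;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.next < self.limit {
            let item = self.next;
            self.next += 1;
            Poll::Ready(Some(item))
        } else {
            Poll::Ready(None)
        }
    }
}

/// Consumer side: collects items until the stream ends or returns
/// `Pending`. It is written against the trait only, not an executor.
fn drain_ready<S: Stream + Unpin>(mut stream: S) -> Vec<S::Item> {
    struct NoopWaker;
    impl Wake for NoopWaker {
        fn wake(self: Arc<Self>) {}
    }
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut items = Vec::new();
    while let Poll::Ready(Some(item)) = Pin::new(&mut stream).poll_next(&mut cx) {
        items.push(item);
    }
    items
}
```

<p>Neither half knows about the other, or about any runtime; the trait is the only thing they share.</p>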
<h3 id="but-what-about-attached-streams">But what about &ldquo;attached streams&rdquo;?</h3>
<p>I said that I did not think adding <code>Stream</code> to the standard library
would be controversial. This does not mean there aren&rsquo;t any concerns.
cramertj, in particular, <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/">raised a concern</a> about the desire for
&ldquo;attached streams&rdquo; (or &ldquo;streaming streams&rdquo;), as they are sometimes
called.</p>
<p>To review, today&rsquo;s <code>Stream</code> trait is basically the exact async analog
of <code>Iterator</code>. It has a <a href="https://docs.rs/futures/0.3.4/futures/stream/trait.Stream.html#tymethod.poll_next"><code>poll_next</code></a> method that tries to fetch the
next item. If the item is ready, then the caller of <code>poll_next</code> gets
ownership of the item that was produced. This means in particular that
the item cannot be a reference into the stream itself. The same is
true of iterators today: iterators cannot yield references into
themselves (though they <em>can</em> yield references into the collection
that one is iterating over). This is both useful (it means that
generic callers can discard the iterator but keep the items that were
produced) and a limitation (it means that iterators/streams cannot
reuse some internal buffer between iterations).</p>
<h3 id="we-should-not-block-progress-on-streams-on-gats">We should not block progress on streams on GATs</h3>
<p>I hear the concern about attached streams, but I don&rsquo;t think it should
block us from moving forward. There are a few reasons for this. The
first is pragmatic: fully resolving the design details around attached
streams will require not only <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/#the-natural-way-to-write-attached-streams-is-with-gats">GATs</a>, but experience with GATs. This
is going to take time and I don&rsquo;t think we should wait. Just as
iterators are used everywhere in their current form, there are plenty
of streaming applications for which the current stream trait is a
good fit.</p>
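<p>For readers who have not seen the &ldquo;attached&rdquo; pattern, here is a sketch of the synchronous version, assuming a compiler that supports generic associated types. <code>AttachedIterator</code> and <code>UpperChunks</code> are hypothetical names, not a proposed design; the point is only that each item borrows from a buffer the iterator reuses.</p>

```rust
/// Hypothetical "attached" iterator: each item may borrow from the
/// iterator itself, which requires a generic associated type.
trait AttachedIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

/// Yields upper-cased chunks of `input`, reusing one internal buffer.
struct UpperChunks {
    input: Vec<u8>,
    pos: usize,
    chunk: usize,
    buf: Vec<u8>,
}

impl UpperChunks {
    fn new(input: Vec<u8>, chunk: usize) -> Self {
        assert!(chunk > 0, "chunk size must be non-zero");
        UpperChunks { input, pos: 0, chunk, buf: Vec::with_capacity(chunk) }
    }
}

impl AttachedIterator for UpperChunks {
    type Item<'a> = &'a [u8]
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        if self.pos >= self.input.len() {
            return None;
        }
        let end = (self.pos + self.chunk).min(self.input.len());
        // Overwrite the shared buffer instead of allocating per item.
        self.buf.clear();
        self.buf
            .extend(self.input[self.pos..end].iter().map(u8::to_ascii_uppercase));
        self.pos = end;
        Some(self.buf.as_slice())
    }
}
```

<p>The cost is the one described above: a caller cannot hold on to one item while asking for the next, because the item borrows the iterator mutably.</p>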
<h3 id="symmetry-between-sync-and-async-is-a-valuable-principle">Symmetry between sync and async is a valuable principle</h3>
<p>There is another reason I don&rsquo;t think we should block progress on
attached streams. I think there is a lot of value to having symmetric
sync/async versions of things in the standard library. I think boats
had it right when they said that the <a href="http://smallcultfollowing.com/babysteps/blog/2020/03/10/async-interview-7-withoutboats/#vision-for-async">guiding vision</a> for Async I/O in
Rust should be that one can take sync code and make it async by adding
in <code>async</code> and <code>await</code> as necessary.</p>
<p>This isn&rsquo;t to say that everything between sync and async must be the
same. There will likely be things that only make sense in one setting
or another.  But I think that in cases where we see <em>orthogonal
problems</em> &ndash; problems that are not really related to being synchronous
or asynchronous &ndash; we should try to solve them in a uniform way.</p>
<p>In this case, the problem of &ldquo;attached&rdquo; vs &ldquo;detached&rdquo; is orthogonal
from being async or sync. We want attached iterators just as much as
we want attached streams &ndash; and we are making progress on the
foundational features that will enable us to have them.</p>
<p>Once we have those features, we can design variants of <code>Iterator</code> and
<code>Stream</code> that support attached iterators/streams. Perhaps these
variants will deprecate the existing traits, or perhaps they will live
alongside them (or maybe we can even find a way to extend the existing
traits in place). I don&rsquo;t know, but we&rsquo;ll figure it out, and we&rsquo;ll do
it for both sync and async applications, well, synchronously<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>.</p>
<h3 id="supporting-interoperability-adding-async-read-and-write-traits">Supporting interoperability: adding async read and write traits</h3>
<p>I also think we should add <a href="https://docs.rs/futures/0.3.4/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> and <a href="https://docs.rs/futures/0.3.4/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a> to the
standard library, also in roughly the form they have today in
futures. In short, stable, interoperable traits for reading and writing enable
a whole lot of libraries and middleware. After all, the main reason
people are using async is to do I/O.</p>
<p>In contrast to <a href="https://docs.rs/futures/0.3.4/futures/stream/trait.Stream.html"><code>Stream</code></a>, I do expect this to be controversial, for a
few reasons. But much like <a href="https://docs.rs/futures/0.3.4/futures/stream/trait.Stream.html"><code>Stream</code></a>, I still think it&rsquo;s the right
thing to do, and actually for much the same reasons.</p>
<h3 id="first-concern-about-async-read-uninitialized-memory">First concern about async read: uninitialized memory</h3>
<p>I know of two major concerns about adding <code>AsyncRead</code> and
<code>AsyncWrite</code>.  The first is around <strong>uninitialized memory</strong>. Just like
its synchronous counterpart <a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a>, the <a href="https://docs.rs/futures/0.3.4/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> trait must be
given a buffer where the data will be written. And, just like
<a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a>, the trait currently requires that this buffer must be zeroed
or otherwise initialized.</p>
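<p>The synchronous side of this is easy to show with the existing <code>std::io::Read</code> trait. In this small sketch (<code>read_prefix</code> is a made-up helper), the buffer has to be zeroed before the read even though the reader is about to overwrite it:</p>

```rust
use std::io::{self, Read};

/// Reads up to `max` bytes from `reader` with a single `read` call.
fn read_prefix<R: Read>(mut reader: R, max: usize) -> io::Result<Vec<u8>> {
    // `Read::read` takes `&mut [u8]`, so every byte must be initialized
    // first. Zeroing is the wasted work that the concern is about.
    let mut buf = vec![0u8; max];
    let filled = reader.read(&mut buf)?;
    buf.truncate(filled);
    Ok(buf)
}
```

<p>For a small buffer the zeroing is negligible, but for large buffers that are read into repeatedly it is pure overhead.</p>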
<p>You will probably recognize that this is another case of an
&ldquo;orthogonal problem&rdquo;. Both the synchronous and asynchronous traits
have the same issue, and I think the best approach is to try and solve
it in an analogous way. Fortunately, <a href="http://smallcultfollowing.com/babysteps/blog/2020/01/20/async-interview-5-steven-fackler/">sfackler has done just
that</a>. The idea that we discussed in our async interview is
slowly making its way into RFC form.</p>
<p>So, in short, I think uninitialized memory is a &ldquo;solved problem&rdquo;, and
moreover I think it was solved in the right way. Happy days.</p>
<h3 id="second-concern-about-async-read-io_uring">Second concern about async read: io_uring</h3>
<p>This is a relatively new thing, but a new concern about <code>AsyncRead</code>
and <code>AsyncWrite</code> is that, fundamentally, they were designed around
<a href="https://en.wikipedia.org/wiki/Epoll"><code>epoll</code></a>-like interfaces. In these interfaces, you get a callback
when data is ready and then you can go and write that data into a
buffer. But Linux 5.1 added a new interface, called <code>io_uring</code>, and
it works differently. I won&rsquo;t go into the details here, but boats
gives a <a href="https://boats.gitlab.io/blog/post/iou/">good intro</a> in their blog post introducing the <a href="https://github.com/withoutboats/iou"><code>iou</code></a>
library.</p>
<p>My take here is somewhat similar to my take on why we should not block
streams on GATs: <code>io_uring</code> is super promising, but it&rsquo;s also super
new. We have very little experience trying to build futures atop
<code>io_uring</code>. I think it&rsquo;s great that people are experimenting, and I
think that we should encourage and spread those experiments. After
some time, I expect that &ldquo;best practices&rdquo; will start to emerge, and at
that time, we should try to codify those best practices into traits
that we can add to the standard library.</p>
<p>In the meantime, though, epoll is not going anywhere. There will
always be systems based on epoll that we will want to support, and we
know exactly how to do that, because we&rsquo;ve spent years tinkering with
and experimenting with the <a href="https://docs.rs/futures/0.3.4/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> and <a href="https://docs.rs/futures/0.3.4/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a> traits. It&rsquo;s time
to standardize them and to allow people to build I/O libraries based
on them. Once we know how best to handle <code>io_uring</code>, we&rsquo;ll integrate
that too.</p>
<p>All of that said, I would really like to learn more about <code>io_uring</code>
and what it might mean, since I&rsquo;ve not dug that deeply here. Maybe a
good topic for a future async interview!</p>
<h3 id="looking-further-out">Looking further out</h3>
<p>Looking further out, I think there are some bigger goals that we
should be thinking about. The largest is probably adding some form of
<strong>generator syntax</strong>. Anecdotally, I definitely hear about a fair
number of folks working with streams and encountering difficulties
doing so. As <a href="http://smallcultfollowing.com/babysteps/blog/2020/03/10/async-interview-7-withoutboats/#supporting-generators-iterators-and-async-generators-streams">boats said</a>, writing <code>Stream</code> implementations is a
common reason that people have to interact directly with <code>Pin</code>, and
that&rsquo;s something we want to minimize. Further, in a synchronous
setting, generator syntax would also give us syntactic support for
writing iterators, which would benefit Rust overall. <strong>Enabling
support for async functions in traits</strong> would also be high on my list,
along with <strong>async closures</strong>. (The latter in particular would enable
us to bring in a lot more utility methods and combinators for futures
and streams, which would be great.)</p>
<p>I think though that it&rsquo;s worth waiting a bit before we pursue these, for
several reasons.</p>
<ul>
<li>Generator syntax would build on a <code>Stream</code> trait anyhow, so having
that in the standard library is an obvious first step.</li>
<li>There is ongoing work on GATs and chalk integration in the context
of wg-traits, and we&rsquo;re making quite rapid progress there. The above
items all potentially interact with GATs in some way, and it&rsquo;d be
nice if we had more of an implementation available before we started
in on them (though it may not be a hard requirement).</li>
<li>Quite frankly, we don&rsquo;t have the bandwidth. We need to work on
building up an effective wg-async-foundations group before we can
take on these sorts of projects. More on this point later.</li>
</ul>
<h3 id="related-and-supporting-efforts">Related and supporting efforts</h3>
<p>There are a few pending features in the language team that I think may be pretty
useful for async applications. I won&rsquo;t go into detail here, but briefly:</p>
<ul>
<li><code>impl Trait</code> everywhere &ndash; finishing up the <code>impl Trait</code> saga will
enable us to encode some cases where async fn in traits might be
nice, such as Tower&rsquo;s <a href="https://docs.rs/tower/0.3.0/tower/trait.Service.html"><code>Service</code></a> trait;</li>
<li>GATs, obviously &ndash; GATs arise around a number of advanced features.</li>
<li>procedural macros &ndash; we&rsquo;ve been making slow and steady progress on
stabilizing bits and pieces of the procedural macro story, and I
think it&rsquo;s a crucial enabler for async-related applications (and
many others). Things like <code>#[runtime::main]</code> and the <code>async-trait</code>
crate are only possible because of the procedural macro
support. Both Carl and Eliza brought up the importance of offering
procedural macros in expression position without requiring things
like <code>proc_macro_hack</code>.</li>
</ul>
<p>I&rsquo;ll write more about these points in other posts, though.</p>
<h3 id="summing-up-the-list">Summing up: the list</h3>
<p>To summarize, here is my list of what I think we should be doing in
&ldquo;async land&rdquo; as our next steps:</p>
<ul>
<li>Continued polish and improvements to the core compiler implementation.</li>
<li>Lints for common &ldquo;gotchas&rdquo;, like <code>#[must_use]</code> to help identify &ldquo;not yield safe&rdquo; types.</li>
<li>Extend the stdlib with mutexes, channels, <a href="http://smallcultfollowing.com/babysteps/blog/2020/03/10/async-interview-7-withoutboats/#block_on-in-the-std-library"><code>task::block_on</code></a>, and other small utilities.</li>
<li>Extend the <code>Drop</code> trait with &ldquo;lifecycle&rdquo; methods (&ldquo;async drop&rdquo;).</li>
<li>Add <code>Stream</code>, <code>AsyncRead</code>, and <code>AsyncWrite</code> traits to the standard library.</li>
</ul>
<p>To be clear, this is a <strong>proposal</strong>, and I am very much interested in
feedback on it, and I wouldn&rsquo;t be surprised to add or remove a thing or
two. However, it&rsquo;s not an arbitrary proposal: It&rsquo;s a proposal that
I&rsquo;ve given a fair amount of thought to, and I feel reasonably certain
about it.</p>
<p>There are a few things I&rsquo;d be particularly interested to <a href="https://users.rust-lang.org/t/async-interviews/35167/">get feedback</a> on:</p>
<ul>
<li>If you maintain a library, what are some of the challenges you&rsquo;ve
encountered in making it operate generically across executors? What
could help there?</li>
<li>Do you have ideas for useful bits of polish? Are there small changes or stdlib
additions that would make everyday life that much easier?</li>
</ul>
<h3 id="a-challenge-growing-an-effective-working-group">A challenge: growing an effective working group</h3>
<p>I want to close with a few comments on organization. One of the things
we&rsquo;ve been trying to figure out is how best to organize ourselves and
create a sustainable working group.</p>
<p>Thus far, <a href="https://github.com/tmandry">tmandry</a> has been doing a great job at organizing the
polish work that has been our focus, and I think we&rsquo;ve been making
good progress there, although there&rsquo;s always a need for more folks to
help out. (Shameless plug: <a href="https://github.com/rust-lang/wg-async-foundations#getting-involved">Here are some tips for how to get
involved</a>!)</p>
<p><strong>If we want to go beyond polish and get back to adding things to the
standard library, especially things like the <code>Stream</code> or <code>AsyncRead</code>
trait, we&rsquo;re going to have to up our game.</strong> The same is true for some
of the more diverse tasks that fall under our umbrella, such as
maintaining the <a href="https://rust-lang.github.io/async-book/index.html">async book</a>.</p>
<p>To do those tasks, we&rsquo;re going to need <a href="http://smallcultfollowing.com/babysteps/blog/2019/04/15/more-than-coders/">more than coders</a>. We need to
take the time to draft designs, incorporate feedback, write the RFCs,
and push things through to stabilization.</p>
<p>To be honest, I&rsquo;m not entirely sure where that work is going to come
from &ndash; but I believe we can do it! If this is something you&rsquo;re
interested in, definitely drop in the <code>#wg-async-foundations</code> stream
on Zulip and say hello, and monitor the <a href="https://blog.rust-lang.org/inside-rust/">Inside Rust</a> blog, as I expect
we&rsquo;ll be posting updates there from time to time.</p>
<h3 id="comments">Comments?</h3>
<p>As always, please leave comments in the <a href="https://users.rust-lang.org/t/async-interviews/35167/">async interviews thread</a>
on <code>users.rust-lang.org</code>.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I couldn&rsquo;t resist.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/async" term="async" label="Async"/></entry><entry><title type="html">Library-ification and analyzing Rust</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/04/09/libraryification/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/04/09/libraryification/</id><published>2020-04-09T00:00:00+00:00</published><updated>2020-04-09T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve noticed that the ideas that I post on my blog are getting much
more &ldquo;well rounded&rdquo;. That is a problem. It means I&rsquo;m waiting too long
to write about things. So I want to post about something that&rsquo;s a bit
more half-baked &ndash; it&rsquo;s an idea that I&rsquo;ve been kicking around to
create a kind of informal &ldquo;analysis API&rdquo; for rustc.</p>
<h3 id="the-problem-statement">The problem statement</h3>
<p>I am interested in finding better ways to support advanced analyses
that &ldquo;layer on&rdquo; to rustc. I am thinking of projects like <a href="https://www.pm.inf.ethz.ch/research/prusti.html">Prusti</a> or
Facebook&rsquo;s <a href="https://github.com/facebookexperimental/MIRAI">MIRAI</a>, or even the venerable <a href="https://github.com/rust-lang/rust-clippy">Clippy</a>. All of these
projects are attempts to layer on additional analyses atop Rust&rsquo;s
existing type system that prove useful properties about your code.
<a href="https://www.pm.inf.ethz.ch/research/prusti.html">Prusti</a>, for example, lets you add pre- and post-conditions to your
functions, and it will prove that they hold.</p>
<h3 id="in-theory-rust-is-a-great-fit-for-analysis">In theory, Rust is a great fit for analysis</h3>
<p>There has been a trend lately of trying to adapt existing tools built
initially for other languages to analyze Rust. <a href="https://www.pm.inf.ethz.ch/research/prusti.html">Prusti</a>, for example,
is adapting an existing project called <a href="https://www.pm.inf.ethz.ch/research/viper.html">Viper</a>, which was built to
analyze languages like C# or Java. However, actually analyzing
programs written in C# or Java <em>in practice</em> is often quite difficult,
precisely because of the kinds of pervasive, mutable aliasing that
those languages encourage.</p>
<p>Pervasive aliasing means that if you see code like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="n">a</span><span class="p">.</span><span class="na">setCount</span><span class="p">(</span><span class="n">0</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>it can be quite difficult to be sure whether that call might also
modify the state of some variable <code>b</code> that happens to be floating
around. If you are trying to enforce contracts like &ldquo;in order to call
this method, the count must be greater than zero&rdquo;, then it&rsquo;s important
to know which variables are affected by calls like <code>setCount</code>.</p>
<p>Rust&rsquo;s ownership/borrowing system can be really helpful here. The
borrow checker rules ensure that it&rsquo;s fairly easy to see what data a
given Rust function might read or mutate. This is of course the key to
how Rust is able to steer you away from data races and segmentation
faults &ndash; but the key insight here is that those same properties can
also be used to make higher-level correctness guarantees. Even better,
many of the more complex analyses that analysis tools might need &ndash;
e.g., alias analysis &ndash; map fairly well onto what the Rust compiler
already does.</p>
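<p>Here is the Rust counterpart of the <code>setCount</code> example, as a minimal sketch with made-up names (<code>Counter</code>, <code>reset_and_read</code>). The signature alone tells an analysis tool which value can change:</p>

```rust
struct Counter {
    count: u32,
}

impl Counter {
    /// Needs `&mut self`, so the caller must hold exclusive access.
    fn set_count(&mut self, count: u32) {
        self.count = count;
    }

    fn count(&self) -> u32 {
        self.count
    }
}

/// `a` is borrowed mutably and `b` immutably, so the two cannot refer
/// to the same `Counter`. Resetting `a` therefore cannot affect `b`.
fn reset_and_read(a: &mut Counter, b: &Counter) -> u32 {
    let before = b.count();
    a.set_count(0);
    debug_assert_eq!(b.count(), before);
    before
}
```

<p>A verifier checking a contract on <code>b</code> does not have to consider the call to <code>set_count</code> at all, which is exactly the reasoning that is hard to justify in the Java version.</p>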
<h3 id="in-practice-analyzing-rust-is-a-pain-but-not-because-of-the-language">In practice, analyzing Rust is a pain, but not because of the language</h3>
<p>Unfortunately, while Rust ought to be a great fit for analysis tools,
it&rsquo;s a horrible pain to try and implement such a tool <strong>in practice</strong>.
The problem is that there is lots of information that is needed to do
this sort of analysis, and that information is not readily accessible.
I&rsquo;m thinking of information like the types of expressions or the kind
of aliasing information that the borrow check gathers. <a href="https://www.pm.inf.ethz.ch/research/prusti.html">Prusti</a>, for
example, has to resort to reading the debug output from the borrow
checker and trying to reconstitute what is going on.</p>
<p>Ideally, I think what we would want is some way for analyzer tools to
leverage the compiler itself. They ought to be able to use the
compiler to do the parsing of Rust code, to run the borrow check, and
to construct MIR. They should then be able to access the <a href="https://rustc-dev-guide.rust-lang.org/mir/index.html">MIR</a> and the
accompanying borrow check results and use that to construct their own
internal IRs (in practice, virtually all such verifiers would prefer
to start from an abstraction level like MIR, and not from a raw Rust
AST). They should be able to ask the compiler for information about
the layout of data structures in memory and other things they might
need, too, or for information about the type signature of other
methods.</p>
<h3 id="enter-on-demand-analysis-and-library-ification">Enter: on-demand analysis and library-ification</h3>
<p>A few years back, the idea of enabling analysis tools to interact with
the compiler and request this sort of detailed information would have
seemed like a fantasy. But the architectural work that we&rsquo;ve been
doing lately is actually quite a good fit for this use case.</p>
<p>I&rsquo;m referring to two different trends:</p>
<ul>
<li>on-demand analysis</li>
<li>library-ification</li>
</ul>
<h3 id="the-first-trend-on-demand-analysis">The first trend: On-demand analysis</h3>
<p>On-demand analysis is basically the idea that we should structure the
compiler&rsquo;s internal core into a series of &ldquo;queries&rdquo;. Each query is a
pure function from some inputs to an output, and it might be something
like &ldquo;parse this file&rdquo; (yielding an AST) or &ldquo;type-check this function&rdquo;
(yielding a set of errors). The key idea is that each query can in
turn invoke other queries, and thus execution begins from the <em>end
state</em> that we want to reach (&ldquo;give me an executable&rdquo;) and works its
way <em>backwards</em> to the first few steps (&ldquo;parse this file&rdquo;). This winds
up <a href="https://blog.rust-lang.org/2016/09/08/incremental.html">fitting quite nicely with incremental computation</a> as
well as parallel execution. (If you&rsquo;d like to learn more about this, I
gave a talk at <a href="https://pliss.org/">PLISS</a> that is <a href="https://www.youtube.com/watch?v=N6b44kMS6OM">available on YouTube</a>.)</p>
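<p>To give a flavor of the query structure, here is a toy sketch. It is not how rustc or salsa are implemented: <code>Database</code>, <code>parse</code>, and <code>check</code> are hypothetical, &ldquo;parsing&rdquo; is just splitting on whitespace, and the cache is a plain <code>HashMap</code>.</p>

```rust
use std::collections::HashMap;

/// Toy query database: inputs plus a memo table for one query.
#[derive(Default)]
struct Database {
    sources: HashMap<String, String>,
    parse_cache: HashMap<String, Vec<String>>,
    /// Counts how often `parse` actually ran, to make caching visible.
    parse_runs: usize,
}

impl Database {
    /// Changing an input invalidates the results derived from it.
    fn set_source(&mut self, file: &str, text: &str) {
        self.sources.insert(file.to_string(), text.to_string());
        self.parse_cache.remove(file);
    }

    /// Query "parse this file": yields tokens, memoized per file.
    fn parse(&mut self, file: &str) -> Vec<String> {
        if let Some(tokens) = self.parse_cache.get(file) {
            return tokens.clone();
        }
        self.parse_runs += 1;
        let tokens: Vec<String> = self
            .sources
            .get(file)
            .map(|text| text.split_whitespace().map(String::from).collect())
            .unwrap_or_default();
        self.parse_cache.insert(file.to_string(), tokens.clone());
        tokens
    }

    /// Query "check this file": yields errors. It starts from the result
    /// we want and pulls in `parse` on demand.
    fn check(&mut self, file: &str) -> Vec<String> {
        self.parse(file)
            .into_iter()
            .filter(|token| token.starts_with('?'))
            .map(|token| format!("unknown name {token}"))
            .collect()
    }
}
```

<p>Calling <code>check</code> twice on an unchanged file runs <code>parse</code> only once, which is the property that makes this structure a natural fit for incremental computation.</p>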
<p>On-demand analysis is also a great fit for IDEs, since it allows us to
do &ldquo;just as much work as we have to&rdquo; in order to figure out key bits
of information (e.g., &ldquo;what is the type of the expression at the
cursor&rdquo;). The rust-analyzer project is based entirely on on-demand
computation, using the <a href="https://github.com/salsa-rs/salsa">salsa</a> library.</p>
<h3 id="on-demand-analysis-is-a-good-fit-for-analysis-tools">On-demand analysis is a good fit for analysis tools</h3>
<p>On-demand analysis is not only a good fit for IDEs: it&rsquo;d be a great
fit for tools like <a href="https://www.pm.inf.ethz.ch/research/prusti.html">Prusti</a>. If we had a reasonably stable API, tools
like <a href="https://www.pm.inf.ethz.ch/research/prusti.html">Prusti</a> could use on-demand analysis to ask for just the results
they need. For example, if they are analyzing a particular function,
they might ask for the borrow check results. In fact, if we did it
right, they could also leverage the same incremental compilation
caches that the compiler is using, which would mean that they don&rsquo;t
even have to re-parse or recompute results that are already available
from a previous build (or, conversely, upcoming builds can re-use
results that <a href="https://www.pm.inf.ethz.ch/research/prusti.html">Prusti</a> computed when doing its analysis).</p>
<h3 id="the-second-trend-library-ification">The second trend: Library-ification</h3>
<p>There is a second trend in the compiler, one that&rsquo;s only just begun,
but one that I hope will transform the way rustc development feels by
the time it&rsquo;s done. We call it &ldquo;library-ification&rdquo;. The basic idea is
to refactor the compiler into a set of <em>independent libraries</em>, all
knit together by the query system.</p>
<p>One of the immediate drivers for library-ification is the desire to
integrate rust-analyzer and rustc into one coherent codebase. Right
now, the rust-analyzer IDE is basically a re-implementation of the
front-end of the Rust compiler. It has its own parser, its own name
resolver, and its own type-checker.</p>
<h3 id="the-vision-shared-components">The vision: shared components</h3>
<p>So we saw that, presently, rust-analyzer is effectively a
re-implementation of many parts of the Rust compiler. But it&rsquo;s
also interesting to look at what rust-analyzer does <strong>not</strong> have &ndash;
its own trait system. rust-analyzer uses the chalk library to handle
its trait system. And, of course, work is <a href="https://blog.rust-lang.org/inside-rust/2020/03/28/traits-sprint-1.html">also underway</a> to integrate
chalk into rustc.</p>
<p>At the moment, chalk is a promising but incomplete project. But if it
works as well as I hope, it points to a promising possibility. We can
have the &ldquo;trait solver&rdquo; as a coherent block of functionality that is
shared by multiple projects. And we could go further, so that we wind
up with rustc and rust-analyzer being just two &ldquo;small shims&rdquo; over top
the same core packages that make up the compiler. One shim would
export those packages in a &ldquo;batch compilation&rdquo; format suitable for use
by cargo, and one as a LSP server suitable for use by IDEs.</p>
<h3 id="the-vision-clean-apis-defined-in-terms-of-rust-concepts">The vision: Clean APIs defined in terms of Rust concepts</h3>
<p>Chalk is interesting for another reason, too. The API that Chalk
offers is based around core concepts and should, I think, be fairly
stable. For example, it communicates with the compiler via a trait,
the <a href="http://rust-lang.github.io/chalk/chalk_solve/trait.RustIrDatabase.html"><code>RustIrDatabase</code></a>, that allows it to query for specific bits of
information about the Rust source (e.g., <a href="http://rust-lang.github.io/chalk/chalk_solve/trait.RustIrDatabase.html#tymethod.impl_datum">&ldquo;tell me about this
impl&rdquo;</a>), and doesn&rsquo;t require a full AST or lots of
specifics from its host. One of the benefits of this is that we can
have a relatively simple testing harness that lets us write <a href="https://github.com/rust-lang/chalk/blob/73a74be3bc1d0cdef3f76fa529a112a0d8367ddb/tests/test/impls.rs#L9-L22">chalk
unit tests</a> in a simplified form of Rust syntax.</p>
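<p>The general shape of such an interface can be sketched in a few lines. To be clear, this is not chalk&rsquo;s actual API: <code>HostDatabase</code>, <code>holds</code>, and <code>TestHost</code> are invented for this example, and the &ldquo;solver&rdquo; only checks impls and supertraits (which it assumes are acyclic).</p>

```rust
/// Hypothetical host interface: the library asks narrow questions
/// instead of requiring an AST from its host.
trait HostDatabase {
    fn has_impl(&self, ty: &str, trait_name: &str) -> bool;
    fn supertraits(&self, trait_name: &str) -> Vec<String>;
}

/// Toy solver: `ty: trait_name` holds if there is an impl and every
/// supertrait holds as well. Assumes supertraits are acyclic.
fn holds(db: &dyn HostDatabase, ty: &str, trait_name: &str) -> bool {
    db.has_impl(ty, trait_name)
        && db.supertraits(trait_name).iter().all(|sup| holds(db, ty, sup))
}

/// In-memory host for tests; a compiler or IDE would answer the same
/// questions from its own data structures.
struct TestHost {
    impls: Vec<(String, String)>,       // (type, trait)
    supertraits: Vec<(String, String)>, // (trait, supertrait)
}

fn owned(pairs: &[(&str, &str)]) -> Vec<(String, String)> {
    pairs.iter().map(|(a, b)| (a.to_string(), b.to_string())).collect()
}

impl TestHost {
    fn new(impls: &[(&str, &str)], supertraits: &[(&str, &str)]) -> Self {
        TestHost { impls: owned(impls), supertraits: owned(supertraits) }
    }
}

impl HostDatabase for TestHost {
    fn has_impl(&self, ty: &str, trait_name: &str) -> bool {
        self.impls
            .iter()
            .any(|(t, tr)| t.as_str() == ty && tr.as_str() == trait_name)
    }

    fn supertraits(&self, trait_name: &str) -> Vec<String> {
        self.supertraits
            .iter()
            .filter(|(t, _)| t.as_str() == trait_name)
            .map(|(_, sup)| sup.clone())
            .collect()
    }
}
```

<p>Because the solver sees only the trait, a test can describe a tiny &ldquo;program&rdquo; as data and never touch the rest of a compiler.</p>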
<p>The fact that chalk&rsquo;s unit tests are &ldquo;mini Rust programs&rdquo; is nice
because they&rsquo;re readable, but it&rsquo;s important for a deeper reason,
too. I&rsquo;ve many times experienced problems when using unit tests where
the tests wind up tied very tightly to the structure of the code, and
hence big swaths of tests get invalidated when doing refactoring, and
it&rsquo;s often quite hard to port them to the new interface. We don&rsquo;t
generally have to worry about this with rustc, since its tests are
just example programs &ndash; and the same is true for Chalk, by and large.
My sense is that one of the ways that we will know where good library
boundaries lie will be our ability to write unit tests in a clear way.</p>
<h3 id="library-ification-can-help-make-rustc-more-accessible">Library-ification can help make rustc more accessible</h3>
<p>Right now, many folks have told me that the rustc code base can be
quite intimidating. There&rsquo;s a lot of code. It takes a while to build
and requires some custom setup to get things going (not to mention
gobs of RAM). Although, like any large code-base, it is factored into
several relatively independent modules, it&rsquo;s not always obvious where
the boundaries between those modules are, so it&rsquo;s hard to learn it a
piece at a time.</p>
<p>But imagine instead that rustc was composed of a relatively small
number of well-defined libraries, with clear and well-documented APIs
that separated them. Those libraries might be in separate repositories
and they might not, but regardless you could jump into a single
library and start working. It would have a clear API that connects it
to the rest of the compiler, and a testing harness that lets you run
unit tests that exercise that API (along of course with our existing
suite of example programs, which serve as integration tests).</p>
<p>The benefits of course aren&rsquo;t limited to new contributors. I really
enjoy hacking on chalk because it&rsquo;s a relatively narrow and pliable
code base.  It&rsquo;s easy to jump from place to place and find what I&rsquo;m
looking for. In contrast, working on rustc feels much more difficult,
even though I know the codebase quite well.</p>
<h3 id="library-ification-will-work-best-if-apis-arent-changing">Library-ification will work best if APIs aren&rsquo;t changing</h3>
<p>One thing I want to emphasize: I think that this whole scheme will
work best if we can find interfaces between components that are not
changing all the time. Frequently changing interfaces would indicate
that the modules within the compiler are coupled in ways we&rsquo;d prefer
to avoid, and it will make it harder for people to work within one
library without having to learn the details of the others.</p>
<h3 id="libaries-could-be-used-by-analysis-tools-as-well">Libraries could be used by analysis tools as well</h3>
<p>Now we come to the final step. If we imagine that we are able to
subdivide rustc into coherent libraries, and that those libraries have
relatively clean, stable APIs between them, then it is also plausible
that we can start publishing those libraries on crates.io (or perhaps
wrappers around them, with simplified and more limited APIs). This
then starts to look sort of like the <a href="https://github.com/dotnet/roslyn">.NET Roslyn compiler</a> &ndash; we are
exporting the tools to help people analyze and understand Rust code
for themselves. So, for example, <a href="https://www.pm.inf.ethz.ch/research/prusti.html">Prusti</a> could invoke rustc&rsquo;s borrow
checker and read its results directly, without having to resort to
elaborate hacks.</p>
<h3 id="on-stability-and-semver">On stability and semver</h3>
<p>I&rsquo;ve tossed out the term &ldquo;stable&rdquo; a few times throughout this post, so
it&rsquo;s worth putting in a few words for how I think stability would work
if we went down this direction. <strong>I absolutely do not think we would
want to commit to some kind of fixed, unchanging API for rustc or
libraries used by rustc.</strong> In fact, in the early days, I imagine we&rsquo;d
just publish a new major version of each library with each Rust
release, which would imply that you&rsquo;d have to do frequent updates.</p>
<p>But once the APIs settle down &ndash; and, as I wrote, I really hope that
they do &ndash; I think we would simply want to have meaningful semver,
like any other library.  In other words, we should always feel free to
make breaking changes to our APIs, but we should announce when we do
so, and I hope that we don&rsquo;t have to do so frequently.</p>
<p>If this all <strong>really</strong> works out, I imagine we&rsquo;d start to think about
scheduling breaking changes in APIs, or finding alternatives that let
us keep tooling working. I think that&rsquo;d be a fine price to pay in
exchange for having a host of powerful tooling available, but in any
case it&rsquo;s quite far away.</p>
<h3 id="conclusion">Conclusion</h3>
<p>This post sketches out my vision for Rust compiler development in
the long term. I&rsquo;d like to see a rustc based on a relatively small
number of well-defined components that encapsulate major chunks of
functionality, like &ldquo;the trait system&rdquo;, &ldquo;the borrow checker&rdquo;, or &ldquo;the
parser&rdquo;. In the short term, these components should allow us to share
code between rustc and rust-analyzer, and to make rustc more
understandable. In the longer term, these components could even enable
us to support a broad ecosystem of compiler tools and analyses.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Async Interview #7: Withoutboats</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/03/10/async-interview-7-withoutboats/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/03/10/async-interview-7-withoutboats/</id><published>2020-03-10T00:00:00+00:00</published><updated>2020-03-10T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello everyone! I&rsquo;m happy to be posting a transcript of my <a href="http://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/">async
interview</a> with withoutboats. This particular interview took place
way back on January 14th, but the intervening months have been a bit
crazy and I didn&rsquo;t get around to writing it up till now.</p>
<h3 id="video">Video</h3>
<p>You can watch the <a href="https://youtu.be/a-kZhPMqXRs">video</a> on YouTube. I&rsquo;ve also embedded a copy here
for your convenience:</p>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/a-kZhPMqXRs" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>
<h2 id="next-steps-for-async">Next steps for async</h2>
<p>Before I go into boats&rsquo; interview, I want to talk a bit about the
state of async-await in Rust and what I see as the obvious next steps.
I may still do a few more async interviews after this &ndash; there are
tons of interesting folks I never got to speak to! &ndash; but I think it&rsquo;s
also past time to try and come to a consensus on the &ldquo;async roadmap&rdquo;
for the rest of the year (and maybe some of 2021, too). The good news
is that I feel like the async interviews highlighted a number of
relatively clear next steps. Sometime after this post, I hope to post
a blog post laying out a &ldquo;rough draft&rdquo; of what such a roadmap might
look like.</p>
<h2 id="history">History</h2>
<p>withoutboats is a member of the Rust lang team. Starting around the
beginning of 2018, they started looking into async-await for
Rust. Everybody knew that we wanted to have some way to write a
function that could suspend (<code>await</code>) as needed. But we were stuck on
a rather fundamental problem which boats explained in the blog post
<a href="https://boats.gitlab.io/blog/post/2018-01-25-async-i-self-referential-structs/">&ldquo;self-referential structs&rdquo;</a>. This blog post was the first in a
series of posts that ultimately documented the design that became the
<a href="https://doc.rust-lang.org/std/pin/struct.Pin.html"><code>Pin</code></a> type, which describes a pointer to a value that can never be
moved to another location in memory. <code>Pin</code> became the foundation for
async functions in Rust. (If you&rsquo;ve not read the blog post series,
it&rsquo;s highly recommended.) If you&rsquo;d like to learn more about pin, boats
posted a <a href="https://www.youtube.com/watch?v=shtfSMTwKRw">recorded stream on YouTube</a> that explores its design in
detail.</p>
<h2 id="vision-for-async">Vision for async</h2>
<p>All along, boats has been motivated by a relatively clear vision: we
should make async Rust &ldquo;just as nice to use&rdquo; as Rust with blocking
I/O. In short, you should be able to write code much like you always
did, but making functions which perform I/O into <code>async</code> and
then adding <code>await</code> here or there as needed.</p>
<p>Since 2018, we&rsquo;ve made great progress towards the goal of &ldquo;async I/O
that is as easy as sync&rdquo; &ndash; most notably by landing and <a href="https://blog.rust-lang.org/2019/11/07/Async-await-stable.html">stabilizing
the async-await MVP</a> &ndash; but we&rsquo;re not there yet. There remain a
number of practical obstacles that make writing code using async I/O
more difficult than sync I/O. So the mission for the next few years is
to identify those obstacles and dismantle them, one by one.</p>
<h2 id="next-step-async-destructors">Next step: async destructors</h2>
<p>One of the first obstacles that boats mentioned was extending Rust&rsquo;s
<a href="https://doc.rust-lang.org/std/ops/trait.Drop.html"><code>Drop</code></a> trait to work better for async code. The <a href="https://doc.rust-lang.org/std/ops/trait.Drop.html"><code>Drop</code></a> trait, for
those who don&rsquo;t know Rust, is a special trait in Rust that types can
implement in order to declare a destructor (code which should run when
a value goes out of scope). boats wrote a <a href="https://boats.gitlab.io/blog/post/poll-drop/">blog
post</a> that discusses the
problem in more detail and proposes a solution. Since that blog post,
they&rsquo;ve refined the proposal in response to some feedback, though the
overall shape remains the same. The basic idea is to extend the <code>Drop</code>
trait with an optional <code>poll_drop_ready</code> method:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Drop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">poll_drop_ready</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">ctx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Poll</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Poll</span>::<span class="n">Ready</span><span class="p">(())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When executing an async fn, and a value goes out of scope, we will
first invoke <code>poll_drop_ready</code>, and &ldquo;await&rdquo; if it returns anything
other than <code>Poll::Ready</code>. This gives the value a chance to do async
operations that may block, in preparation for the final drop.  Once
<code>Poll::Ready</code> is returned, the ordinary <code>drop</code> method is invoked.</p>
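<p>To make that sequence concrete, here is a minimal sketch of the &ldquo;poll until ready, then drop&rdquo; order described above. Everything in it is an assumption made for illustration: <code>PollDropReady</code> is a stand-in trait (the proposal would put the method on <code>Drop</code> itself), <code>drop_in_async_context</code> plays the role of compiler-generated code, and the busy loop replaces the real suspension an executor would perform.</p>

```rust
use std::cell::RefCell;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the proposed `poll_drop_ready` method on `Drop`.
trait PollDropReady {
    fn poll_drop_ready(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Ready(())
    }
}

/// A writer that must flush its pending bytes before it is dropped.
struct BufferedWriter {
    pending: Vec<u8>,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl PollDropReady for BufferedWriter {
    fn poll_drop_ready(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();
        if this.pending.pop().is_some() {
            // Pretend flushing one byte is an async operation that is not done yet.
            this.log.borrow_mut().push("flush");
            cx.waker().wake_by_ref();
            Poll::Pending
        } else {
            Poll::Ready(())
        }
    }
}

impl Drop for BufferedWriter {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop");
    }
}

/// What the compiler might do when `value` goes out of scope in an async fn.
/// Assumption: a real implementation would suspend the task instead of looping.
fn drop_in_async_context<T: PollDropReady + Unpin>(mut value: T, cx: &mut Context<'_>) {
    while Pin::new(&mut value).poll_drop_ready(cx).is_pending() {}
    drop(value);
}

/// A waker that does nothing; enough for this single-threaded demo.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn run_demo() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let writer = BufferedWriter { pending: vec![1, 2], log: Rc::clone(&log) };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    drop_in_async_context(writer, &mut cx);
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", run_demo());
}
```

<p>In this sketch the two pending bytes produce two <code>flush</code> events, and only then does the ordinary <code>drop</code> run.</p>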
<p>This async-drop trait came up in early async interviews, and I raised
<a href="http://smallcultfollowing.com/babysteps/blog/2020/02/11/async-interview-6-eliza-weisman/#raii-and-async-fn-doesnt-always-play-well">Eliza&rsquo;s use case</a> with boats. Specifically, she wanted some way to
offer values that are live on the stack a callback when a yield occurs
and when the function is resumed, so that they can (e.g.) interact
with thread-local state correctly in an async context. While distinct
from async destructors, the issues are related because destructors are
often used to manage thread-local values in a scoped fashion.</p>
<p>Adding async drop requires not only modifying the compiler but also
modifying futures combinators to properly handle the new
<code>poll_drop_ready</code> method (combinators need to propagate this
<code>poll_drop_ready</code> to the sub-futures they contain).</p>
<p>Note that we wouldn&rsquo;t offer any &lsquo;guarantee&rsquo; that <code>poll_drop_ready</code>
will run. For example, it would not run if a future is dropped without
being resumed, because then there is no &ldquo;async context&rdquo; that can
handle the awaits. However, like <code>Drop</code>, it would ultimately be
something that types can &ldquo;usually&rdquo; expect to execute under ordinary
circumstances.</p>
<p>Some of the use cases for async-drop include writers that buffer data
and wish to ensure that the data is flushed out when the writer is
dropped, transactional APIs, or anything that might do I/O when
dropped.</p>
<h2 id="block_on-in-the-std-library"><code>block_on</code> in the std library</h2>
<p>One very small addition that boats proposed is adding <code>block_on</code> to
the standard library. Invoking <code>block_on(future)</code> would block the
current thread until <code>future</code> has been fully executed (and then return
the resulting value). This is actually something that most async I/O
code would <em>never</em> want to do &ndash; if you want to get the value from a
future, after all, you should do <code>future.await</code>. So why is <code>block_on</code> useful?</p>
<p>Well, <code>block_on</code> is basically the most minimal executor. It allows you
to take async code and run it in a synchronous context with minimal
fuss. It&rsquo;s really convenient in examples and documentation. I would
personally like it to permit writing stand-alone test cases. Those
reasons alone are probably good enough justification to add it, but
boats has another use in mind as well.</p>
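<p>For a sense of how small such an executor can be, here is a sketch of a <code>block_on</code> built only from what is in std today. It is not the signature or implementation boats proposed (the interview didn&rsquo;t get into that); <code>ThreadWaker</code>, <code>YieldOnce</code> and <code>add</code> are names made up for the example.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread which is blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Run `future` to completion on the current thread and return its output.
fn block_on<F: Future>(future: F) -> F::Output {
    // Pin the future on the heap so it can be polled repeatedly.
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Sleep until some waker calls `unpark` on this thread.
            Poll::Pending => thread::park(),
        }
    }
}

/// Example future that returns `Pending` once before completing.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn add(a: u32, b: u32) -> u32 {
    YieldOnce(false).await;
    a + b
}

fn main() {
    let sum = block_on(add(20, 2));
    println!("{}", sum);
}
```

<p>This is handy in exactly the places mentioned above: a doc example or a test can call <code>block_on(add(20, 2))</code> without pulling in a full runtime.</p>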
<h2 id="async-fn-main">async fn main</h2>
<p>Every Rust program ultimately begins with a <code>main</code> somewhere. Because
<code>main</code> is invoked by the surrounding C library to start the program,
it also tends to be a place where a certain amount of &ldquo;boilerplate
code&rdquo; can accumulate in order to &ldquo;set up&rdquo; the environment for the rest
of the program. This &ldquo;boilerplate setup&rdquo; can be particularly annoying
when you&rsquo;re just getting started with Rust, as the <code>main</code> function is
often the first one you write, and it winds up working differently
than the others. A similar problem affects smaller code examples.</p>
<p>In Rust 2018, we extended <code>main</code> so that it supports <code>Result</code> return
values. This meant that you could now write <code>main</code> functions that use
the <code>?</code> operator, without having to add some kind of intermediate
wrapper:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">std</span>::<span class="n">io</span>::<span class="n">Error</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">file</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span>::<span class="n">fs</span>::<span class="n">File</span>::<span class="n">create</span><span class="p">(</span><span class="s">&#34;output.txt&#34;</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Ok</span><span class="p">(())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Unfortunately, async code today suffers from a similar papercut.  If
you&rsquo;re writing an async project, most of your code is going to be
async in nature: but the <code>main</code> function is always synchronous, which
means you need to bridge the two somehow. Sometimes, especially for
larger projects, this isn&rsquo;t that big a deal, as you likely need to do
some setup or configuration anyway. But for smaller examples, it&rsquo;s
quite a pain.</p>
<p>So boats would like to allow people to write an &ldquo;async&rdquo; main. This
would then permit you to directly &ldquo;await&rdquo; futures from within the
<code>main</code> function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">load_data</span><span class="p">(</span><span class="mi">22</span><span class="p">).</span><span class="k">await</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">load_data</span><span class="p">(</span><span class="n">port</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Data</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Of course, this raises the question: since the program will ultimately
run synchronously, how do we bridge from the <code>async fn main</code> to a
synchronous main? This is where <code>block_on</code> comes in: at least to
start, we can simply declare that the future generated by <code>async fn main</code> will be executed using <code>block_on</code>, which means it will block the
main thread until <code>main</code> completes. For simple
programs and examples, this will be exactly what you want.</p>
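<p>Roughly, the bridging could look like the following hand-written version. This is only a sketch of the idea: the name <code>async_main</code> is invented, <code>load_data</code> is given a trivial body, it returns a value only so the result can be printed, and the naive re-polling <code>block_on</code> here merely stands in for whatever std would provide.</p>

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; acceptable only because `block_on` below re-polls in a loop.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Deliberately naive stand-in for a std `block_on` (assumption: the real one would sleep).
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}

/// What the user would have written as the body of `async fn main`.
async fn async_main() -> usize {
    load_data(22).await
}

/// Placeholder for real async work.
async fn load_data(port: usize) -> usize {
    port * 2
}

/// The synchronous entry point that would be generated on the user's behalf.
fn main() {
    let result = block_on(async_main());
    println!("{}", result);
}
```

<p>The synchronous <code>main</code> does nothing except hand the future to <code>block_on</code>, which is why a std-provided <code>block_on</code> is the natural default here.</p>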
<p>But most real programs will ultimately want to start some other
executor to get more features. In fact, <a href="https://github.com/rustasync/runtime#attributes">following the lead of the
runtime crate</a>, many executors already offer a procedural macro
that lets you write an async main. So, for example, <a href="https://tokio.rs/">tokio</a> and
<a href="https://async.rs/">async-std</a> offer attributes called <a href="https://book.async.rs/tutorial/accept_loop.html"><code>#[tokio::main]</code></a> and
<a href="https://docs.rs/async-std/1.5.0/async_std/#examples"><code>#[async_std::main]</code></a> respectively, which means that if you have an
<code>async fn main</code> program you can pick an executor just by adding the
appropriate attribute:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[tokio::main]</span><span class="w"> </span><span class="c1">// or #[async_std::main], etc
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I imagine that other executors offer a similar procedural macro &ndash; or
if they don&rsquo;t yet, they could add one. =)</p>
<p>(In fact, since async-std&rsquo;s runtime starts implicitly in a background
thread when you start using it, you could use async-std libraries
without any additional setup as well.)</p>
<p>Overall, this seems pretty nice to me. Basically, when you write
<code>async fn main</code>, you get Rust&rsquo;s &ldquo;default executor&rdquo;, which presently is
a <em>very</em> bare-bones executor suitable only for simple examples. To
switch to a more full-featured executor, you simply add a
<code>#[foo::main]</code> attribute and you&rsquo;re off to the races!</p>
<p>(Side note #1: This isn&rsquo;t something that boats and I talked about, but
I wonder about adding a more general attribute, like
<code>#[async_runtime(foo)]</code> that just desugars to a call like
<code>foo::main_wrapper(...)</code>, which is expected to do whatever setup is
appropriate for the crate <code>foo</code>.)</p>
<p>(Side note #2: This <em>also</em> isn&rsquo;t something that boats and I talked
about, but I imagine that having a &ldquo;native&rdquo; concept of <code>async fn main</code>
might help for some platforms where there is already a native
executor. I&rsquo;m thinking of things like <a href="https://gstreamer.freedesktop.org/">GStreamer</a> or perhaps iOS with
Grand Central Dispatch. In short, I imagine there are environments
where the notion of a &ldquo;main function&rdquo; isn&rsquo;t really a great fit anyhow,
although it&rsquo;s possible I have no idea what I&rsquo;m talking about.)</p>
<h2 id="async-await-in-an-embedded-context">async-await in an embedded context</h2>
<p>One thing we&rsquo;ve not talked about very much in the interviews so far is
using async-await in an embedded context. When we shipped the
async-await MVP, we definitely cut a few corners, and one of those had
to do with the use of thread-local storage (TLS). Currently, when you
use <code>async fn</code>, the desugaring winds up using a private TLS variable
to carry the <a href="https://doc.rust-lang.org/std/task/struct.Context.html"><code>Context</code></a> about the current async task down through the
stack. This isn&rsquo;t necessary, it was just a quick and convenient hack
that sidestepped some questions about how to pass in arguments when
resuming a suspended function. For most programs, TLS works just fine,
but some embedded environments don&rsquo;t support it. Therefore, it makes
sense to fix this bug and permit <code>async fn</code> to pass around its state
without the use of TLS. (In fact, since boats and I talked,
<a href="https://github.com/rust-lang/rust/pull/69033">jonas-schievink</a> opened PR <a href="https://github.com/rust-lang/rust/pull/69033">#69033</a> which does exactly this, though
it&rsquo;s not yet landed.)</p>
<h2 id="async-fn-are-implemented-using-a-more-general-generator-mechanism">Async fn are implemented using a more general generator mechanism</h2>
<p>You might be surprised when I say that we&rsquo;ve already started fixing
the TLS problem. After all, <strong>the reason we used TLS in the first
place is that there were unresolved questions about how to pass in
data when waking up a suspended function &ndash; and we haven&rsquo;t resolved
those problems</strong>. So why are we able to go ahead and remove the
use of TLS?</p>
<p>The answer is that, while the <code>async fn</code> feature is implemented atop a
more general mechanism of suspendable functions<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, the full power
of that mechanism is not exposed to end-users. So, for example,
suspendable functions in the compiler permit yielding arbitrary
values, but async functions always yield up <code>()</code>, since they only need
to signal that they are blocked waiting on I/O, not transmit
values. Similarly, the compiler&rsquo;s internal mechanism will allow us to
pass in a new <a href="https://doc.rust-lang.org/std/task/struct.Context.html"><code>Context</code></a> when we wake up from a yield, and we can use
that mechanism to pass in the <a href="https://doc.rust-lang.org/std/task/struct.Context.html"><code>Context</code></a> argument from the future
API. But this is hidden from the end-user, since that <a href="https://doc.rust-lang.org/std/task/struct.Context.html"><code>Context</code></a> is
never directly exposed or accessed.</p>
<p>In short, the suspended functions supported by the compiler are not a
language feature: they are an implementation detail that is
(currently) only used for async-await. This is really useful because
it means we can change how they work, and it also means that we don&rsquo;t
have to make them support all possible use cases one might want. In
this particular case, it means we don&rsquo;t have to resolve some of the
thorny questions about how to pass in data after a yield, because we only
need to use them in a very specific way.</p>
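<p>One way to picture this &ldquo;very specific way&rdquo; is to write by hand the kind of state machine an <code>async fn</code> corresponds to. The sketch below is an illustration, not what the compiler literally generates: <code>DoubleLater</code> is an invented type, the only thing it ever &ldquo;yields&rdquo; is the unit-like signal <code>Poll::Pending</code>, and the only thing passed back in on resume is a fresh <code>Context</code>.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written analogue of an async fn that suspends once and then returns `x * 2`.
enum DoubleLater {
    Start { x: u32 },
    Suspended { x: u32 },
    Done,
}

impl Future for DoubleLater {
    type Output = u32;

    // Each call to `poll` is one "resume"; `cx` is the data passed in on resume.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut();
        match *this {
            DoubleLater::Start { x } => {
                // "Yield": no value flows out, only the fact that we are not ready.
                *this = DoubleLater::Suspended { x };
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            DoubleLater::Suspended { x } => {
                *this = DoubleLater::Done;
                Poll::Ready(x * 2)
            }
            DoubleLater::Done => panic!("polled after completion"),
        }
    }
}

/// Waker that does nothing; the driver below simply polls again.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drive the state machine and report the result plus the number of polls.
fn run(x: u32) -> (u32, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = DoubleLater::Start { x };
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(value) = Pin::new(&mut fut).poll(&mut cx) {
            return (value, polls);
        }
    }
}

fn main() {
    println!("{:?}", run(21));
}
```

<p>Because the interface is fixed to &ldquo;take a <code>Context</code>, give back <code>Poll</code>&rdquo;, the more general questions about resume arguments never surface to the user.</p>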
<h2 id="supporting-generators-iterators-and-async-generators-streams">Supporting generators (iterators) and async generators (streams)</h2>
<p>One observation that boats raised is that people who write async I/O
code are interacting with <code>Pin</code> much more directly than was expected.
The primary reason for this is that people are having to manually
implement the <a href="https://docs.rs/futures/0.3.1/futures/stream/trait.Stream.html"><code>Stream</code></a> trait, which is basically the async version
of an iterator. (We&rsquo;ve talked about <code>Stream</code> in a number of previous
async interviews.) I have also found that, in my conversations with
<em>users</em> of async, streams come up very, very often. At the moment,
<em>consuming</em> streams is generally fairly easy, but <em>creating</em> them is
quite difficult. For that matter, even in synchronous Rust, manually
implementing the <code>Iterator</code> trait is kind of annoying (although
significantly easier than streams).</p>
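<p>To show where <code>Pin</code> leaks into user code, here is a small hand-written stream. To keep the example std-only, it declares a local <code>Stream</code> trait with the same shape as the one in the futures crate (that local copy is an assumption of the sketch); <code>Countdown</code> and <code>collect_ready</code> are invented names.</p>

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Local trait mirroring the shape of `futures::stream::Stream`.
trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Yields `remaining`, `remaining - 1`, ..., `1` and then ends.
struct Countdown {
    remaining: u32,
}

impl Stream for Countdown {
    type Item = u32;

    // The implementor has to deal with `Pin<&mut Self>` and `Context` directly,
    // even though this particular stream never actually suspends.
    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        let this = self.get_mut();
        if this.remaining == 0 {
            Poll::Ready(None)
        } else {
            let item = this.remaining;
            this.remaining -= 1;
            Poll::Ready(Some(item))
        }
    }
}

/// Waker that does nothing; fine here because `Countdown` is always ready.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Collect items while the stream is immediately ready.
/// Simplification: stops at the first `Pending`, which `Countdown` never returns.
fn collect_ready<S: Stream + Unpin>(mut stream: S) -> Vec<S::Item> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut items = Vec::new();
    while let Poll::Ready(Some(item)) = Pin::new(&mut stream).poll_next(&mut cx) {
        items.push(item);
    }
    items
}

fn main() {
    println!("{:?}", collect_ready(Countdown { remaining: 3 }));
}
```

<p>Even for something this trivial, the author has to think about pinning and polling, which is the friction boats was pointing at.</p>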
<p>So, it would be nice if we had some way to make it easier to write
iterators and streams. And, indeed, this design space has been carved
out in other languages: the basic mechanism is to add a
<strong>generator</strong><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, which is some sort of function that can yield up
a series of values before terminating. Obviously, if you&rsquo;ve read up to
this point, you can see that the &ldquo;suspendable functions&rdquo; we used to
implement async await can also be used to support some form of
generator abstractions, so a lot of the hard implementation work has
been done here.</p>
<p>That said, supporting generator functions has been something that we&rsquo;ve
been shying away from. And why is that, if a lot of the implementation
work is done? The answer is primarily that the design space is
<strong>huge</strong>. I alluded to this earlier in talking about some of the
questions around how to pass data in when resuming a suspended
function.</p>
<h2 id="full-generality-considered-too-dang-difficult">Full generality considered too dang difficult</h2>
<p>boats however contends that we are making our lives harder than they
need to be. In short, <strong>if we narrow our focus from &ldquo;create the
perfect, flexible abstraction for suspended functions and coroutines&rdquo;
to &ldquo;create something that lets you write iterators and streams&rdquo;, then
a lot of the thorny design problems go away</strong>. Now, under the covers,
we still want to have some kind of unified form of suspended functions
that can support async-await and generators, but that is a much
simpler task.</p>
<p>In short, we would want to permit writing a <code>gen fn</code> (and <code>async gen fn</code>), which would be some function that is able to <code>yield</code> values and
which eventually returns. Since the iterator&rsquo;s <code>next</code> method doesn&rsquo;t
take any arguments, we wouldn&rsquo;t need to support passing data in after
yields (in the case of streams, we <em>would</em> pass in data, but only the
<a href="https://doc.rust-lang.org/std/task/struct.Context.html"><code>Context</code></a> values that are not directly exposed to users). Similarly,
iterators and streams don&rsquo;t produce a &ldquo;final value&rdquo; when they&rsquo;re done,
so these functions would always just return unit.</p>
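<p>As a rough illustration of what such a feature would save, the sketch below puts a made-up <code>gen fn</code> in a comment next to the iterator one writes by hand today. The surface syntax in the comment is purely hypothetical (nothing was settled in the interview), and <code>FibBelow</code> and <code>fib_below</code> are names invented for the example.</p>

```rust
// Hypothetical syntax, not valid Rust today:
//
//     gen fn fib_below(limit: u64) yields u64 {
//         let (mut a, mut b) = (0, 1);
//         while a < limit {
//             yield a;
//             (a, b) = (b, a + b);
//         }
//     }
//
// The hand-written equivalent has to turn the local variables into fields and
// the position of the `yield` into explicit state.

/// Iterator state: the locals `a`, `b` and `limit` of the generator above.
struct FibBelow {
    a: u64,
    b: u64,
    limit: u64,
}

fn fib_below(limit: u64) -> FibBelow {
    FibBelow { a: 0, b: 1, limit }
}

impl Iterator for FibBelow {
    type Item = u64;

    // `next` takes no arguments, so nothing needs to be passed in after a yield.
    fn next(&mut self) -> Option<u64> {
        if self.a >= self.limit {
            // The generator body has finished; there is no final value.
            return None;
        }
        let yielded = self.a;
        let next_b = self.a + self.b;
        self.a = self.b;
        self.b = next_b;
        Some(yielded)
    }
}

fn main() {
    let values: Vec<u64> = fib_below(30).collect();
    println!("{:?}", values);
}
```

<p>Note how the two simplifications from the text show up: <code>next</code> takes no resume argument, and the end of the sequence is just <code>None</code> rather than a distinct return value.</p>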
<p>Adopting a more narrow focus wouldn&rsquo;t close the door to exposing our
internal mechanism as a first-class language feature at some point,
but it would help us to solve urgent problems sooner, and it would
also give us more experience to use when looking again at the more
general task. It also means that we are adding features that make
writing iterators and streams <em>as easy as we can make it</em>, which is a
good thing<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. (In case you can&rsquo;t tell, I was sympathetic to
boats&rsquo; argument.)</p>
<h2 id="extending-the-stdlib-with-some-key-traits">Extending the stdlib with some key traits</h2>
<p>boats is in favor of adding the &ldquo;big three&rdquo; traits to the standard library
(if you&rsquo;ve been reading these interviews, these traits will be quite
familiar to you by now):</p>
<ul>
<li><code>AsyncRead</code></li>
<li><code>AsyncWrite</code></li>
<li><code>Stream</code></li>
</ul>
<h2 id="stick-to-the-core-vision-async-and-sync-should-be-analogous">Stick to the core vision: Async and sync should be analogous</h2>
<p>One important point: boats believes (and I agree) that we should try
to maintain the principle that the async and synchronous versions of
the traits should align as closely as possible. This matches the
overarching design vision of minimizing the differences between &ldquo;async
Rust&rdquo; and &ldquo;sync Rust&rdquo;. It also argues in favor of the approach that
<a href="http://smallcultfollowing.com/babysteps/blog/2020/01/20/async-interview-5-steven-fackler/">sfackler proposed in their interview</a>, where we address the
questions of how to handle uninitialized memory in an analogous way
for both <code>Read</code> and <code>AsyncRead</code>.</p>
<p>We talked a bit about the finer details of that principle. For
example, if we were to extend the <code>Read</code> trait with some kind of <code>read_buf</code> method (which can support an uninitialized output buffer), then this
new method would have to have a default, for backwards compatibility reasons:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Read</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">read</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">read_buf</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">BufMut</span><span class="o">&lt;..&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is a bit unfortunate, as ideally you would only implement
<code>read_buf</code>.  For <code>AsyncRead</code>, since the trait doesn&rsquo;t exist yet, we
could switch the defaults. But boats pointed out that this carries
costs too: we would forever have to explain why the two traits are
different, for example. (Another option is to have both methods
default to one another, so that you can implement either one, which &ndash;
combined with a lint &ndash; might be the best of both worlds.)</p>
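<p>To make that last option concrete, here is a minimal sketch of two methods that default to one another. Everything in it is an assumption made for illustration: <code>ReadLike</code> is a stand-in trait (not the real <code>std::io::Read</code>), and <code>OutBuf</code> is a hypothetical, simplified substitute for the <code>BufMut&lt;..&gt;</code> type above that tracks a fill cursor over zeroed memory rather than handling truly uninitialized bytes.</p>

```rust
use std::io;

/// Hypothetical output buffer: zeroed storage plus a "filled" cursor.
pub struct OutBuf {
    data: Vec<u8>,
    filled: usize,
}

impl OutBuf {
    pub fn with_capacity(capacity: usize) -> Self {
        OutBuf { data: vec![0; capacity], filled: 0 }
    }
    /// The part of the buffer that has not been written yet.
    pub fn unfilled(&mut self) -> &mut [u8] {
        &mut self.data[self.filled..]
    }
    /// Record that `n` more bytes were written.
    pub fn advance(&mut self, n: usize) {
        assert!(self.filled + n <= self.data.len());
        self.filled += n;
    }
    pub fn filled(&self) -> &[u8] {
        &self.data[..self.filled]
    }
}

/// Stand-in trait: each method's default calls the other, so an impl
/// must override at least one of them (a lint would have to enforce that).
pub trait ReadLike {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let mut out = OutBuf::with_capacity(buf.len());
        self.read_buf(&mut out)?;
        let n = out.filled().len();
        buf[..n].copy_from_slice(out.filled());
        Ok(n)
    }
    fn read_buf(&mut self, buf: &mut OutBuf) -> io::Result<()> {
        let n = self.read(buf.unfilled())?;
        buf.advance(n);
        Ok(())
    }
}

/// Implements only the "old" method.
pub struct OldStyle(pub &'static [u8]);
impl ReadLike for OldStyle {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.0.len().min(buf.len());
        buf[..n].copy_from_slice(&self.0[..n]);
        self.0 = &self.0[n..];
        Ok(n)
    }
}

/// Implements only the "new" method.
pub struct NewStyle(pub &'static [u8]);
impl ReadLike for NewStyle {
    fn read_buf(&mut self, buf: &mut OutBuf) -> io::Result<()> {
        let dst = buf.unfilled();
        let n = self.0.len().min(dst.len());
        dst[..n].copy_from_slice(&self.0[..n]);
        self.0 = &self.0[n..];
        buf.advance(n);
        Ok(())
    }
}
```

<p>Callers can use either method on either type. The catch is that an impl which overrides neither method recurses forever, which is why the lint matters.</p>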
<h2 id="generic-interface-for-spawning">Generic interface for spawning</h2>
<p>Some time back, boats wrote a post proposing <a href="https://boats.gitlab.io/blog/post/global-executors/">global executors</a>.  This
would basically be a way to add a function to the stdlib to spawn a
task, which would then delegate (somehow) to whatever executor you are
using. Based on the response to the post, boats now feels this is
probably not a good short-term goal.</p>
<p>For one thing, there were a lot of unresolved questions about just
what features this global executor should support. But for another,
the main goal here is to enable libraries to write &ldquo;executor
independent&rdquo; code, but it&rsquo;s not clear how many libraries spawn tasks
<strong>anyway</strong> &ndash; that&rsquo;s usually done more at the application
level. Libraries tend to instead return a future and let the
application do the spawning (interestingly, one place this doesn&rsquo;t
work is in destructors, since they can&rsquo;t return futures; supporting
async drop, as discussed earlier, would help here.)</p>
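<p>As a rough illustration of that division of labor, the sketch below has a &ldquo;library&rdquo; function that only returns a future, while the &ldquo;application&rdquo; decides where and how to run it. The names <code>count_fields</code> and <code>block_on</code> are hypothetical, and the tiny executor is a toy built on std&rsquo;s <code>Wake</code> trait, standing in for whatever runtime an application really uses.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// --- "library" side: no spawning, just hand back a future ---

/// Hypothetical library function; it never touches an executor.
pub async fn count_fields(line: String) -> usize {
    line.split(',').count()
}

// --- "application" side: picks the executor ---

/// Wakes the blocked thread when the future can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy single-future executor: poll, and park the thread while pending.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}
```

<p>The application could just as well run the same future on another thread, e.g. <code>std::thread::spawn(|| block_on(count_fields(line)))</code>; the library code does not change either way.</p>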
<p>So it&rsquo;d probably be better to revisit this question once we have more
experience, particularly once we have the async I/O and stream traits
available.</p>
<h2 id="the-futures-crate">The futures crate</h2>
<p>We discussed other possible additions to the standard library.
There are a lot of &ldquo;building blocks&rdquo; currently in the futures library
that are independent from executors and which could do well in the standard
library. Some of the things that we talked about:</p>
<ul>
<li>async-aware mutexes, clearly a useful building block</li>
<li>channels
<ul>
<li>though std channels are not the most loved, crossbeam&rsquo;s are generally preferred</li>
<li>interestingly, channel types <em>do</em> show up in public APIs from time to time, as a way to receive data, so having them in std could be particularly useful</li>
</ul>
</li>
</ul>
<p>In general, where things get more complex is whenever you have bits of
code that either have to spawn tasks or which do the &ldquo;core I/O&rdquo;. These
are the points where you need a more full-fledged reactor or
runtime. But there are lots of utilities that don&rsquo;t need that and
which could profitably live in the std library.</p>
<h2 id="where-to-put-async-things-in-the-stdlib">Where to put async things in the stdlib?</h2>
<p>One theme that boats and I did not discuss, but which has come up when
I&rsquo;ve raised this question with others, is <em>where</em> to put async-aware
traits in the std hierarchy, particularly when there are sync
versions. For example, should we have <code>std::io::Read</code> and
<code>std::io::AsyncRead</code>? Or would it be better to have <code>std::io::Read</code>
and something like <code>std::async::io::Read</code> (obviously, async is a
keyword, so this precise path may not be an option)? In other words,
should we combine sync/async traits into the same space, but with
different names, or should we carve out a space for &ldquo;async-enabled&rdquo;
traits and use the same names? An interesting question, and I don&rsquo;t
have an opinion yet.</p>
<h2 id="conclusion-and-some-of-my-thoughts">Conclusion and some of my thoughts</h2>
<p>I always enjoy talking with boats, and this time was no exception.  I
think boats raised a number of small, practical ideas that hadn&rsquo;t come
up before. I do think it&rsquo;s important that, in addition to stabilizing
fundamental building blocks like <code>AsyncRead</code>, we also consider
improvements to the ergonomic experience with smaller changes like
<code>async fn main</code>, and I agree with the guiding principle that boats
raised of keeping async and sync code as &ldquo;analogous&rdquo; as possible.</p>
<h3 id="comments">Comments?</h3>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167/">thread on the Rust users forum</a> for this series.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>In the compiler, we call these &ldquo;suspendable functions&rdquo; generators, but I&rsquo;m avoiding that terminology for a reason.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>This is why I was avoiding using the term &ldquo;generator&rdquo; earlier &ndash; I want to say &ldquo;suspendable functions&rdquo; when referring to the implementation mechanism, and &ldquo;generator&rdquo; when referring to the user-exposed feature.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>though not one that a fully general mechanism necessarily
precludes&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Async Interview #6: Eliza Weisman</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/02/11/async-interview-6-eliza-weisman/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/02/11/async-interview-6-eliza-weisman/</id><published>2020-02-11T00:00:00+00:00</published><updated>2020-02-11T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello! For the latest <a href="http://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/">async interview</a>, I spoke with Eliza Weisman
(<a href="https://github.com/hawkw/">hawkw</a>, <a href="https://twitter.com/mycoliza">mycoliza on twitter</a>). Eliza first came to my attention as the author of the
<a href="https://crates.io/crates/tracing">tracing</a> crate, which is a nifty crate for doing application level
tracing. However, she is also a core maintainer of tokio, and she
works at Buoyant on the <a href="https://linkerd.io/">linkerd</a> system. <a href="https://linkerd.io/">linkerd</a> is one of a small
set of large applications that were built using 0.1 futures &ndash; i.e.,
before async-await. This range of experience gives Eliza an interesting
&ldquo;overview&rdquo; perspective on async-await and Rust more generally.</p>
<h3 id="video">Video</h3>
<p>You can watch the <a href="https://youtu.be/bCf9K28TqVQ">video</a> on YouTube. I&rsquo;ve also embedded a copy here
for your convenience:</p>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/bCf9K28TqVQ" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>
<h3 id="the-days-before-question-mark">The days before question mark</h3>
<p>Since I didn&rsquo;t know Eliza as well, we started out talking a bit about
her background. She has been using Rust for 5 years, and I was amused
by how she characterized the state of Rust when she got started:
pre-&ldquo;question mark&rdquo; Rust. Indeed, the introduction of the <code>?</code> operator
does feel like one of those &ldquo;turning points&rdquo; in the history of Rust, and
I&rsquo;m quite sure that <code>async</code>-<code>await</code> will feel similarly (at least for
some applications).</p>
<p>One interesting observation that Eliza made is that it feels like Rust
has reached the point where there is nothing <em>critically missing</em>.
This isn&rsquo;t to say there aren&rsquo;t things that need to be improved, but
that the number of &ldquo;rough edges&rdquo; has dramatically decreased. I think
this is true, and we should be proud of it &ndash; though we also shouldn&rsquo;t
relax too much. =) Getting to learn Rust is still a significant hurdle
and there are still a number of things that are much harder than they
need to be.</p>
<p>One interesting corollary of this is that a number of the things that
most affect Eliza when writing Async I/O code are <strong>not specific to
async I/O</strong>.  Rather, they are more general features or requirements
that apply to a lot of different things.</p>
<h3 id="tokios-needs">Tokio&rsquo;s needs</h3>
<p>We talked some about what <a href="https://tokio.rs/">tokio</a> needs from async Rust. As Eliza
said, many of the main points already came up in <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/23/async-interview-3-carl-lerche/">my conversation with
Carl</a>:</p>
<ul>
<li>async functions in traits would be great, but <a href="http://smallcultfollowing.com/babysteps/blog/2019/10/26/async-fn-in-traits-are-hard/">they&rsquo;re hard</a></li>
<li>stabilizing streams, async read, and async write would be great</li>
</ul>
<h3 id="communicating-stability">Communicating stability</h3>
<p>One thing we spent a fair while discussing is how to best
<strong>communicate</strong> our stability story. This goes beyond &ldquo;semver&rdquo;.
semver tells you <em>when a breaking change has been made</em>, of course,
but it doesn&rsquo;t tell you <em>whether a breaking change will be made in the
future</em> &ndash; or how long we plan to do backports, and the like.</p>
<p>The easiest way for us to communicate stability is to move things
to the std library. That is a clear signal that breaking changes
will <strong>never</strong> be made.</p>
<p>But there is room for us to set &ldquo;intermediate&rdquo; levels of stability.
One thing that might help is to make a <strong>public stability policy</strong> for
crates like <code>futures</code>. For example, we could declare that the futures
crate will maintain compatibility with the current <code>Stream</code> trait for
the next year, or two years.</p>
<p>These kinds of timelines would be helpful: for example, tokio plans to
<a href="https://tokio.rs/blog/2019-11-tokio-0-2/#tokio-1-0-in-q3-2020-with-lts-support">maintain a stable interface for the next 5 years</a>, and so if
they want to expose traits from the <code>futures</code> crate, they would want a
guarantee that those traits would be supported during that period (and
ideally that futures would not release a semver-incompatible version
of those traits).</p>
<h3 id="depending-on-community-crates">Depending on community crates</h3>
<p>When we talk about interoperability, we are often talking about core
traits like <code>Future</code>, <code>Stream</code>, and <code>AsyncRead</code>. But as we move up the
stack, there are other things where having a defined standard could be
really useful. My go-to example for this is the <a href="https://crates.io/crates/http">http</a> crate, which
defines a number of types for things like HTTP error codes. The types
are important because they are likely to find their way into the &ldquo;public
interface&rdquo; of libraries like hyper, as well as frameworks and things.
I would like to see a world where web applications can easily be
converted between frameworks or across HTTP implementations, but that
would be made easier if there is an agreed upon standard for
representing the details of an HTTP request. Maybe the <a href="https://crates.io/crates/http">http</a> crate is
that already, or can become that &ndash; in any case, I&rsquo;m not sure if the
stdlib is the right place for such a thing, or at least not for some
time. It&rsquo;s something to think about. (I do suspect that it might be
useful to move such crates to the Rust org? But we&rsquo;d have to have a
good story around maintenance.) Anyway, I&rsquo;m getting beyond what was
in the interview I think.</p>
<h3 id="tracing">Tracing</h3>
<p>We talked a fair amount about the <a href="https://crates.io/crates/tracing">tracing</a> library. Tracing is one of
those libraries that can do a large number of things, so it&rsquo;s kind of
hard to concisely summarize what it does. In short, it is a set of
crates for collecting <em>scoped, structured, and contextual diagnostic
information</em> in Rust programs. One of the simplest use cases is to
collect logging information, but it can also be used for things like
profiling and any number of other tasks.</p>
<p>I myself started to become interested in tracing as a possible tool
to help with debugging and analyzing programs like rustc and chalk,
where the &ldquo;chain&rdquo; that leads to a bug can often be quite complex and
involve numerous parts of the compiler. Right now I tend to just dump
gigabytes of logs into files and traverse them with grep. In so doing,
I lose all kinds of information (like hierarchical information about
what happens during what) that would make my life easier. I&rsquo;d love a
tool that let me, for example, track &ldquo;all the logs that pertain to a
particular function&rdquo; while also making it easy to find the context in
which a particular log occurred.</p>
<p>The <a href="https://crates.io/crates/tracing">tracing</a> library got its start as a structured replacement for
various hacky layers atop the <code>log</code> crate that were in use for
debugging <a href="https://linkerd.io/">linkerd</a>. Like many async applications, debugging a
<a href="https://linkerd.io/">linkerd</a> session involves correlating a lot of events that may be
taking place at distinct times &ndash; or even on distinct <em>machines</em> &ndash; but
are still part of one conceptual &ldquo;thread&rdquo; of control.</p>
<p><a href="https://crates.io/crates/tracing">tracing</a> is actually a &ldquo;front-end&rdquo; built atop the &ldquo;tracing-core&rdquo;
crate. tracing-core is a minimal crate that just stores a thread-local
containing the current &ldquo;event subscriber&rdquo; (which processes the tracing
events in some way). You don&rsquo;t interact with tracing-core directly,
but it&rsquo;s important to the overall design, as we&rsquo;ll see in a bit.</p>
<p>The tracing front-end contains a bunch of macros, rather like the
<code>debug!</code> and <code>info!</code> you may be used to from the log crate (and indeed
there are crates that let you use those <code>debug!</code> logs directly).  The
major one is the <code>span!</code> macro, which lets you declare that a task is
happening.  It works by putting a &ldquo;placeholder&rdquo; on the stack: when
that placeholder is dropped, the task is done:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">s</span>: <span class="nc">Span</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">span!</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w"> </span><span class="c1">// create a span `s`
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_guard</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">enter</span><span class="p">();</span><span class="w"> </span><span class="c1">// enter `s`, so that subsequent events take place &#34;in&#34; `s`
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span>: <span class="nc">Span</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">span!</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w"> </span><span class="c1">// create a *subspan* of `s` called `t`
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span></code></pre></div><p>Under the hood, all of these macros forward to the &ldquo;subscriber&rdquo; we
were talking about earlier. So they might receive events like &ldquo;we
entered this span&rdquo; or &ldquo;this log was generated&rdquo;.</p>
<p>The idea is that events that happen inside of a span inherit the
context of that span. So, to jump back to my compiler example, I might
use a span to indicate which function is currently being type-checked,
which would then be associated with any events that took place.</p>
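<p>Here is a deliberately tiny model of that idea, using only std. It is <em>not</em> how tracing is implemented; the names <code>enter</code>, <code>Entered</code>, and <code>event</code> are made up for this sketch. It just shows a guard pushing a span name onto thread-local state, popping it on drop, and events picking up whatever spans are active.</p>

```rust
use std::cell::RefCell;

thread_local! {
    // The spans currently entered on this thread, outermost first.
    static SPAN_STACK: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

/// Guard returned by `enter`; dropping it leaves the span.
pub struct Entered {
    _private: (),
}

/// Enter a span: subsequent events on this thread happen "in" it.
pub fn enter(name: &'static str) -> Entered {
    SPAN_STACK.with(|stack| stack.borrow_mut().push(name));
    Entered { _private: () }
}

impl Drop for Entered {
    fn drop(&mut self) {
        // Leaving the span withdraws its name from the thread-local state.
        SPAN_STACK.with(|stack| {
            stack.borrow_mut().pop();
        });
    }
}

/// Format an event, decorated with the spans that are active right now.
pub fn event(message: &str) -> String {
    SPAN_STACK.with(|stack| {
        let path = stack.borrow().join("::");
        if path.is_empty() {
            message.to_string()
        } else {
            format!("[{path}] {message}")
        }
    })
}
```

<p>With this, an event logged while a <code>typeck</code> guard is live comes out as <code>[typeck] ...</code>, and entering a nested span extends the prefix until the inner guard is dropped.</p>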
<p>There are many different possible kinds of subscribers. A subscriber
might, for example, dump things out in real time, or it might just
collect events and log them later.  Crates like <a href="https://crates.io/crates/tracing-timing">tracing-timing</a> record
inter-event timing and make histograms and flamegraphs.</p>
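<p>To give a feel for how the same events can feed very different consumers, here is a simplified sketch. The <code>EventSink</code> trait and both implementations are hypothetical and far cruder than a real tracing subscriber; they only illustrate the &ldquo;one stream of events, many ways to process it&rdquo; shape.</p>

```rust
use std::collections::BTreeMap;

/// Hypothetical, much-simplified consumer of events.
pub trait EventSink {
    fn on_event(&mut self, span: &str, message: &str);
}

/// Collects formatted events so they can be logged later.
#[derive(Default)]
pub struct Buffered {
    pub lines: Vec<String>,
}

impl EventSink for Buffered {
    fn on_event(&mut self, span: &str, message: &str) {
        self.lines.push(format!("{span}: {message}"));
    }
}

/// Ignores the messages and just counts events per span.
#[derive(Default)]
pub struct Histogram {
    pub counts: BTreeMap<String, usize>,
}

impl EventSink for Histogram {
    fn on_event(&mut self, span: &str, _message: &str) {
        *self.counts.entry(span.to_string()).or_insert(0) += 1;
    }
}

/// The instrumented code does not know which sink it is talking to.
pub fn run_workload(sink: &mut dyn EventSink) {
    sink.on_event("parse", "start");
    sink.on_event("typeck", "checking foo");
    sink.on_event("typeck", "checking bar");
}
```

<p>The instrumented code (<code>run_workload</code>) is written once; swapping the sink changes what you learn from it.</p>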
<h3 id="integrating-tracing-with-other-libraries">Integrating tracing with other libraries</h3>
<p>It seems clear that tracing would work best if it is integrated with
other libraries. I believe it is already integrated into tokio, but one
could also imagine integrating tracing with rayon, which distributes
tasks across worker threads to run in parallel. The goal there would
be that we &ldquo;link&rdquo; the tasks so that events which occur in a parallel
task inherit the context/span information from the task which spawned
them, even though they&rsquo;re running on another thread.</p>
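<p>The &ldquo;linking&rdquo; step can be sketched with plain threads. This is only a model of the idea, not rayon or tracing code, and <code>push_span</code>, <code>current_context</code>, and <code>spawn_linked</code> are hypothetical names: the spawner captures its context before handing off the task, and the worker installs that context before running it.</p>

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Span names active on the current thread.
    static CONTEXT: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

/// Enter a span on the current thread (never exited in this sketch).
pub fn push_span(name: &str) {
    CONTEXT.with(|ctx| ctx.borrow_mut().push(name.to_string()));
}

/// Snapshot of the spans active on the current thread.
pub fn current_context() -> Vec<String> {
    CONTEXT.with(|ctx| ctx.borrow().clone())
}

/// Spawn a task that inherits the spawner's context.
pub fn spawn_linked<F, T>(task: F) -> thread::JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    // Captured on the *spawning* thread...
    let parent = current_context();
    thread::spawn(move || {
        // ...and installed on the *worker* thread before the task runs.
        CONTEXT.with(|ctx| *ctx.borrow_mut() = parent);
        task()
    })
}
```

<p>A task started with a bare <code>thread::spawn</code> sees an empty context, whereas one started with <code>spawn_linked</code> sees the spans of the code that spawned it.</p>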
<p>The idea here is not only that Rayon can link up your application
events, but that Rayon can add its own debugging information using
tracing in a non-obtrusive way. In the &lsquo;bad old days&rsquo;, tokio used to
have a bunch of <code>debug!</code> logs that would let you monitor what was
going on &ndash; but these logs were often confusing and really targeted at
internal tokio developers.</p>
<p>With the tracing crate, the goal is that libraries can <em>enrich</em> the
user&rsquo;s diagnostics. For example, the hyper library might add metadata
about the set of headers in a request, and tokio might add information
about which thread-pool is in use. This information is all &ldquo;attached&rdquo;
to your actual application logs, which have to do with your business
logic. Ideally, you can ignore them most of the time, but if that sort
of data becomes relevant &ndash; e.g., maybe you are confused about why a
header doesn&rsquo;t seem to be being detected by your appserver &ndash; you can
dig in and get the full details.</p>
<h3 id="integrating-tracing-with-other-logging-systems">Integrating tracing with other logging systems</h3>
<p>Eliza emphasized that she would really like to see more
interoperability amongst tracing libraries. The current tracing crate,
for example, can be easily made to emit log records, making it
interoperable with the <a href="https://crates.io/crates/log">log</a> crate (there is also a &ldquo;logger&rdquo; that
implements the tracing interface).</p>
<p>Having a distinct tracing-core crate means that it is possible for there
to be multiple facades that build on tracing, potentially operating in
quite different ways, which all share the same underlying &ldquo;subscriber&rdquo;
infrastructure. (rayon uses the same trick; the <a href="https://crates.io/crates/rayon-core">rayon-core</a> crate
defines the underlying scheduler, so that multiple versions of the
rayon <code>ParallelIterator</code> traits can co-exist without having multiple
global schedulers.) Eliza mentioned that &ndash; in her ideal world &ndash;
there&rsquo;d be some alternative front-end that is so good it can replace
the <code>tracing</code> crate altogether, so she no longer has to maintain the
macros. =)</p>
<h3 id="raii-and-async-fn-doesnt-always-play-well">RAII and async fn don&rsquo;t always play well together</h3>
<p>There is one feature request for async-await that arises from the
tracing library. I mentioned that tracing uses a guard to track the
&ldquo;current span&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">s</span>: <span class="nc">Span</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">span!</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w"> </span><span class="c1">// create a span `s`
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">_guard</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">enter</span><span class="p">();</span><span class="w"> </span><span class="c1">// enter `s`, so that subsequent events take place &#34;in&#34; `s`
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span></code></pre></div><p>The way this works is that the guard returned by <code>s.enter()</code> adds some
info into the thread-local state and, when it is dropped, that info is
withdrawn. Any logs that occur while the <code>_guard</code> is still live are
then decorated with this extra span information. <strong>The problem is that
this mechanism doesn&rsquo;t work with async-await.</strong></p>
<p>As <a href="https://github.com/tokio-rs/tracing#in-asynchronous-code">explained in the tracing README</a>, the problem is that if an
async function yields during an <code>await</code>, then it is removed from
the current thread and suspended. It will later be resumed, but
potentially on another thread altogether. However, the <code>_guard</code>
variable is not notified of these events, so (a) the thread-local info
remains set on the original thread, where it may no longer belong and
(b) the destructor which goes to remove the info will run on the wrong
thread.</p>
<p>One way to solve this would be to have some sort of callback that
<code>_guard</code> can receive to indicate that it is being yielded, along with
another callback for when an async fn resumes. This would probably
wind up being optional methods of the <code>Drop</code> trait. This is basically
another feature request for making RAII work well in an async
environment (in addition to the <a href="https://boats.gitlab.io/blog/post/poll-drop/">existing problems with async drop that boats
described here</a>).</p>
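<p>Absent such callbacks, a library can approximate them by wrapping the future and doing the enter/exit work around every <code>poll</code>. The sketch below is an assumption-laden model of that approach, not code from tracing: <code>in_span</code>, <code>InSpan</code>, <code>current_span</code>, <code>YieldOnce</code>, and <code>poll_once</code> are all hypothetical helpers, and the &ldquo;span&rdquo; is just a string in a thread-local.</p>

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    static CURRENT_SPAN: RefCell<Option<&'static str>> = RefCell::new(None);
}

pub fn current_span() -> Option<&'static str> {
    CURRENT_SPAN.with(|span| *span.borrow())
}

/// Wrapper future: the span is set only while the inner future is polled.
pub struct InSpan<F> {
    span: &'static str,
    inner: Pin<Box<F>>, // boxed so the wrapper itself is `Unpin`
}

pub fn in_span<F: Future>(span: &'static str, future: F) -> InSpan<F> {
    InSpan { span, inner: Box::pin(future) }
}

impl<F: Future> Future for InSpan<F> {
    type Output = F::Output;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        // Plays the role of the "resumed" callback: install the span here.
        let previous = CURRENT_SPAN.with(|span| span.replace(Some(self.span)));
        let result = self.inner.as_mut().poll(cx);
        // Plays the role of the "yielded" callback: restore this thread's state.
        CURRENT_SPAN.with(|span| *span.borrow_mut() = previous);
        result
    }
}

/// Test helper: a future that returns `Pending` exactly once.
pub struct YieldOnce(pub bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Test helper: poll a future a single time on the current thread.
pub fn poll_once<F: Future + Unpin>(future: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    Pin::new(future).poll(&mut cx)
}
```

<p>Because the thread-local is set and restored inside each <code>poll</code>, the task can be polled first on one thread and later on another without leaving stale span info behind on either. The cost is that the wrapping has to be done explicitly, which is exactly the ergonomic gap the callbacks would close.</p>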
<h3 id="priorities-as-a-linkerd-hacker">Priorities as a linkerd hacker</h3>
<p>I asked Eliza to think for a second about what priorities she would
set for the Rust org while wearing her &ldquo;linkerd hacker&rdquo; hat &ndash; in
other words, when acting not as a library designer, but as the author
of an application that relies on Async I/O. Most of the feedback here though had
more to do with general Rust features than async-await specifically.</p>
<p>Eliza pointed out that linkerd hasn&rsquo;t yet fully upgraded to use
async-await, and that the vast majority of pain points she&rsquo;s
encountered thus far stem from having to use the older futures model,
which <a href="http://aturon.github.io/tech/2018/04/24/async-borrowing/">didn&rsquo;t integrate well with rust borrows</a>.</p>
<p>The other main pain point is the compilation-time cost imposed by the
deep trait hierarchies created by tower&rsquo;s service and layer
traits. She mentioned hitting a type error that was so long it
actually crashed her terminal. I&rsquo;ve heard of others hitting similar
problems with this sort of setup. I&rsquo;m not sure yet how this is best
addressed.</p>
<p>Another major feature request would be to put more work into
procedural macros, especially in expression position. Right now
<code>proc-macro-hack</code> is the tool of choice but &ndash; as the name suggests &ndash;
it doesn&rsquo;t seem ideal.</p>
<p>The other major point is that support for cargo feature flags in
tooling is pretty minimal. It&rsquo;s very easy to have code with feature
flags that &ldquo;accidentally&rdquo; works &ndash; i.e., I depend on feature flag X,
but I don&rsquo;t specify it; it just gets enabled via some other dependency
of mine. This also makes testing of feature flags hard. rustdoc
integration could be better. All true, all challenging. =)</p>
<h3 id="comments">Comments?</h3>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167/">thread on the Rust users forum</a> for this series.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Async Interview #5: Steven Fackler</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/01/20/async-interview-5-steven-fackler/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/01/20/async-interview-5-steven-fackler/</id><published>2020-01-20T00:00:00+00:00</published><updated>2020-01-20T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello! For the latest <a href="http://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/">async interview</a>, I spoke with Steven Fackler
(<a href="https://github.com/sfackler/">sfackler</a>). sfackler has been involved in Rust for a long time and
is a member of the Rust libs team. He is also the author of <a href="https://crates.io/users/sfackler">a lot of
crates</a>, most notably <a href="https://crates.io/crates/tokio-postgres">tokio-postgres</a>.</p>
<p>I particularly wanted to talk to sfackler about the <code>AsyncRead</code> and
<code>AsyncWrite</code> traits. These traits are on everybody&rsquo;s list of
&ldquo;important things to stabilize&rdquo;, particularly if we want to create
more interop between different executors and runtimes. On the other
hand, in <a href="https://github.com/tokio-rs/tokio/pull/1744">tokio-rs/tokio#1744</a>, the tokio project is considering
adopting its own variant traits that diverge significantly from those
in the futures crate, precisely because they have concerns over the
design of the traits as is. This seems like an important area to dig
into!</p>
<h3 id="video">Video</h3>
<p>You can watch the <a href="https://youtu.be/nerrc3L9qrM">video</a> on YouTube. I&rsquo;ve also embedded a copy here
for your convenience:</p>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/nerrc3L9qrM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>
<p>One note: something about our setup meant that I was hearing a lot of
echo. I think you can sometimes hear it in the recording, but not
nearly as bad as it was live. So if I seem a bit spacey, or take very
long pauses, you might know the reason why!</p>
<h2 id="background-concerns-on-the-async-read-trait">Background: concerns on the async-read trait</h2>
<p>So what are the concerns that are motivating <a href="https://github.com/tokio-rs/tokio/pull/1744">tokio-rs/tokio#1744</a>?
There are two of them:</p>
<ul>
<li>the current traits do not permit using uninitialized memory as the
backing buffer;</li>
<li>there is no way to test presently whether a given reader supports
vectorized operations.</li>
</ul>
<h2 id="this-blog-post-will-focus-on-uninitialized-memory">This blog post will focus on uninitialized memory</h2>
<p>sfackler and I spent most of our time talking about uninitialized
memory. We did also discuss vectorized writes, and I&rsquo;ll include some
notes on that at the end, but by and large sfackler felt that the
solutions there are much more straightforward.</p>
<h2 id="important-the-same-issues-arise-with-the-sync-read-trait">Important: The same issues arise with the sync <code>Read</code> trait</h2>
<p>Interestingly, neither of these issues is specific to <code>AsyncRead</code>.  As
defined today, the <code>AsyncRead</code> trait is basically just the async
version of <a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a> from <code>std</code>, and both of these concerns apply there
as well. In fact, part of why I wanted to talk to sfackler
specifically is that he is the author of an <a href="https://paper.dropbox.com/doc/MvytTgjIOTNpJAS6Mvw38">excellent paper
document</a> that covers the problem of using uninitialized memory
in great depth. A lot of what we talked about on this call is also
present in that document.  Definitely give it a read.</p>
<h2 id="read-interface-doesnt-support-uninitialized-memory">Read interface doesn&rsquo;t support uninitialized memory</h2>
<p>The heart of the <code>Read</code> trait is the <code>read</code> method:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">read</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">u8</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>This method reads data and writes it into <code>buf</code> and then &ndash; assuming
no error &ndash; returns <code>Ok(n)</code> with the number <code>n</code> of bytes written.</p>
<p>Ideally, we would like it if <code>buf</code> could be an uninitialized buffer.
After all, the <code>Read</code> trait is not supposed to be <em>reading</em> from
<code>buf</code>, it&rsquo;s just supposed to be <em>writing</em> into it &ndash; so it shouldn&rsquo;t
matter what data is in there.</p>
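<p>For reference, this is roughly what calling <code>read</code> looks like today, using the real <code>std::io::Read</code> impl for <code>&amp;[u8]</code>. The helper name <code>read_some</code> and the 8-byte buffer are arbitrary choices for the example. Note that the caller has to zero the buffer up front, even though every byte it goes on to use is overwritten by the reader.</p>

```rust
use std::io::{self, Read};

/// Read at most 8 bytes from `src` and return the bytes actually read.
pub fn read_some(mut src: &[u8]) -> io::Result<Vec<u8>> {
    // The buffer must be initialized before it can be passed as `&mut [u8]`.
    let mut buf = [0u8; 8];
    // `read` reports how many bytes it wrote into `buf`.
    let n = src.read(&mut buf)?;
    // Only the first `n` bytes are meaningful.
    Ok(buf[..n].to_vec())
}
```

<p>For an 8-byte stack buffer the zeroing is negligible, but for large or frequently reused buffers it is work that an uninitialized buffer would let us skip.</p>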
<h2 id="problem-1-the-impl-might-read-from-the-buf-even-if-it-shouldnt">Problem 1: The impl might read from the buf, even if it shouldn&rsquo;t</h2>
<p>However, in practice, there are two problems with using uninitialized
memory for <code>buf</code>. The first one is relatively obvious: although it
isn&rsquo;t <em>supposed</em> to, the <code>Read</code> impl can trivially read from <code>buf</code> without
using any unsafe code:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Read</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyReader</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">read</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">u8</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">buf</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Reading from an uninitialized buffer is Undefined Behavior and could
cause crashes, segfaults, or worse.</p>
<h2 id="problem-2-the-impl-might-not-really-initialize-the-buffer">Problem 2: The impl might not really initialize the buffer</h2>
<p>There is also a second problem that is often overlooked: when the
<code>Read</code> impl returns, it returns a value <code>n</code> indicating how many bytes
of the buffer were written. In principle, if <code>buf</code> was uninitialized
to start, then the first <code>n</code> bytes should be written now &ndash; but <em>are</em>
they? Consider a <code>Read</code> impl like this one:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Read</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyReader</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">read</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">u8</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Ok</span><span class="p">(</span><span class="n">buf</span><span class="p">.</span><span class="n">len</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This impl has no unsafe code. It <em>claims</em> that it has initialized the
entire buffer, but it hasn&rsquo;t done any writes into <code>buf</code> at all! Now if
the caller tries to read from <code>buf</code>, it will be reading uninitialized
memory, and causing UB.</p>
<p>One subtle point here. The problem isn&rsquo;t that the read impl could
return a false value about how many bytes it has written. <strong>The
problem is that it can lie without ever using any unsafe code at
all.</strong> So if you are auditing your code for unsafe blocks, you would
overlook this.</p>
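<p>To make the &ldquo;lying&rdquo; impl concrete without triggering any UB, here is a minimal sketch that hands the reader an <em>initialized</em> buffer filled with a sentinel byte. The names <code>LyingReader</code> and <code>untouched_bytes</code> are made up for illustration; they are not part of any real API.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::io::{self, Read};

/// Hypothetical reader that claims to fill the buffer but never writes to it.
struct LyingReader;

impl Read for LyingReader {
    fn read(&amp;mut self, buf: &amp;mut [u8]) -&gt; io::Result&lt;usize&gt; {
        Ok(buf.len()) // no write into `buf` happens here, and no unsafe either
    }
}

/// Reads once into a buffer pre-filled with a sentinel byte and reports how
/// many of the supposedly written bytes still hold the sentinel.
fn untouched_bytes(reader: &amp;mut impl Read) -&gt; io::Result&lt;usize&gt; {
    let mut buf = [0xAA_u8; 8]; // initialized, so inspecting it is fine
    let n = reader.read(&amp;mut buf)?;
    Ok(buf[..n].iter().filter(|&amp;&amp;b| b == 0xAA).count())
}
</code></pre></div><p>With <code>LyingReader</code>, all 8 bytes come back untouched even though the reader reported 8 bytes written. Had the caller passed uninitialized memory instead, trusting <code>n</code> would have meant reading uninitialized bytes.</p>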
<h2 id="constraints-and-solutions">Constraints and solutions</h2>
<p>There have been a lot of solutions proposed to this problem. sfackler
and I talked about all of them, I think, but I&rsquo;m going to skip over
most of the details. You can find them either in the video or in <a href="https://paper.dropbox.com/doc/MvytTgjIOTNpJAS6Mvw38">sfackler&rsquo;s
paper document</a>, which covers much of the same
material.</p>
<p>In this post, I&rsquo;ll just cover what we said about three of the options:</p>
<ul>
<li>First, adding a <code>freeze</code> operation.
<ul>
<li>This is in some ways the simplest, as it requires no change to
<code>Read</code> at all.</li>
<li>Unfortunately, it has a number of limitations and downsides.</li>
</ul>
</li>
<li>Second, adding a second <code>read</code> method that takes a <code>&amp;mut dyn BufMut</code> trait object.
<ul>
<li>This is the solution initially proposed in <a href="https://github.com/tokio-rs/tokio/pull/1744">tokio-rs/tokio#1744</a>.</li>
<li>It has much to recommend it, but requires virtual calls in a core API, although
initial benchmarks suggest such calls are not a performance problem.</li>
</ul>
</li>
<li>Finally, creating a struct <code>BufMut</code> in the stdlib for dealing with partially initialized
buffers, and adding a <code>read</code> method for <em>that</em>.
<ul>
<li>This overcomes some of the downsides of using a trait, but at the
cost of flexibility.</li>
</ul>
</li>
</ul>
<h2 id="digression-how-to-think-about-uninitialized-memory">Digression: how to think about uninitialized memory</h2>
<p>Before we go further, let me digress a bit. I think the common
understanding of uninitialized memory is that &ldquo;it contains whatever
values happen to be in there at the moment&rdquo;. In other words, you might
imagine that when you first allocate some memory, it contains <em>some</em>
value &ndash; but you can&rsquo;t predict what that is.</p>
<p>This intuition turns out to be incorrect. This is true for a number of
reasons. Compiler optimizations are part of it. In LLVM, for example,
an uninitialized variable is not assigned to a fixed stack slot or
anything like that. It is instead a kind of &ldquo;free floating&rdquo;
&ldquo;uninitialized&rdquo; value, and &ndash; whenever needed &ndash; it is mapped to
whatever register or stack slot happens to be convenient at the time
for the most optimal code. What this means in practice is that each time
you try to read from it, the compiler will substitute <em>some</em> value,
but it won&rsquo;t necessarily be the <em>same</em> value every time. This behavior
is justified by the C standard, which states that reading
uninitialized memory is &ldquo;undefined behavior&rdquo;.</p>
<p>This can cause code to go quite awry. The canonical example in my mind
is the case of a bounds check. You might imagine, for example, that code
like this would suffice for legally accessing an array:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">compute_index</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">index</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">length</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">return</span><span class="w"> </span><span class="o">&amp;</span><span class="n">array</span><span class="p">[</span><span class="n">index</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="fm">panic!</span><span class="p">(</span><span class="s">&#34;out of bounds&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, if the value returned by <code>compute_index</code> is uninitialized,
this is incorrect. Because in that case, <code>index</code> will also be &ldquo;the
uninitialized value&rdquo;, and hence each access to it conceptually yields
different values.  So the value that we compare against <code>length</code> might
not be the same value that we use to index into the array one line
later. Woah.</p>
<p>But, as sfackler and I discussed, there are actually other layers that
rely on uninitialized memory never being read even below the
kernel. For example, in the Linux kernel, the virtual memory system
has a flag called <code>MADV_FREE</code>. This flag is used to mark virtual
memory pages that are considered uninitialized.  For each such virtual
page, the kernel is free to change the physical memory page at will &ndash;
<em>until</em> the virtual page is written to. At that point, the memory is
potentially initialized, and so the virtual page is pinned. What this
means in practice is that when you get memory back from your
allocator, each read from that memory may yield different values,
unless you&rsquo;ve written to it first.</p>
<p>For all these reasons, it is best to think of uninitialized memory not
as having &ldquo;some random value&rdquo; but rather as having the value
&ldquo;uninitialized&rdquo;.  This is a special value that can, sometimes, be
converted to a random value when it is forced to (but, if accessed
multiple times, it may yield different values each time).</p>
<p>If you&rsquo;d like a deeper treatment, I recommend <a href="https://www.ralfj.de/blog/2019/07/14/uninit.html">Ralf&rsquo;s blog post</a>.</p>
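<p>In Rust, this &ldquo;uninitialized&rdquo; value shows up in the type system as <code>std::mem::MaybeUninit</code>. Here is a small sketch of filling an array without zeroing it first; the function name <code>fill_without_zeroing</code> is just something I made up for the example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem::MaybeUninit;

/// Builds a 4-byte array without zeroing it first.
fn fill_without_zeroing() -&gt; [u8; 4] {
    // The type records that these bytes may be uninitialized,
    // so safe code has no way to read them yet.
    let mut raw: [MaybeUninit&lt;u8&gt;; 4] = [MaybeUninit::uninit(); 4];
    for (i, slot) in raw.iter_mut().enumerate() {
        slot.write(i as u8 + 1); // writing is always allowed
    }
    // SAFETY: every element was written in the loop above.
    raw.map(|slot| unsafe { slot.assume_init() })
}
</code></pre></div><p>The only unsafe step is the final claim that the bytes are initialized, which is exactly the claim that a <code>Read</code> impl makes implicitly when it returns <code>Ok(n)</code>.</p>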
<h2 id="possible-solution-to-read-freeze-operation">Possible solution to read: Freeze operation</h2>
<p>So, given the above, what is the freeze operation, and how could it
help with handling uninitialized memory in the <code>read</code> API?</p>
<p>The general idea is that we could have a primitive called <code>freeze</code>
that, given some (potentially) uninitialized value, converts any
uninitialized bits into &ldquo;some random value&rdquo;. We could use this to
fix our indexing, for example, by &ldquo;freezing&rdquo; the index before we compare
against the length:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">freeze</span><span class="p">(</span><span class="n">compute_index</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">index</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">length</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">return</span><span class="w"> </span><span class="o">&amp;</span><span class="n">array</span><span class="p">[</span><span class="n">index</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="fm">panic!</span><span class="p">(</span><span class="s">&#34;out of bounds&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In a similar way, if we have a reference to an uninitialized buffer,
we could conceivably &ldquo;freeze&rdquo; that reference to convert it to a reference
to random bytes, and then we can safely use that to invoke <code>read</code>.
The idea would be that callers do something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">uninitialized_buffer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">buffer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">freeze</span><span class="p">(</span><span class="n">uninitialized_buffer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">reader</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">buffer</span><span class="p">)</span><span class="o">?</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span></code></pre></div><p>If we could do this, it would be great, because the existing <code>read</code>
interface wouldn&rsquo;t have to change at all!</p>
<p>There are a few complications though. First off, there is no such
<code>freeze</code> operation in LLVM today. There is talk of adding one, but
that operation wouldn&rsquo;t quite do what we need. For one thing, it
freezes the value it is applied to, but it doesn&rsquo;t apply through a
reference. So you could use it to fix our array bounds length checking
example, but you can&rsquo;t use it to fix <code>read</code> &ndash; we don&rsquo;t need to freeze
the <code>&amp;mut [u8]</code> <em>reference</em>, we need to fix the memory it <em>refers to</em>.</p>
<p>Secondly, that primitive would only apply to compiler optimizations.
It wouldn&rsquo;t protect against kernel optimizations like <code>MADV_FREE</code>. To
handle that, we have to do something extra, such as writing one byte
per memory page. That&rsquo;s conceivable, of course, but there are some
downsides:</p>
<ul>
<li>It feels fragile. What if Linux adds some new optimizations in the
future, how will we work around those?</li>
<li>It feels disappointing. After all, <code>MADV_FREE</code> was presumably added
because it allows this to be faster &ndash; and we all agree that given a
&ldquo;well-behaved&rdquo; <code>Read</code> implementation, it should be reasonable.</li>
<li>It can be expensive. sfackler pointed out that it is sometimes
common to &ldquo;over-provision&rdquo; your read buffers, such as creating a
16MB buffer, so as to avoid blocking. This is fairly cheap in
practice, but only thanks to optimizations (like <code>MADV_FREE</code>) that
allow that memory to be lazily allocated and so forth. If we start
writing a byte into every page of a 16MB buffer, you&rsquo;re going to
notice the difference.</li>
</ul>
<p>For these reasons, sfackler felt like <code>freeze</code> isn&rsquo;t the right answer
here. It might be a useful primitive for things like array bounds
checking, but it would be better if we could modify the <code>Read</code> trait
in such a way that we permit the use of &ldquo;unfrozen&rdquo; uninitialized
memory.</p>
<p>Incidentally, this is a topic we&rsquo;ve hit on in previous async
interviews.  cramertj and I talked about it, for example. My
own opinion has shifted &ndash; at first, I thought a freeze primitive was
obviously a good idea, but I&rsquo;ve come to agree with sfackler that it&rsquo;s
not the right solution here.</p>
<h2 id="fallback-and-efficient-interoperability">Fallback and efficient interoperability</h2>
<p>If we don&rsquo;t take the approach of adding a <code>freeze</code> primitive, then
this implies that we are going to have to extend the <code>Read</code> trait with
some kind of second method. Let&rsquo;s call it <code>read2</code> for short. And this
raises an interesting question: how are we going to handle backwards
compatibility?</p>
<p>In particular, <code>read2</code> is going to have a default, so that existing
impls of <code>Read</code> are not invalidated. And this default is going to have
to fall back to calling <code>read</code>, since that is the only method that we
can guarantee to exist. Since <code>read</code> requires a fully initialized
buffer, this will mean that <code>read2</code> will have to zero its buffer if it
may be uninitialized. This by itself is ok &ndash; it&rsquo;s no worse than today.</p>
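<p>A rough sketch of what such a default might look like, assuming <code>read2</code> takes a <code>&amp;mut [MaybeUninit&lt;u8&gt;]</code> (the real signature was still undecided). The trait <code>Read2</code> below is a hypothetical stand-in for <code>Read</code>, and <code>Repeat</code> is a made-up &ldquo;old&rdquo; impl that only provides <code>read</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::io;
use std::mem::MaybeUninit;

/// Hypothetical stand-in for `Read` with the extra method.
trait Read2 {
    fn read(&amp;mut self, buf: &amp;mut [u8]) -&gt; io::Result&lt;usize&gt;;

    /// Default: zero the whole buffer, then fall back to `read`.
    fn read2(&amp;mut self, buf: &amp;mut [MaybeUninit&lt;u8&gt;]) -&gt; io::Result&lt;usize&gt; {
        for slot in buf.iter_mut() {
            slot.write(0);
        }
        // SAFETY: every byte was just initialized, and `MaybeUninit&lt;u8&gt;`
        // has the same layout as `u8`.
        let init = unsafe {
            std::slice::from_raw_parts_mut(buf.as_mut_ptr() as *mut u8, buf.len())
        };
        self.read(init)
    }
}

/// An existing impl that only knows about `read`.
struct Repeat(u8);

impl Read2 for Repeat {
    fn read(&amp;mut self, buf: &amp;mut [u8]) -&gt; io::Result&lt;usize&gt; {
        buf.fill(self.0);
        Ok(buf.len())
    }
}

/// Calls the defaulted `read2` with a genuinely uninitialized buffer.
fn demo_fallback() -&gt; io::Result&lt;usize&gt; {
    let mut buf = [MaybeUninit::&lt;u8&gt;::uninit(); 16];
    Repeat(7).read2(&amp;mut buf)
}
</code></pre></div><p>Note that this default has no memory of what it zeroed: if it is called again on the same buffer, it zeroes everything again.</p>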
<p>The problem is that some of the solutions discussed in <a href="https://paper.dropbox.com/doc/MvytTgjIOTNpJAS6Mvw38">sfackler&rsquo;s
doc</a> can wind up having to zero the buffer multiple times,
depending on how things play out. And this could be a big performance
cost. That is definitely to be avoided.</p>
<h2 id="possible-solution-to-read-take-a-trait-object-and-not-a-buffer">Possible solution to read: Take a trait object, and not a buffer</h2>
<p>Another proposed solution, in fact the one described in <a href="https://github.com/tokio-rs/tokio/pull/1744">tokio-rs/tokio#1744</a>,
is to modify <code>read</code> so it takes a trait object (in the case of the <code>Read</code> trait,
we&rsquo;d have to add a new, defaulted method):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">read_buf</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">BufMut</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">io</span>::<span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>The idea here is that <code>BufMut</code> is a trait that lets you safely
access a potentially uninitialized set of buffers:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">BufMut</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">remaining_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">advance_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">cnt</span>: <span class="kt">usize</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">bytes_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">u8</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You might wonder why the definition takes a <code>&amp;mut dyn BufMut</code>, rather
than a <code>&amp;mut impl BufMut</code>. Taking <code>impl BufMut</code> would mean that the
code is specialized to the particular sort of buffer you are using, so
that would potentially be quite a bit faster. However, it would also
make <code>Read</code> not &ldquo;dyn-safe&rdquo;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, and that&rsquo;s a non-starter.</p>
<p>There are some nifty aspects to this proposal. One of them is that the
same trait can to some extent &ldquo;paper over&rdquo; vectorized writes, by
distributing the data written across buffers in a chain.</p>
<p>But there are some downsides. Perhaps most important is that requiring
virtual calls to write into the buffer could be a significant
performance hazard. Thus far, measurements don&rsquo;t suggest that, but it
seems like a cost that can only be recovered by heroic compiler
optimizations, and that&rsquo;s the kind of thing we prefer to avoid.</p>
<p>Moreover, the ability to be generic over vectorized writes may not be
as useful as you might think. Often, the caller wants to know whether
the underlying <code>Read</code> supports vectorized writes, and it would operate
quite differently in that case. Therefore, it doesn&rsquo;t really hurt to
have two <code>read</code> methods, one for normal and one for vectorized writes.</p>
<h2 id="variant-use-a-struct-instead-of-a-trait">Variant: use a struct, instead of a trait</h2>
<p>The variant that sfackler prefers is to replace the <code>BufMut</code> trait
with a struct.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> The API of this struct would be fairly similar
to the trait above, except that it wouldn&rsquo;t make much attempt to unify
vectorized and non-vectorized writes.</p>
<p>Basically, we&rsquo;d have a struct that encapsulates a &ldquo;partially
initialized slice of bytes&rdquo;. You could create such a struct from a
standard slice, in which case all things are initialized, or you can
create it from a slice of &ldquo;maybe initialized&rdquo; bytes (e.g., <code>&amp;mut [MaybeUninit&lt;u8&gt;]</code>). There can also be convenience methods to create a
<code>BufMut</code> that refers to the uninitialized tail of bytes from a <code>Vec</code>
(i.e., pointing into the vector&rsquo;s internal buffer).</p>
<p>The safe methods of the <code>BufMut</code> API would permit</p>
<ul>
<li>writing to the buffer, which will track the bytes that were initialized;</li>
<li>getting access to a slice, but only one that is guaranteed to be initialized.</li>
</ul>
<p>There would be unsafe methods for getting access to memory that may be
uninitialized, or for asserting that you have initialized a big swath
of bytes (e.g., by handing the buffer off to the kernel to get written
to).</p>
<p>The buffer has state: it can track what has been initialized. This
means that any given part of the buffer will get zeroed at most
once. This ensures that fallback from the new <code>read2</code> method to the
old <code>read</code> method is reasonably efficient.</p>
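<p>Here is a minimal sketch of such a stateful buffer, to show how the bookkeeping yields &ldquo;zeroed at most once&rdquo;. The type <code>PartialBuf</code> and its methods are my own invention for this example and should not be read as the proposed stdlib API.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem::MaybeUninit;

/// Hypothetical &#34;partially initialized slice of bytes&#34;.
/// Invariant: the first `init` bytes of `buf` are initialized, and `filled &lt;= init`.
struct PartialBuf&lt;&#39;a&gt; {
    buf: &amp;&#39;a mut [MaybeUninit&lt;u8&gt;],
    filled: usize,
    init: usize,
}

impl&lt;&#39;a&gt; PartialBuf&lt;&#39;a&gt; {
    fn uninit(buf: &amp;&#39;a mut [MaybeUninit&lt;u8&gt;]) -&gt; Self {
        PartialBuf { buf, filled: 0, init: 0 }
    }

    /// Safe write: copies as much of `data` as fits and records it as filled.
    fn write(&amp;mut self, data: &amp;[u8]) -&gt; usize {
        let n = data.len().min(self.buf.len() - self.filled);
        for (slot, &amp;b) in self.buf[self.filled..].iter_mut().zip(&amp;data[..n]) {
            slot.write(b);
        }
        self.filled += n;
        self.init = self.init.max(self.filled);
        n
    }

    /// Safe view of the bytes written so far.
    fn filled_bytes(&amp;self) -&gt; &amp;[u8] {
        // SAFETY: bytes below `filled` are initialized (see the invariant).
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr() as *const u8, self.filled) }
    }

    /// Zeroes only the bytes not yet initialized; returns how many were zeroed.
    fn ensure_init(&amp;mut self) -&gt; usize {
        let zeroed = self.buf.len() - self.init;
        for slot in &amp;mut self.buf[self.init..] {
            slot.write(0);
        }
        self.init = self.buf.len();
        zeroed
    }
}

/// Writes 3 bytes into an 8-byte buffer, then asks for initialization twice.
fn demo_zero_once() -&gt; (usize, usize, Vec&lt;u8&gt;) {
    let mut storage = [MaybeUninit::&lt;u8&gt;::uninit(); 8];
    let mut buf = PartialBuf::uninit(&amp;mut storage);
    buf.write(b&#34;abc&#34;);
    let first = buf.ensure_init();
    let second = buf.ensure_init();
    (first, second, buf.filled_bytes().to_vec())
}
</code></pre></div><p>The first <code>ensure_init</code> call zeroes the 5 bytes that were never written; the second zeroes nothing, because the buffer remembers that it is already initialized. A fallback to the old <code>read</code> would call something like <code>ensure_init</code> before handing out a plain <code>&amp;mut [u8]</code>.</p>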
<h2 id="sync-vs-async-how-to-proceed">Sync vs async, how to proceed</h2>
<p>So, given the above thoughts, how should we proceed with <code>AsyncRead</code>?
sfackler felt that the question of how to handle uninitialized output
buffers was basically &ldquo;orthogonal&rdquo; from the question of whether and
when to add <code>AsyncRead</code>. In other words, sfackler felt that the <code>AsyncRead</code>
and <code>Read</code> traits should mirror one another, which means that we could
add <code>AsyncRead</code> now, and then add a solution for uninitialized memory
later &ndash; or we could do the reverse order.</p>
<p>One minor question has to do with defaults. Currently the <code>Read</code> trait
requires an implementation of <code>read</code> &ndash; any new method (<code>read_uninit</code>
or whatever) will therefore have to have a default implementation that
invokes <code>read</code>.  But this is sort of the wrong incentive: we&rsquo;d prefer
if users implemented <code>read_uninit</code>, and implemented <code>read</code> in terms of
the new method. We could conceivably reverse the defaults for the
<code>AsyncRead</code> trait to this preferred style. Alternatively, sfackler
noted that we could make <em>both</em> <code>read</code> and <code>read_uninit</code> have a
default implementation, each implemented in terms of the other. In
this case, users would have to implement one or the other
(implementing <em>neither</em> would lead to an infinite loop, and we would
likely want a lint for that case).</p>
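<p>The &ldquo;both methods have defaults&rdquo; idea can be sketched as follows. To keep the example in safe code, I am assuming a simplified signature in which <code>read_uninit</code> appends up to <code>max</code> bytes to a <code>Vec&lt;u8&gt;</code>; the trait <code>ReadBoth</code> and the types <code>Old</code> and <code>New</code> are hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::io;

/// Hypothetical trait where each method defaults to the other one.
/// Implementing neither would recurse forever.
trait ReadBoth {
    /// Default in terms of `read_uninit`.
    fn read(&amp;mut self, buf: &amp;mut [u8]) -&gt; io::Result&lt;usize&gt; {
        let mut tmp = Vec::with_capacity(buf.len());
        let n = self.read_uninit(&amp;mut tmp, buf.len())?;
        buf[..n].copy_from_slice(&amp;tmp[..n]);
        Ok(n)
    }

    /// Default in terms of `read`; this direction pays for zeroing.
    fn read_uninit(&amp;mut self, out: &amp;mut Vec&lt;u8&gt;, max: usize) -&gt; io::Result&lt;usize&gt; {
        let start = out.len();
        out.resize(start + max, 0);
        let n = self.read(&amp;mut out[start..])?;
        out.truncate(start + n);
        Ok(n)
    }
}

/// Existing-style impl: only `read`.
struct Old;

impl ReadBoth for Old {
    fn read(&amp;mut self, buf: &amp;mut [u8]) -&gt; io::Result&lt;usize&gt; {
        buf.fill(1);
        Ok(buf.len())
    }
}

/// Preferred-style impl: only `read_uninit`.
struct New;

impl ReadBoth for New {
    fn read_uninit(&amp;mut self, out: &amp;mut Vec&lt;u8&gt;, max: usize) -&gt; io::Result&lt;usize&gt; {
        out.extend(std::iter::repeat(2u8).take(max));
        Ok(max)
    }
}

/// Calls the method that each type did *not* implement itself.
fn demo_both() -&gt; io::Result&lt;(Vec&lt;u8&gt;, [u8; 3])&gt; {
    let mut v = Vec::new();
    Old.read_uninit(&amp;mut v, 2)?;
    let mut a = [0u8; 3];
    New.read(&amp;mut a)?;
    Ok((v, a))
}
</code></pre></div><p>Both call paths work, but nothing in the type system stops an impl from overriding neither method, which is why a lint would be needed.</p>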
<p>We also discussed what it would mean if tokio adopted its own
<code>AsyncRead</code> trait that diverged from std. While not ideal, sfackler
felt like it wouldn&rsquo;t be that big a deal either way, since it ought to
be possible to efficiently interconvert between the two. The main
constraint is having some kind of stateful entity that can remember
the amount of uninitialized data, thus preventing the inefficient
fallback behavior.</p>
<h2 id="is-the-ability-to-use-uninitialized-memory-even-a-problem">Is the ability to use uninitialized memory even a problem?</h2>
<p>We spent a bit of time at the end discussing how one could gain data
on this problem. There are two things that would be nice to know.</p>
<p>First, how big is the performance impact from zeroing? Second, how
ergonomic is the proposed API to use in practice?</p>
<p>Regarding the performance impact, I asked the same question on
<a href="https://github.com/tokio-rs/tokio/pull/1744">tokio-rs/tokio#1744</a>, and I did get back some interesting results,
which I summarized in a hackmd at the time. In
short, hyper&rsquo;s benchmarks show a fairly sizable impact, with
uninitialized data getting speedups<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> of 1.3-1.5x. Other
benchmarks though are much more mixed, showing either no difference or
small differences on the order of 2%. Within the stdlib, we found
about a 7% impact on microbenchmarks (#26950).</p>
<p>Still, sfackler raised another interesting data point (both <a href="https://github.com/tokio-rs/tokio/pull/1744#issuecomment-553179399">on the
thread</a> and in our call). He was pointing out <a href="https://github.com/rust-lang/rust/pull/23820">#23820</a>, a PR which
rewrote <a href="https://doc.rust-lang.org/std/io/trait.Read.html#method.read_to_end"><code>read_to_end</code></a> in the stdlib. The older implementation was
simple and obvious, but suffered from massive performance cliffs
related to the need to zero buffers. The newer implementation is fast,
but much more complex. Using one of the APIs described above would
permit us to avoid this complexity.</p>
<p>Regarding ergonomics, as ever, that&rsquo;s a tricky thing to judge. It&rsquo;s
hard to do better than prototyping as well as offering the API on
nightly for a time, so that people can try it out and give feedback.</p>
<p>Having the API on nightly would also help us to make branches of
frameworks like tokio and async-std so we can do bigger measurements.</p>
<h2 id="higher-levels-of-interoperability">Higher levels of interoperability</h2>
<p>sfackler and I talked a bit about what the priorities should be beyond
<code>AsyncRead</code>. One of the things we talked about is whether there is a
need for higher-level traits or libraries that expose more custom
information beyond &ldquo;here is how to read data&rdquo;. One example that has
come up from time to time is the need to know, for example, the URL or
other information associated with a request.</p>
<p>Another example might be the role of crates like <code>http</code>, which aims to
define Rust types for things like HTTP header codes that are fairly
standard. These would be useful types to share across all HTTP
implementations and libraries, but will we be able to achieve that
sort of sharing without offering the crate as part of the stdlib (or
at least part of the Rust org)? I don&rsquo;t think we had a definitive
answer here.</p>
<h2 id="priorities-beyond-async-read">Priorities beyond async read</h2>
<p>We next discussed what other priorities the Rust org might have
around Async I/O. For sfackler, the top items would be</p>
<ul>
<li>better support for GATs and async fn in traits;</li>
<li>some kind of generator or syntactic support for streams;</li>
<li>improved diagnostics, particularly around send/sync.</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>sfackler and I focused quite heavily on the <code>AsyncRead</code> trait
and how to manage uninitialized memory. I think that it would be
fair to summarize the main points of our conversation as:</p>
<ul>
<li>we should add <code>AsyncRead</code> to the stdlib and have it mirror <code>Read</code>;</li>
<li>in general, it makes sense for the synchronous and asynchronous
versions of the traits to be analogous;</li>
<li>we should extend both traits with a method that takes a <code>BufMut</code>
struct to manage uninitialized output buffers, as the other options
all have a crippling downside;</li>
<li>we should extend both traits with a &ldquo;do you support vectorized output?&rdquo;
callback as well;</li>
<li>beyond that, the Rust org should focus heavily on diagnostics for
async/await, but streams and async fns in traits would be great
too. =)</li>
</ul>
<h2 id="comments">Comments?</h2>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167/">thread on the Rust users forum</a> for this series.</p>
<h2 id="appendix-vectorized-reads-and-writes">Appendix: Vectorized reads and writes</h2>
<p>There is one minor subthread that I&rsquo;ve skipped over &ndash; vectorized
reads and writes. I skipped it in the blog post because this problem
is somewhat simpler. The standard <code>read</code> interface takes a single
buffer to write the data into. But a <em>vectorized</em> interface takes a
series of buffers &ndash; if there is more data than will fit in the first
one, then the data will be written into the second one, and so on
until we run out of data or buffers. Vectorized reads and writes can
be much more efficient in some cases.</p>
<p>Unfortunately, not all readers support vectorized reads. For that
reason, the &ldquo;vectorized read&rdquo; method has a fallback: by default, it
just calls the normal read method using the first non-empty buffer in
the list. This is functionally equivalent, but obviously it could be a lot
less efficient &ndash; imagine that I have supplied one buffer of size 1K
and one buffer of size 16K. The default vectorized read method will
just always use that single 1K buffer, which isn&rsquo;t great &ndash; but still,
not much to be done about it. Some readers just cannot support
vectorized reads.</p>
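<p>To make the fallback concrete, here is a minimal sketch using the synchronous <code>std::io::Read</code> trait, whose default <code>read_vectored</code> behaves as described above. The <code>PlainReader</code> type and the <code>read_once_vectored</code> helper are hypothetical names made up for this illustration, and the buffers are scaled down from 1K/16K to 4 and 16 bytes:</p>

```rust
use std::io::{self, IoSliceMut, Read};

/// Hypothetical reader that only implements `read`, so it inherits the
/// default `read_vectored` from `std::io::Read`.
struct PlainReader<'a> {
    data: &'a [u8],
}

impl<'a> Read for PlainReader<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Copy as much as fits, then advance past the consumed bytes.
        let n = buf.len().min(self.data.len());
        buf[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        Ok(n)
    }
}

/// Issues a single vectored read into a small and a large buffer and
/// reports how many bytes came back.
fn read_once_vectored<R: Read>(reader: &mut R) -> io::Result<usize> {
    let mut small = [0u8; 4];
    let mut large = [0u8; 16];
    let mut bufs = [IoSliceMut::new(&mut small), IoSliceMut::new(&mut large)];
    reader.read_vectored(&mut bufs)
}
```

<p>With ten bytes of input, <code>PlainReader</code> should fill only the 4-byte buffer per call, whereas a reader with a real vectored implementation (such as a plain <code>&amp;[u8]</code>) can spill into the second buffer in the same call.</p>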
<p>The problem here then is that it would be nice if there were some way
to <em>detect</em> when a reader supports vectorized reads. This would allow
the caller to choose between a &ldquo;vectorized&rdquo; call path, where it tries
to supply many buffers, or a single-buffer call path, where it just
allocates a big buffer.</p>
<p>Apparently hyper will do this today, but using a heuristic: if a call
to the vectorized read method returns <em>just enough</em> data to fit in the
first buffer, hyper guesses that in fact vectorized reads are not
supported, and switches dynamically to the &ldquo;one big buffer&rdquo; strategy.
(Neat.)</p>
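<p>A rough sketch of that kind of heuristic might look like the following. This is not hyper&rsquo;s actual code: the <code>ReadStrategy</code> enum and <code>next_strategy</code> function are invented for illustration, and I am assuming that &ldquo;just enough&rdquo; means the read filled the first buffer exactly:</p>

```rust
/// Hypothetical strategy switch; the names are illustrative only.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ReadStrategy {
    /// Keep supplying several buffers per read.
    Vectored,
    /// Give up on vectored reads and supply one large buffer.
    OneBigBuffer,
}

/// Decides which strategy to use for the next read, given the length of
/// the first buffer and the number of bytes the last read returned.
fn next_strategy(
    current: ReadStrategy,
    first_buf_len: usize,
    bytes_read: usize,
) -> ReadStrategy {
    match current {
        // A read that exactly fills the first buffer looks like the
        // single-buffer fallback, so guess that vectoring is unsupported.
        ReadStrategy::Vectored if bytes_read == first_buf_len => ReadStrategy::OneBigBuffer,
        // Otherwise keep whatever we were doing.
        other => other,
    }
}
```

<p>The guess can of course be wrong (a vectored reader may happen to return exactly that many bytes), which is why a direct way to ask the reader would be preferable.</p>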
<p>There is perhaps a second, more ergonomic issue: since the vectorized
read method has a default implementation, it is easy to forget to
implement it, even if you would have been able to do so.</p>
<p>In any case, this problem is relatively easy to solve: we basically
need to add a new method like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">supports_vectorized_reads</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span>
</span></span></code></pre></div><p>to the trait.</p>
<p>The matter of deciding whether or not to supply a default is a bit
trickier.  If you don&rsquo;t supply a default, then everybody has to
implement it, even if they just want the default behavior. But if you
<em>do</em>, people who wished to implement the method may forget to do so &ndash;
this is particularly unfortunate for readers that are wrapping another
reader, which is a pretty common case.</p>
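<p>The wrapper pitfall can be sketched as follows. The <code>VectoredHint</code> trait here is a hypothetical stand-in for &ldquo;the read trait plus the new method&rdquo;, with a default of <code>false</code> assumed; the three types are likewise invented for the example:</p>

```rust
/// Hypothetical trait carrying only the capability query, with a default.
trait VectoredHint {
    fn supports_vectorized_reads(&self) -> bool {
        false
    }
}

/// Stands in for a reader with native vectored support.
struct Socket;

impl VectoredHint for Socket {
    fn supports_vectorized_reads(&self) -> bool {
        true
    }
}

/// A wrapper whose author forgot to override the method: it silently
/// inherits the default and hides the inner reader's capability.
#[allow(dead_code)]
struct Forgetful<R>(R);

impl<R> VectoredHint for Forgetful<R> {}

/// A wrapper that remembers to forward the query to the inner reader.
struct Forwarding<R>(R);

impl<R: VectoredHint> VectoredHint for Forwarding<R> {
    fn supports_vectorized_reads(&self) -> bool {
        self.0.supports_vectorized_reads()
    }
}
```

<p>Both wrappers compile without complaint, which is exactly the problem: nothing flags that <code>Forgetful</code> is reporting the wrong answer. Without a default, the empty <code>impl</code> would be a compile error instead.</p>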
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Most folks say &ldquo;object-safe&rdquo; here, but I&rsquo;m trying to shift our terminology to talk more about the dyn keyword.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Carl Lerche proposed something similar on the tokio thread <a href="https://github.com/tokio-rs/tokio/pull/1744#issuecomment-553575438">here</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>I am defining a &ldquo;speedup&rdquo; here as the ratio <code>U/Z</code>, where <code>U</code> and <code>Z</code> are the throughput with uninitialized and zeroed buffers respectively.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Async Interview #4: Florian Gilcher</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/01/13/async-interview-4-florian-gilcher/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/01/13/async-interview-4-florian-gilcher/</id><published>2020-01-13T00:00:00+00:00</published><updated>2020-01-13T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello! For the latest <a href="http://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/">async interview</a>, I spoke with Florian Gilcher
(<a href="https://github.com/skade/">skade</a>). Florian is involved in the <a href="https://async.rs">async-std</a> project, but he&rsquo;s
also one of the founders of <a href="https://ferrous-systems.com/">Ferrous Systems</a>, a Rust consulting firm
that also does a lot of trainings. In that capacity, he&rsquo;s been
teaching people to use async Rust since Rust&rsquo;s 1.0 release.</p>
<h3 id="video">Video</h3>
<p>You can watch the <a href="https://youtu.be/Ezwd1vKSfCo">video</a> on YouTube. I&rsquo;ve also embedded a copy here
for your convenience:</p>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/Ezwd1vKSfCo" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>
<p>One note: something about our setup meant that I was hearing a lot of
echo. I think you can sometimes hear it in the recording, but not
nearly as bad as it was live. So if I seem a bit spacey, or take very
long pauses, you might know the reason why!</p>
<h3 id="prioritize-stability-readwrite-traits">Prioritize stability, read/write traits</h3>
<p>The first thing we discussed was some background on async-std
itself. From there we started talking about what the Rust org ought to
prioritize. Florian felt like having stable, uniform <code>AsyncRead</code> and
<code>AsyncWrite</code> traits would be very helpful, as most applications are
interested in having access to a &ldquo;readable/writable thing&rdquo; but don&rsquo;t
care that much where the bytes are coming from.</p>
<p>He felt that <code>Stream</code>, while useful, might be somewhat lower priority.
The main reason was that while streams are useful, in many of the
applications that he&rsquo;s seen, there wasn&rsquo;t as much need to be <em>generic</em>
over a stream. Of course, having a standard <code>Stream</code> trait would still
be of use, and would enable libraries as well, so it&rsquo;s not an argument
not to do it, just a question of how to prioritize.</p>
<h3 id="prioritize-diagnostics-perhaps-even-more">Prioritize diagnostics perhaps even more</h3>
<p>Although we&rsquo;ve done a lot of work on it, there continues to be a need
for improved error diagnostics. This kind of detailed ergonomics work may indeed
be the highest priority overall.</p>
<p>(A quick plug for the <a href="https://rust-lang.github.io/compiler-team/working-groups/async-await/">async await working
group</a>,
which has been steadily making progress here. Big thanks especially to
tmandry, who has been running the triage meetings lately, but also (in
no particular order) csmoe, davidtwco, gilescope, and centril &ndash; and
perhaps others I&rsquo;ve forgotten (sorry!).)</p>
<h3 id="levels-of-stability-and-the-futures-crate">Levels of stability and the futures crate</h3>
<p>We discussed the futures crate for a while. In particular, the
question of whether we should be &ldquo;stabilizing&rdquo; traits by moving them
into the standard library, or whether we can use the futures crate as
a &ldquo;semi-stable&rdquo; home. There are obviously advantages either way.</p>
<p>On the one hand, there is no clearer signal for stability than adding
something to libstd. On the other, the futures crate facade gives a
&ldquo;finer grained&rdquo; ability to talk about semver.</p>
<p>One thing Florian noted is that the futures crate itself, although it
has evolved a lot, has always maintained an internal consistency,
which is good.</p>
<p>One other point Florian emphasized is that people really want to be
building applications, so in some way the most important thing is to
be moving towards stability, so they can avoid worrying about the sand
shifting under their feet.</p>
<h3 id="deprioritize-attached-and-detached-streams">Deprioritize: Attached and detached streams</h3>
<p>I asked Florian how much he thought it made sense to wait on things
like streams until the GAT story is straightened out, so that we might
have support for &ldquo;attached&rdquo; streams. He felt like it would be better
to move forward with what we have now, and consider extensions
later.</p>
<p>He noted an occasional tendency to try and create the single, perfect
generic abstraction that can handle everything &ndash; while this can be
quite elegant, it can sometimes also lead to really confusing
interfaces that are complex to use.</p>
<h3 id="deprioritize-special-syntax-for-streams">Deprioritize: Special syntax for streams</h3>
<p>I asked about syntactic support for generators, but Florian felt that
it was too early to prioritize that, and that it would be better to
focus first on the missing building blocks.</p>
<h3 id="the-importance-of-building-and-discovering-patterns">The importance of building and discovering patterns</h3>
<p>Florian felt that we&rsquo;re now in a stage where we&rsquo;re transitioning a
little. Until now, we&rsquo;ve been tinkering about with the most primitive
layers of the async ecosystem, such as the <code>Future</code> trait, async-await
syntax, etc. As these primitives are stabilized, we&rsquo;re going to see a
lot more tinkering with the &ldquo;next level up&rdquo; of patterns. These might
be questions like &ldquo;how do I stop a stream?&rdquo;, or &ldquo;how do I construct my app?&rdquo;.
But it&rsquo;s going to be hard for people to focus on these higher-level patterns
(and in particular to find new, innovative solutions to them) until the
primitives even out.</p>
<p>As these patterns evolve, they can be extracted into crates and types
and shared and reused in many contexts. He gave the example of the
<a href="https://docs.rs/async-task/newest/async_task/">async-task</a> crate, which extracts out quite a bit of the complexity
of managing allocation of an async task. This allows other runtimes to reuse that
fairly standard logic. (Editor&rsquo;s note: If you haven&rsquo;t seen async-task,
you should check it out, it&rsquo;s quite cool.)</p>
<h3 id="odds-and-ends">Odds and ends</h3>
<p>We then discussed a few other features and how much to prioritize them.</p>
<p><strong>Async fn in traits.</strong> Don&rsquo;t rush it, the async-trait crate is a
pretty reasonable practice and we can probably &ldquo;get by&rdquo; with that for
quite a while.</p>
<p><strong>Async closures.</strong> These can likely wait too, but they would be
useful for stabilizing convenience combinators. On the other hand,
those combinators often come attached to the base libraries you&rsquo;re
using.</p>
<h3 id="communicating-over-the-futures-crate">Communicating over the futures crate</h3>
<p>Returning to the futures crate, I raised the question of how best to
help convey its design and stability requirements. I&rsquo;ve noticed that there
is a lot of confusion around its various parts and how they are meant
to be used.</p>
<p>Florian felt like one thing that might be helpful is to break apart
the facade pattern a bit, to help people see the smaller
pieces. Currently the futures crate seems a bit like a monolithic
entity. Maybe it would be useful to give more examples of what each
part is and how it can be used in isolation, or the overall best
practices.</p>
<h3 id="learning">Learning</h3>
<p>Finally, I posed to Florian the question of how we can help people to learn
async coding. I&rsquo;m very keen on the way that Rust manages to avoid
hard-coding a single runtime, but one of the challenges that comes
with that is that it is hard to teach people how to use futures
without referencing a runtime.</p>
<p>We didn&rsquo;t solve this problem (shocker that), but we did talk some
about the general value in having a system that doesn&rsquo;t make all the
choices for you. To be quite honest I remember that at this point I
was getting very tired. I haven&rsquo;t listened back to the video because
I&rsquo;m too afraid, but hopefully I at least used complete sentences. =)</p>
<p>One interesting idea that Florian raised is that it might be really
useful for people to create a &ldquo;learning runtime&rdquo; that is oriented not
at performance but at helping people to understand how futures work or
their own applications. Such a runtime might gather a lot of data, do
tracing, or otherwise help in visualizing. Reading back over my notes,
I personally find that idea sort of intriguing, particularly if the
focus is on helping people learn how futures work early on &ndash; i.e., I
don&rsquo;t think we&rsquo;re anywhere close to the point where you could take
a production app written against async-std and then have it use this
debugging runtime. But I could imagine having a &ldquo;learner&rsquo;s runtime&rdquo;
that you start with initially, and then once you&rsquo;ve got a feel for
things, you can move over to more complex runtimes to get better
performance.</p>
<h3 id="conclusion">Conclusion</h3>
<p>I think the main points from the conversation were:</p>
<ul>
<li>Diagnostics and documentation remain of very high importance. We
shouldn&rsquo;t get all dazzled with new, shiny things &ndash; we have to keep
working on polish.</li>
<li>Beyond that, though, we should be working to stabilize building
blocks so as to give more room for the ecosystem to flourish and
develop. The <code>AsyncRead/AsyncWrite</code> traits, along with <code>Stream</code>,
seem like plausible candidates.
<ul>
<li>We shouldn&rsquo;t necessarily try to make those traits be as generic as
possible, but instead focus on building something usable and
simple that meets the most important needs right now.</li>
</ul>
</li>
<li>We need to give time for people to develop patterns and best
practices, and in particular to figure out how to &ldquo;capture&rdquo; them as
APIs and crates.  This isn&rsquo;t really something that the <em>Rust
organization</em> can do, it comes from the ecosystem, by library and
application developers.</li>
</ul>
<h2 id="comments">Comments?</h2>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167/">thread on the Rust users forum</a> for this series.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Towards a Rust foundation</title><link href="https://smallcultfollowing.com/babysteps/blog/2020/01/09/towards-a-rust-foundation/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2020/01/09/towards-a-rust-foundation/</id><published>2020-01-09T00:00:00+00:00</published><updated>2020-01-09T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/02/rust-2020/">my #rust2020 blog post</a>, I mentioned rather off-handedly
that I think the time has come for us to talk about forming a Rust
foundation. I wanted to come back to this topic and talk in more
detail about what I think a Rust foundation might look like. And,
since I don&rsquo;t claim to have the final answer to that question by any
means, I&rsquo;d also like to talk about <strong>how</strong> I think we should have this
conversation going forward.</p>
<h3 id="hat-tip">Hat tip</h3>
<p>Before going any further, I want to say that most of the ideas in this
post arose from conversations with others. In particular, Florian
Gilcher, Ryan Levick, Josh Triplett, Ashley Williams, and I have been
chatting pretty regularly, and this blog post generally reflects the
consensus that we seemed to be arriving at (though perhaps they will
correct me). Thanks also to Yehuda Katz and Till Schneidereit for lots
of detailed discussions.</p>
<h3 id="why-do-we-want-a-rust-foundation">Why do we want a Rust foundation?</h3>
<p>I think this is in many ways the most important question for us to
answer: what is it that we hope to achieve by creating a Rust
foundation, anyway?</p>
<p>To me, there are two key goals:</p>
<ul>
<li>to help clarify Rust&rsquo;s status as an independent project, and thus
encourage investment from more companies;</li>
<li>to alleviate some practical problems caused by Rust not having a
&ldquo;legal entity&rdquo; nor a dedicated bank account.</li>
</ul>
<p>There are also some anti-goals. Most notably:</p>
<ul>
<li>the foundation should not replace the existing Rust teams as a
decision-making apparatus.</li>
</ul>
<p>The role of the foundation is to complement the teams and to help us
in achieving our goals. It is not to set the goals themselves.</p>
<h3 id="start-small-and-iterate">Start small and iterate</h3>
<p>You&rsquo;ll notice that I&rsquo;ve outlined a fairly narrow role for the
foundation. This is no accident. When designing a foundation, just as
when designing many other things, I think it makes sense for us to
move carefully, a step at a time.</p>
<p>We should try to address immediate problems that we are facing and
then give those changes some time to &ldquo;sink in&rdquo;. We should also take
time to experiment with some of the various funding possibilities that
are out there (some of which I&rsquo;ll discuss later on). Once we&rsquo;ve had
some more experience, it should be easier for us to see which next
steps make sense.</p>
<p>Another reason to start small is being able to move more quickly. I&rsquo;d
like to see us set up a foundation like the one I am discussing as soon
as this year.</p>
<h3 id="goal-1-clarifying-rusts-status-as-an-independent-project">Goal #1: Clarifying Rust&rsquo;s status as an independent project</h3>
<p>So let&rsquo;s talk a bit more about the two goals that I set forth for a
Rust foundation. The first was to clarify Rust&rsquo;s status as an
independent project. In some sense, this is nothing new. Mozilla has
from the get-go attempted to create an independent governance
structure and to solicit involvement from other companies, because we
know this makes Rust a better language for everyone.</p>
<p>Unfortunately, there is sometimes a lingering perception that Mozilla
&ldquo;owns&rdquo; Rust, which can discourage companies from getting invested, or
create the perception that there is no need to support Rust since
Mozilla is footing the bill. Establishing a foundation will make
official what has been true in practice for a long time: that Rust is
an independent project.</p>
<p>We have also heard a few times from companies, large and small, who
would like to support Rust financially, but right now there is no
clear way to do that. Creating a foundation creates a place where that
support can be directed.</p>
<h3 id="mozilla-wants-to-support-rust-just-not-alone">Mozilla wants to support Rust&hellip; just not alone</h3>
<p>Now, establishing a Rust foundation doesn&rsquo;t mean that Mozilla plans to
step back. After all, Mozilla has a lot riding on Rust, and Rust is
playing an increasingly important role in how Mozilla builds our
products. What we really want is a scenario where other companies join
Mozilla in supporting Rust, letting us do much more.</p>
<p>In truth, this has already started to happen. For example, just this
year <a href="https://internals.rust-lang.org/t/update-on-the-ci-investigation/10056/9?u=nikomatsakis">Microsoft started sponsoring Rust&rsquo;s CI costs</a> and
<a href="https://aws.amazon.com/blogs/opensource/aws-sponsorship-of-the-rust-project/">Amazon is paying Rust&rsquo;s S3 bills</a>. In fact, we recently
added a <a href="https://www.rust-lang.org/sponsors">corporate sponsors</a> page to the Rust web site to
acknowledge the many companies that are starting to support Rust.</p>
<h3 id="goal-2-alleviating-some-practical-difficulties">Goal #2: Alleviating some practical difficulties</h3>
<p>While the Rust project has its own governance system, it has never had
its own distinct legal entity. That role has always been played by
Mozilla. For example, Mozilla owns the Rust trademarks, and Mozilla is
the legal operator for services like crates.io. This means that
Mozilla is (in turn) responsible for ensuring that DMCA requests
against those services are properly managed and so forth. For a long
time, this arrangement worked out quite well for Rust. Mozilla Legal,
for example, provided excellent help in drafting Rust&rsquo;s trademark
agreements and coached us through how to handle DMCA takedown requests
(which thankfully have arisen quite infrequently).</p>
<p>Lately, though, the Rust project has started to hit the limits of what
Mozilla can reasonably support. One common example that arises is the
need to have some entity that can legally sign contracts &ldquo;for the Rust
project&rdquo;. For example, we wished recently to sign up for Github&rsquo;s
<a href="https://developer.github.com/partnerships/token-scanning/">Token Scanning</a> program, but we weren&rsquo;t able to figure out who ought
to sign the contract.</p>
<p>Is token scanning by itself a burning problem? No. We could probably
work out a solution for it, and for other similar cases that have
arisen, such as deciding who should sign Rust binaries. But it might
be a sign that it is time for the Rust project to have its own legal
entity.</p>
<h3 id="another-practical-difficulty-rust-has-no-bank-account">Another practical difficulty: Rust has no bank account</h3>
<p>Another example of a &ldquo;practical difficulty&rdquo; that we&rsquo;ve encountered is
that Rust has no bank account. This makes it harder for us to
arrange for joint sponsorship and support of events and other programs
that the Rust project would like to run. The most recent example is
the Rust All Hands. Whereas in the past Mozilla has paid for the
venue, catering, and much of the airfare by itself, this year we are
trying to &ldquo;share the load&rdquo; and have multiple companies provide
sponsorship. However, this requires a bank account to collect and pool
funds. We have solved the problem for this year, but it would be
easier if the Rust organization had a bank account of its own. I
imagine we would also make use of a bank account to fund other sorts
of programs, such as Increasing Rust&rsquo;s Reach.</p>
<h3 id="on-paying-people-and-contracting">On paying people and contracting</h3>
<p>One area where I think we should move slowly is on the topic of
employing people and hiring contractors. As a practical matter, the
foundation is probably going to want to employ some people. For
example, I suspect we need an &ldquo;operations manager&rdquo; to help us keep the
wheels turning (this is already a challenge for the core team, and
it&rsquo;s only going to get worse as the project grows). We may also want
to do some limited amount of contracting for specific purposes (e.g.,
to pay for someone to run a program like Increasing Rust&rsquo;s Reach, or
to help do data crunching on the Rust survey).</p>
<h3 id="the-rust-foundation-should-not-hire-developers-at-least-to-start">The Rust foundation should not hire developers, at least to start</h3>
<p>But I don&rsquo;t think the Rust foundation should do anything like hiring
full-time developers, at least not to start. I would also avoid trying
to manage larger contracts to hack on rustc. There are a few reasons
for this, but the biggest one is simply that it is
<strong>expensive</strong>. Funding that amount of work will require a significant
budget, which will require significant fund-raising.</p>
<p>Managing a large budget, as well as employees, will also require more
superstructure. If we hire developers, who decides what they should
work on?  Who decides when it&rsquo;s time to hire? Who decides when it&rsquo;s
time to <em>fire</em>?</p>
<p>This is a bit difficult: on the one hand, I think there is a strong
need for more people to get paid for their work on Rust. On the other
hand, I am not sure a foundation is the right institution to be paying
them; even if it were, it seems clear that we don&rsquo;t have enough
experience to know how to answer the sorts of difficult questions that
will arise as a result. Therefore, I think it makes sense to fall back
on the approach to &ldquo;start small and iterate&rdquo; here. Let&rsquo;s create a
foundation with a limited scope and see what difference it makes
before we make any further decisions.</p>
<h3 id="some-other-things-the-foundation-wouldnt-do">Some other things the foundation wouldn&rsquo;t do</h3>
<p>I think there are a variety of other things that a hypothetical
foundation should not do, at least not to start. For example, I think
the foundation should not pay for local meetups nor sponsor Rust
conferences. Why?  Well, for one thing, it&rsquo;ll be hard for us to come
up with criteria on when to supply funds and when not to. For another,
both meetups and conferences I think will do best if they can forge
strong relationships with companies directly.</p>
<p>However, even if there are things that the Rust foundation wouldn&rsquo;t
fund or do directly, I think it makes a lot of sense to collect a list
of the kinds of things it <em>might</em> do. If nothing else, we can try to
offer suggestions for where to find funding or obtain support, or
perhaps offer some lightweight &ldquo;match-making&rdquo; role.</p>
<h3 id="we-should-strive-to-have-many-kinds-of-rust-sponsorship">We should strive to have many kinds of Rust sponsorship</h3>
<p>Overall, I am nervous about a situation in which a Rust Foundation
comes to have a kind of &ldquo;monopoly&rdquo; on supporting the Rust project or
Rust-flavored events. I think it&rsquo;d be great if we can encourage a
wider variety of setups. First and foremost, I&rsquo;d like to see more
companies that use Rust hiring people whose job description is to
support the Rust project itself (at least in part). But I think it
could also work to create &ldquo;trade associations&rdquo; where multiple
companies pool funds to hire Rust developers. If nothing else, it is
worth experimenting with these sorts of setups to help gain
experience.</p>
<h3 id="we-should-create-a-project-group-to-figure-this-out">We should create a &ldquo;project group&rdquo; to figure this out</h3>
<p>Creating a foundation is a complex task. In this blog post, I&rsquo;ve just
tried to sketch the &ldquo;high-level view&rdquo; of what responsibilities I think
a foundation might take on and why (and which I think we should avoid
or defer). But I left out a lot of interesting details: for example,
should the Foundation be a 501(c)(3) (a non-profit, in other words) or
not? Should we join an umbrella organization and &ndash; if so &ndash; which
one?</p>
<p>The traditional way that the Rust project makes decisions, of course,
is through RFCs, and I think that a decision to create a foundation
should be no exception. In fact, I do plan to open an RFC about
creating a foundation soon. However, I <strong>don&rsquo;t</strong> expect this RFC to
try to spell out all the details of how a foundation would
work. Rather, I plan to propose creating a <strong>project group</strong> with the
goal of answering those questions.</p>
<p>In short, I think the core team should select some set of folks who
will explore the best design for a foundation. Along the way, we&rsquo;ll
keep the community updated with the latest ideas and take feedback,
and &ndash; in the end &ndash; we&rsquo;ll submit an RFC (or perhaps a series of RFCs)
with a final plan for the core team to approve.</p>
<h3 id="feedback">Feedback</h3>
<p>OK, well, enough about what I think. I&rsquo;m very curious (and a bit
scared, I won&rsquo;t lie) to hear what people think about the contents of
this post. To collect feedback, I&rsquo;ve created a <a href="https://internals.rust-lang.org/t/blog-post-towards-a-rust-foundation/11601">thread on
internals</a>. As ever, I&rsquo;ll read all the responses, and I&rsquo;ll do my best
to respond where I can. Thanks!</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/foundation" term="foundation" label="Foundation"/></entry><entry><title type="html">Async Interview #3: Carl Lerche</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/12/23/async-interview-3-carl-lerche/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/12/23/async-interview-3-carl-lerche/</id><published>2019-12-23T00:00:00+00:00</published><updated>2019-12-23T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello! For the latest <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/09/async-interview-2-cramertj/">async interview</a>, I spoke with Carl Lerche
(<a href="https://github.com/carllerche/">carllerche</a>). Among many other crates<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, Carl is perhaps best
known as one of the key authors behind <a href="https://github.com/tokio-rs/tokio">tokio</a> and <a href="https://github.com/tokio-rs/mio">mio</a>. These two
crates are quite widely used throughout the async ecosystem. Carl and I
spoke on December 3rd.</p>
<h3 id="video">Video</h3>
<p>You can watch the <a href="https://youtu.be/xpk0y8tfszE">video</a> on YouTube. I&rsquo;ve also embedded a copy here
for your convenience:</p>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/xpk0y8tfszE" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>
<h2 id="background-the-mio-crate">Background: the mio crate</h2>
<p>One of the first things we talked about was a kind of overview of the
layers of the &ldquo;tokio-based async stack&rdquo;.</p>
<p>We started with the <a href="https://github.com/tokio-rs/mio">mio</a> crate. <a href="https://github.com/tokio-rs/mio">mio</a> is meant to be the &ldquo;lightest
possible&rdquo; non-blocking I/O layer for Rust. It basically exposes the
&ldquo;epoll&rdquo; interface that is widely used on Linux. Windows uses a
fundamentally different model, so in that case there is a kind of
compatibility layer, and hence the performance isn&rsquo;t quite as good,
but it&rsquo;s still pretty decent. mio &ldquo;does the best it can&rdquo;, as Carl put
it.</p>
<p>The <a href="https://github.com/tokio-rs/tokio">tokio</a> crate builds on <a href="https://github.com/tokio-rs/mio">mio</a>. It wraps the epoll interface and
exposes it via the <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code>Future</code></a> abstraction from <code>std</code>. It also offers
other things that people commonly need, such as timers.</p>
<p>Finally, building atop tokio you find <a href="https://crates.io/crates/tower">tower</a>, which exposes a
&ldquo;request-response&rdquo; abstraction called <a href="https://docs.rs/tower/0.3.0/tower/trait.Service.html"><code>Service</code></a>. <a href="https://crates.io/crates/tower">tower</a> is similar
to things like <a href="https://twitter.github.io/finagle/">finagle</a> or <a href="https://rack.github.io/">rack</a>. This is then used by libraries
like <a href="https://crates.io/crates/hyper">hyper</a> and <a href="https://crates.io/crates/tonic">tonic</a>, which implement protocol servers (http for
<a href="https://crates.io/crates/hyper">hyper</a>, gRPC for <a href="https://crates.io/crates/tonic">tonic</a>). These protocol servers internally use the
<a href="https://crates.io/crates/tower">tower</a> abstractions as well, so you can tell hyper to execute any
<a href="https://docs.rs/tower/0.3.0/tower/trait.Service.html"><code>Service</code></a>.</p>
<p>One challenge is that it is not yet clear how to adapt tower&rsquo;s
<a href="https://docs.rs/tower/0.3.0/tower/trait.Service.html"><code>Service</code></a> trait to <code>std::Future</code>. It would really benefit from
support of async functions in traits, in particular, which <a href="http://smallcultfollowing.com/babysteps/blog/2019/10/26/async-fn-in-traits-are-hard/">is
difficult for a lot of reasons</a>. The current plan is to adopt
<a href="https://doc.rust-lang.org/std/pin/struct.Pin.html"><code>Pin</code></a> and to require boxing and <code>dyn Future</code> values if you wish to
use the <code>async fn</code> sugar. (Which seems like a good starting place,
-ed.)</p>
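<p>As a rough illustration of what &ldquo;boxing and <code>dyn Future</code>&rdquo; means in practice, the sketch below returns a pinned, boxed future from a trait method so that the body can be written as an <code>async</code> block. The names <code>BoxedService</code>, <code>Doubler</code>, <code>double_via_service</code>, and <code>block_on</code> are assumptions made for this example; this is not the signature tower adopted.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Boxed, type-erased future: the cost of writing `async` bodies in a trait here.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + 'a>>;

/// Hypothetical trait whose method returns a boxed `dyn Future`.
trait BoxedService<Request> {
    type Response;
    fn call(&mut self, request: Request) -> BoxFuture<'static, Self::Response>;
}

struct Doubler;

impl BoxedService<u32> for Doubler {
    type Response = u32;

    fn call(&mut self, request: u32) -> BoxFuture<'static, u32> {
        // The async block has an unnameable type, so it is boxed and erased.
        Box::pin(async move { request * 2 })
    }
}

/// Waker that does nothing; enough for a busy-polling toy executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for illustration only: polls the future in a loop.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Drive one request through the boxed service.
fn double_via_service(n: u32) -> u32 {
    let mut service = Doubler;
    block_on(service.call(n))
}
```

<p>The trade-off is one heap allocation and one dynamic dispatch per call, in exchange for not having to name the future type.</p>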
<p>Returning to the overall async stack, atop protocol servers like
hyper, you find web frameworks, such as <a href="https://crates.io/crates/warp">warp</a> &ndash; and (finally) within
those you have middleware and the actual applications.</p>
<h2 id="how-independent-are-these-various-layers">How independent are these various layers?</h2>
<p>I was curious to understand how &ldquo;interconnected&rdquo; these various crates
were. After all, while tokio is widely used, there are a number of
different executors out there, both targeting different platforms
(e.g., <a href="https://fuchsia.googlesource.com/">Fuchsia</a>) as well as different trade-offs (e.g., <a href="https://async.rs/">async-std</a>).
I&rsquo;m really interested to get a better understanding of what we can do
to help the various layers described above operate independently, so
that people can mix-and-match.</p>
<p>To that end, I asked Carl what it would take to use (say) Warp on
Fuchsia. The answer was that &ldquo;in principle&rdquo; the point of Tower is to
create just such a decoupling, but in practice it might not be so
easy.</p>
<p>One of the big changes in the upcoming tokio 0.2 crate, in fact, has
been to combine and merge a lot of tokio into one crate. Previously,
the components were more decoupled, but people rarely took advantage
of that. Therefore, tokio 0.2 combined a lot of components and made
the experience of using them together more streamlined, although it is
still possible to use components in a more &ldquo;standalone&rdquo; fashion.</p>
<p>In general, to make tokio work, you need some form of &ldquo;driver thread&rdquo;.
Typically this is done by spawning a background thread, but you can skip
that and run the driver yourself.</p>
<p>The original tokio design had a static global that contained this
driver information, but this had a number of issues in practice: the
driver sometimes started unexpectedly, it could be hard to configure,
and it didn&rsquo;t work great for embedded environments. Therefore, the new
system has switched to an explicit launch, though there are
procedural macros <code>#[tokio::main]</code> or <code>#[tokio::test]</code> that provide
sugar if you prefer.</p>
<h2 id="what-should-we-do-next-stabilize-stream">What should we do next? Stabilize stream.</h2>
<p>Next we discussed which concrete actions made sense next. Carl felt
that an obvious next step would be to stabilize the <code>Stream</code> trait.
As you may recall, cramertj and I <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/">discussed the <code>Stream</code> trait</a>
in quite a lot of detail &ndash; in short, the existing design for <code>Stream</code>
is &ldquo;detached&rdquo;, meaning that it must yield up ownership of each item it
produces, much like an <code>Iterator</code>. It would be nice to figure out the
story for &ldquo;attached&rdquo; streams that can re-use internal buffers, which
are a very common use case, especially before we create syntactic
sugar.</p>
<p>Carl&rsquo;s motivation for a stable <a href="https://docs.rs/futures/0.3.1/futures/stream/trait.Stream.html"><code>Stream</code></a> is in part that he would like
to issue a stable tokio release, ideally in Q3 of 2020, and <a href="https://docs.rs/futures/0.3.1/futures/stream/trait.Stream.html"><code>Stream</code></a>
would be a part of that. If there is no <a href="https://docs.rs/futures/0.3.1/futures/stream/trait.Stream.html"><code>Stream</code></a> trait in the standard
library, that complicates things.</p>
<p>One thing we <em>didn&rsquo;t</em> discuss, but which I personally would like to
understand better, is what sort of libraries and infrastructure might
benefit from a stabilized <a href="https://docs.rs/futures/0.3.1/futures/stream/trait.Stream.html"><code>Stream</code></a>. For example, &ldquo;data libraries&rdquo; like
hyper mostly want a trait like <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> to be stabilized.</p>
<h2 id="about-async-read">About async read</h2>
<p>Next we discussed the <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> trait a little, though not in great
depth. If you&rsquo;ve been following the latest discussion, you&rsquo;ll have seen
that there is a <a href="https://github.com/tokio-rs/tokio/pull/1744">tokio proposal</a>
to modify the <code>AsyncRead</code> traits used within tokio. There are two main goals here:</p>
<ul>
<li>to make it safe to pass an uninitialized memory buffer to <code>read</code></li>
<li>to better support vectorizing writes</li>
</ul>
<p>However, there isn&rsquo;t a clear consensus on the thread (at least not the
last time I checked) on the best alternative design. The PR itself
proposes changing from a <code>&amp;mut [u8]</code> buffer (for writing the output
into) to a <code>dyn</code> trait value, but there are other options. Carl for
example <a href="https://github.com/tokio-rs/tokio/pull/1744#issuecomment-553575438">proposed</a> using a concrete wrapper struct instead, and adding
methods to test for vectorization support (since outer layers may wish
to adopt different strategies based on whether vectorization works).</p>
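<p>To give a feel for the &ldquo;concrete wrapper struct&rdquo; option, here is a small sketch of a buffer type that tracks how much of a possibly-uninitialized byte slice has been filled. The type <code>UninitBuf</code> and its methods are hypothetical names chosen for this example; they are not taken from the PR or from any tokio release.</p>

```rust
use std::mem::MaybeUninit;

/// Hypothetical wrapper around a possibly-uninitialized byte buffer.
struct UninitBuf<'a> {
    storage: &'a mut [MaybeUninit<u8>],
    filled: usize,
}

impl<'a> UninitBuf<'a> {
    fn new(storage: &'a mut [MaybeUninit<u8>]) -> Self {
        Self { storage, filled: 0 }
    }

    /// Number of bytes that can still be written.
    fn remaining(&self) -> usize {
        self.storage.len() - self.filled
    }

    /// Copy as much of `src` as fits; returns the number of bytes written.
    /// The reader never observes uninitialized memory, it only writes to it.
    fn put_slice(&mut self, src: &[u8]) -> usize {
        let n = src.len().min(self.remaining());
        let dest = &mut self.storage[self.filled..self.filled + n];
        for (slot, byte) in dest.iter_mut().zip(src) {
            slot.write(*byte);
        }
        self.filled += n;
        n
    }

    /// The initialized prefix of the buffer.
    fn filled(&self) -> &[u8] {
        let init = &self.storage[..self.filled];
        // SAFETY: `put_slice` initialized every byte in `..self.filled`, and
        // `MaybeUninit<u8>` has the same layout as `u8`.
        unsafe { std::slice::from_raw_parts(init.as_ptr().cast::<u8>(), init.len()) }
    }
}

/// Fill a 4-byte uninitialized buffer from a 5-byte source.
fn read_greeting() -> Vec<u8> {
    let mut storage = [MaybeUninit::<u8>::uninit(); 4];
    let mut buf = UninitBuf::new(&mut storage);
    buf.put_slice(b"hello");
    buf.filled().to_vec()
}
```

<p>Because the wrapper only hands out the filled prefix, a caller can pass in uninitialized storage without paying to zero it first.</p>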
<p>One of the arguments in favor of the current design from the futures
crate is that it maps very cleanly to the <code>Read</code> trait from the stdlib
(cramertj advanced this argument, for example). Carl felt that
the trait is already quite different (e.g., notably, it uses <code>Pin</code>)
and that these more &ldquo;analogous&rdquo; interfaces could be made with
defaulted helper methods instead. Further, he felt that async
applications tend to prize performance more highly than synchronous
ones, so the importance and overhead of uninitialized memory may be
higher.</p>
<h2 id="about-async-destructors-and-other-utilities">About async destructors and other utilities</h2>
<p>We discussed async destructors. Carl felt that they would be a
valuable thing to add for sure. He felt that the <a href="https://boats.gitlab.io/blog/post/poll-drop/">&ldquo;general design&rdquo;
proposed by boats</a> would
be reasonable, although he thought there might be a bit of a
duplication issue if you have both an async drop and a sync drop. A
possible solution would be to have a <code>prepare_to_drop</code> async method
that gives the object time to do async preparations, and then to
always run the sync drop afterwards.</p>
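<p>The ordering Carl described can be approximated by hand today. In the sketch below, the async <code>prepare_to_drop</code> method is called explicitly before the value is dropped; nothing in the language invokes it automatically. The <code>Connection</code> type, the shared log, <code>shutdown_order</code>, and the <code>block_on</code> helper are all assumptions made for illustration.</p>

```rust
use std::cell::RefCell;
use std::future::Future;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical resource that wants async cleanup before its sync destructor.
struct Connection {
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Connection {
    /// Async preparation step (for example, flushing buffered data).
    async fn prepare_to_drop(&mut self) {
        self.log.borrow_mut().push("async prepare");
    }
}

impl Drop for Connection {
    /// The ordinary sync destructor always runs afterwards.
    fn drop(&mut self) {
        self.log.borrow_mut().push("sync drop");
    }
}

/// Waker that does nothing; enough for a busy-polling toy executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for illustration only: polls the future in a loop.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Run the async preparation, then the sync drop, and report the order.
fn shutdown_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let mut conn = Connection { log: Rc::clone(&log) };
    block_on(conn.prepare_to_drop());
    drop(conn);
    let order = log.borrow().clone();
    order
}
```

<p>With this split there is only one destructor body to maintain; the async step is purely additive.</p>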
<p>We also discussed a few utility methods like <code>select!</code>, and Carl
mentioned that a lot of the ecosystem is currently using things like
<a href="https://crates.io/crates/proc-macro-hack">proc-macro-hack</a> to support these, so perhaps a good thing to focus
on would be improving procedural macro support so that it can handle
expression level macros more cleanly.</p>
<h2 id="comments">Comments?</h2>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167/">thread on the Rust users forum</a> for this series.</p>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I think <a href="https://crates.io/crates/loom">loom</a> looks particularly cool.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Async Interview #2: cramertj, part 3</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/12/11/async-interview-2-cramertj-part-3/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/12/11/async-interview-2-cramertj-part-3/</id><published>2019-12-11T00:00:00+00:00</published><updated>2019-12-11T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This blog post is continuing <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/09/async-interview-2-cramertj/">my conversation with
cramertj</a>. This
will be the last post.</p>
<p>In the <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/09/async-interview-2-cramertj/">first post</a>, I covered what we said about Fuchsia,
interoperability, and the organization of the futures crate.</p>
<p>In the <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/">second post</a>, I covered cramertj&rsquo;s take on the <a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a>,
<a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a>, and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a> traits. We also discussed the idea of
<a href="http://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/#terminology-note-detachedattached-instead-of-streaming">attached</a> streams and the importance of GATs for modeling those.</p>
<p>In this post, we&rsquo;ll talk about async closures.</p>
<p>You can watch the <a href="https://youtu.be/NF_qyiypnOs">video</a> on YouTube.</p>
<h3 id="async-closures">Async closures</h3>
<p>Next we discussed async closures. You may have noticed that while you
can write an <code>async fn</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>you cannot write the analogous syntax with closures:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">foo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span></code></pre></div><p>Such a thing would often be useful, especially when writing the
combinators on futures and streams that one might expect (like <code>map</code>
and so forth). Unfortunately, async closures turn out to be somewhat
more complex than their synchronous counterparts &ndash; to get the
behavior we probably want, it turns out that they too would require
some support for generic associated types (GAT), because they sort of
want to be &ldquo;<a href="http://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/#terminology-note-detachedattached-instead-of-streaming">attached</a> closures&rdquo;.</p>
<h3 id="an-example-using-iterator">An example using iterator</h3>
<p>To see the problem, let&rsquo;s start with a synchronous example using
<code>Iterator</code>. Here is some code that uses <code>for_each</code> to process each
datum in the iterator and &ndash; along the way &ndash; it increments a counter
found on the stack:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process_count</span><span class="p">(</span><span class="n">iterator</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Datum</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">counter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">iterator</span><span class="p">.</span><span class="n">for_each</span><span class="p">(</span><span class="o">|</span><span class="n">datum</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">process_datum</span><span class="p">(</span><span class="n">datum</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">counter</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So what is actually happening when we compile this? The closure expression
actually compiles to a struct that implements the <code>FnMut</code> trait. This struct
will hold a reference to the <code>counter</code> variable. So in practice the desugared
form might look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process_count</span><span class="p">(</span><span class="n">iterator</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Datum</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">counter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">iterator</span><span class="p">.</span><span class="n">for_each</span><span class="p">(</span><span class="n">ClosureStruct</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">counter</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">counter</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">counter</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The line <code>counter += 1</code> is then compiled to the equivalent of <code>*self.counter += 1</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="o">&lt;</span><span class="n">Datum</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ClosureStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">datum</span>: <span class="nc">Datum</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">process_datum</span><span class="p">(</span><span class="n">datum</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="converting-the-example-to-use-stream">Converting the example to use stream</h3>
<p>So what would happen if we were using an async closure? The
<code>ClosureStruct</code> would still be constructed, presumably, in the same
way. But the closure trait no longer directly performs the
action. Instead, when you call the closure, you get back a <em>future</em>
that performs the action; that <em>future</em> is going to need to have a
reference to <code>counter</code> too, and that comes from <code>self</code>. So that means
that the type of this future is going to have to hold a reference to
<code>self</code>, which means that the impl would have to look something like
this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">AsyncFnMut</span><span class="o">&lt;</span><span class="n">Datum</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ClosureStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Future</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ClosureFuture</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">call</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;s</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">datum</span>: <span class="nc">Datum</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">ClosureFuture</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">ClosureFuture</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="p">,</span><span class="w"> </span><span class="n">datum</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As you can see, modeling this properly requires GATs. In fact, async
closures are basically <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/#terminology-note-detachedattached-instead-of-streaming">&ldquo;attached&rdquo;</a> closures which return a value that
borrows from <code>self</code>. (And, just as attached iterators might sometimes
be useful, I&rsquo;ve found that sometimes I have need of an attached
closure in synchronous code as well.)</p>
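<p>The borrowing relationship can be demonstrated on stable Rust if we give up on a general trait and write the &ldquo;closure&rdquo; by hand. The sketch below is an assumption-laden stand-in: <code>call</code> is an inherent method, not an <code>AsyncFnMut</code> impl, the future is boxed so that its type can be written down, <code>Datum</code> is replaced by <code>u32</code>, and <code>process_datum</code> and <code>block_on</code> are placeholder helpers.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written stand-in for the compiler-generated closure struct.
struct ClosureStruct<'c> {
    counter: &'c mut u32,
}

impl<'c> ClosureStruct<'c> {
    /// The returned future borrows from `self` for `'s`: it is "attached".
    fn call<'s>(&'s mut self, datum: u32) -> Pin<Box<dyn Future<Output = ()> + 's>> {
        let counter: &'s mut u32 = &mut *self.counter;
        Box::pin(async move {
            *counter += 1;
            process_datum(datum).await;
        })
    }
}

/// Placeholder for whatever per-item async work the closure body does.
async fn process_datum(_datum: u32) {}

/// Waker that does nothing; enough for a busy-polling toy executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for illustration only: polls the future in a loop.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Each future must finish before `call` can be invoked again,
/// because it holds the `&'s mut self` borrow.
fn process_count(data: &[u32]) -> u32 {
    let mut counter = 0;
    let mut closure = ClosureStruct { counter: &mut counter };
    for &datum in data {
        block_on(closure.call(datum));
    }
    counter
}
```

<p>The lifetime <code>'s</code> in the return type is exactly the piece that a GAT-based <code>Future&lt;'s&gt;</code> associated type would let a trait express.</p>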
<h3 id="what-you-can-write-today">What you can write today</h3>
<p>The only thing you can write today is a closure that returns an async
block:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">foo</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>But this has rather different semantics. In this case, for example, we
would be copying the current value of <code>counter</code> into the future, and
not holding a reference to the <code>counter</code> (and if you try to hold a
reference, you&rsquo;ll get an error).</p>
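<p>The difference in semantics is easy to observe. In this sketch (the function name <code>snapshot_semantics</code> and the <code>block_on</code> helper are made up for the example), every future produced by the closure works on its own copy of <code>counter</code>, so the original variable never changes.</p>

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for a busy-polling toy executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for illustration only: polls the future in a loop.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Returns what two successive calls observed, plus the final `counter`.
fn snapshot_semantics() -> (i32, i32, i32) {
    let counter = 0;
    // A plain closure returning an `async move` block: each call copies
    // the current value of `counter` into a fresh future.
    let bump = |_datum: u32| async move {
        let mut seen = counter; // private copy owned by this future
        seen += 1;
        seen
    };
    let first = block_on(bump(10));
    let second = block_on(bump(20));
    (first, second, counter)
}
```

<p>Both calls report <code>1</code> and <code>counter</code> is still <code>0</code> afterwards, which is not what the <code>for_each</code> example wants.</p>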
<h3 id="conclusion">Conclusion</h3>
<p>This wraps up my 3-part summary of my conversation with cramertj.
Looking back, I think the main take-aways are:</p>
<ul>
<li>We could stabilize <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a> and resolve the
questions of uninitialized memory (and presumably vectorized writes,
which we didn&rsquo;t discuss explicitly) in a way analogous to the
sync version of the traits.</li>
<li><a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a> and async closures would benefit from being &ldquo;attached&rdquo;,
which requires us to make progress on GATs.
<ul>
<li>In particular, we would not want to add generator syntax until
we have a convincing and complete story.</li>
</ul>
</li>
<li>Similarly, until the async closures story is more complete, we
probably want to hold off on adding too many utility functions in
the stdlib. Auxiliary libraries like <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> allow us to
introduce such functions and later make changes.</li>
<li>The <code>select!</code> macro is cool and everybody should read the
<a href="https://rust-lang.github.io/async-book/06_multiple_futures/03_select.html">async book chapter</a> to learn why. =)</li>
</ul>
<h2 id="comments">Comments?</h2>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167/">thread on the Rust users forum</a> for this series.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Async Interview #2: cramertj, part 2</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/12/10/async-interview-2-cramertj-part-2/</id><published>2019-12-10T00:00:00+00:00</published><updated>2019-12-10T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This blog post is continuing <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/09/async-interview-2-cramertj/">my conversation with cramertj</a>.</p>
<p>In the first post, I covered what we said about Fuchsia,
interoperability, and the organization of the futures crate.  This
post covers cramertj&rsquo;s take on the <a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a> trait as well as the
<a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a> traits.</p>
<p>You can watch the <a href="https://youtu.be/NF_qyiypnOs">video</a> on YouTube.</p>
<h3 id="the-need-for-streaming-streams-and-iterators">The need for &ldquo;streaming&rdquo; streams and iterators</h3>
<p>Next, cramertj and I turned to discussing some of the specific traits
from the futures crate. One of the traits that we covered was
<a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a>. The <a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a> trait is basically the asynchronous version
of the <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html"><code>Iterator</code></a> trait. In (slightly) simplified form, it is as
follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Stream</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">poll_next</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Poll</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The main concern that cramertj raised with this trait is that, like
<code>Iterator</code>, it always gives ownership of each item back to its
caller. This falls out from its structure, which requires the
implementor to specify an <code>Item</code> type, and that <code>Item</code> type cannot
borrow from the <code>self</code> reference given to <code>poll_next</code>.</p>
<p>In practice, many stream/iterator implementations would be more
efficient if they could have some internal storage that they re-use
over and over. For example, they might have an internal buffer, and
when <code>poll_next</code> is called, they would give back (upon completion) a
<strong>reference</strong> to that buffer. The idea would be that once <code>poll_next</code>
is called again, they would start to re-use the same buffer.</p>
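<p>Here is a synchronous sketch of that buffer re-use pattern. It deliberately does not implement <code>Iterator</code> (it cannot, since the item borrows from the iterator itself); instead it exposes an inherent <code>next</code> method. The type <code>UppercaseChunks</code> and the helper <code>collect_chunks</code> are hypothetical names for this example.</p>

```rust
/// Yields upper-cased chunks of `input`, re-using one internal buffer.
struct UppercaseChunks<'d> {
    input: &'d [u8],
    chunk_len: usize,
    buffer: Vec<u8>,
}

impl<'d> UppercaseChunks<'d> {
    fn new(input: &'d [u8], chunk_len: usize) -> Self {
        assert!(chunk_len > 0, "chunk_len must be positive");
        Self {
            input,
            chunk_len,
            buffer: Vec::with_capacity(chunk_len),
        }
    }

    /// The returned slice borrows from `self`, so it must be released
    /// before `next` can be called again.
    fn next(&mut self) -> Option<&[u8]> {
        let input = self.input;
        if input.is_empty() {
            return None;
        }
        let n = self.chunk_len.min(input.len());
        let (head, rest) = input.split_at(n);
        self.input = rest;
        // Overwrite the previous chunk instead of allocating a new one.
        self.buffer.clear();
        self.buffer.extend(head.iter().map(|b| b.to_ascii_uppercase()));
        Some(self.buffer.as_slice())
    }
}

/// Callers that want to keep the data must copy it out of the buffer.
fn collect_chunks(input: &[u8], chunk_len: usize) -> Vec<String> {
    let mut chunks = UppercaseChunks::new(input, chunk_len);
    let mut out = Vec::new();
    while let Some(chunk) = chunks.next() {
        out.push(String::from_utf8_lossy(chunk).into_owned());
    }
    out
}
```

<p>Only callers that actually need to retain an item pay for a copy; everyone else reads straight out of the shared buffer.</p>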
<h3 id="terminology-note-detachedattached-instead-of-streaming">Terminology note: Detached/attached instead of &ldquo;streaming&rdquo;</h3>
<p>The idea of having an iterator that re-uses an internal buffer has
come up before. In that context, it was often called a &ldquo;streaming
iterator&rdquo;, which I guess means that we want a &ldquo;streaming stream&rdquo;.
This is pretty clearly a suboptimal term.</p>
<p>In the call, I mentioned the term &ldquo;detached&rdquo;, which I sometimes use to
refer to the current <code>Iterator</code>/<code>Stream</code>.  The idea is that <code>Item</code>
that gets returned by <code>Stream</code> is &ldquo;detached&rdquo; from <code>self</code>, which means
that it can be stored and moved about independently from <code>self</code>. In
contrast, in a &ldquo;streaming stream&rdquo; design, the return value may be
borrowed from <code>self</code>, and hence is &ldquo;attached&rdquo; &ndash; it can only be used
so long as the <code>self</code> reference remains live.</p>
<p>I&rsquo;m not really sure that I care for this terminology. I sort of prefer
&ldquo;owned/borrowing iterator&rdquo;, where the idea is in an owned iterator,
the iterator transfers ownership of the data to you, and in borrowing
iterator, the data you get back is borrowed from the iterator
itself. However, I fear that these terms will be confused for the
distinction between <code>vec.into_iter()</code> and <code>vec.iter()</code>. Both of these
methods exist today, of course, and they both yield &ldquo;detached&rdquo;
iterators; however, the former takes ownership of <code>vec</code> and the latter
borrows from it. The key point is that <code>vec.iter()</code> is giving back
borrowed values, but they are borrowed <em>from the vector</em>, not from the
<em>iterator</em>.</p>
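<p>A short sketch makes the distinction visible: with <code>vec.iter()</code> the items outlive the iterator because they borrow from the vector, and with <code>vec.into_iter()</code> they outlive it because they are owned. The function names below are invented for the example.</p>

```rust
/// Items from `iter()` borrow from the slice, not from the iterator,
/// so they remain usable after the iterator is gone.
fn first_two(values: &[String]) -> Option<(&String, &String)> {
    let mut iter = values.iter();
    let a = iter.next()?;
    let b = iter.next()?;
    drop(iter); // the iterator is dropped, yet `a` and `b` stay valid
    Some((a, b))
}

/// Items from `into_iter()` are owned, so they are also detached.
fn into_owned_first(values: Vec<String>) -> Option<String> {
    let mut iter = values.into_iter();
    let first = iter.next()?;
    drop(iter); // dropping the iterator frees the remaining items only
    Some(first)
}
```

<p>In an &ldquo;attached&rdquo; design, the equivalent of <code>first_two</code> would be rejected, since <code>a</code> would have to be released before asking for <code>b</code>.</p>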
<p>(One final note is that this same concept of &lsquo;attached&rsquo; vs &lsquo;detached&rsquo;
will come up when discussing async closures again, which further
argues for using terminology other than &ldquo;streaming&rdquo;.)</p>
<h3 id="the-natural-way-to-write-attached-streams-is-with-gats">The natural way to write &ldquo;attached&rdquo; streams is with GATs</h3>
<p>In any case, the challenge here is that, without generic associated
types, there is no nice way to write the &ldquo;attached&rdquo; (or &ldquo;streaming&rdquo;)
version of <code>Stream</code>. You really want to be able to write a definition
like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AttachedStream</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="bp">Self</span>: <span class="na">&#39;s</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       ^^^^ ^^^^^^^^^^^^^^ (we likely need an annotation like this
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       |                    too, for reasons I&#39;ll cover in an appendix)
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//       note the `&#39;s` here!
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">poll_next</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="na">&#39;s</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Poll</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;&gt;&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                         ^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `&#39;s` is the lifetime of the `self` reference.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Thus, the `Item` that gets returned may
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// borrow from `self`.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="attached-streams-would-be-used-differently-than-the-current-ones">&ldquo;Attached&rdquo; streams would be used differently than the current ones</h3>
<p>There are real implications to adopting an &ldquo;attached&rdquo; definition of
stream or iterator.  In short, particularly in a generic context where
you don&rsquo;t know all the types involved, you wouldn&rsquo;t be able to get
back two values from an &ldquo;attached&rdquo; stream/iterator at the same time,
whereas you can with the &ldquo;detached&rdquo; streams and iterators we have
today.</p>
<p>For the most common use case of iterating over each element in turn,
this doesn&rsquo;t matter, but it&rsquo;s easy to define functions that rely on
it. Let me illustrate with <code>Iterator</code> since it&rsquo;s easier. Today, this
code compiles:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="sd">/// Returns the next two elements in the iterator.
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Panics if the iterator doesn&#39;t have at least two elements.
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">first_two</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">iterator</span>: <span class="nc">I</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="n">I</span>::<span class="n">Item</span><span class="p">,</span><span class="w"> </span><span class="n">I</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">first_item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iterator</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">second_item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iterator</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">(</span><span class="n">first_item</span><span class="p">,</span><span class="w"> </span><span class="n">second_item</span><span class="p">)</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, given an &ldquo;attached&rdquo; iterator design, the first call to <code>next</code>
would &ldquo;borrow&rdquo; <code>iterator</code>, and hence you could not call <code>next()</code> again
so long as <code>first_item</code> is still in use.</p>
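<p>Here is a rough synchronous sketch of what an &ldquo;attached&rdquo; iterator could look like, written with generic associated types as they exist in current stable Rust. The trait name <code>AttachedIterator</code> and the <code>Fields</code> type are hypothetical, invented for this example; nothing like them exists in the standard library:</p>

```rust
/// Hypothetical "attached" iterator: each item may borrow from the iterator.
trait AttachedIterator {
    type Item<'s> where Self: 's;
    fn next<'s>(&'s mut self) -> Option<Self::Item<'s>>;
}

/// Splits text on commas, copying each trimmed field into one reusable buffer.
struct Fields<'t> {
    rest: Option<&'t str>,
    buffer: String,
}

impl<'t> Fields<'t> {
    fn new(text: &'t str) -> Self {
        Fields { rest: Some(text), buffer: String::new() }
    }
}

impl<'t> AttachedIterator for Fields<'t> {
    type Item<'s> = &'s str where Self: 's;

    fn next<'s>(&'s mut self) -> Option<Self::Item<'s>> {
        let rest = self.rest?;
        let (field, tail) = match rest.split_once(',') {
            Some((field, tail)) => (field, Some(tail)),
            None => (rest, None),
        };
        self.rest = tail;
        self.buffer.clear(); // re-use the same allocation on every call
        self.buffer.push_str(field.trim());
        Some(self.buffer.as_str()) // the item borrows from `self`
    }
}

/// Visiting one item at a time works fine with an attached iterator.
fn longest_field_len(text: &str) -> usize {
    let mut fields = Fields::new(text);
    let mut longest = 0;
    while let Some(field) = fields.next() {
        longest = longest.max(field.len());
    }
    longest
    // Holding two items at once is rejected by the borrow checker:
    //     let a = fields.next().unwrap();
    //     let b = fields.next().unwrap(); // `fields` is still borrowed by `a`
    //     println!("{a} {b}");
}
```

<p>The commented-out lines at the end correspond to the <code>first_two</code> function above: with this trait, the second call to <code>next</code> cannot happen while the first item is still in use.</p>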
<h3 id="concerns-with-blocking-the-streaming-trait">Concerns with blocking the streaming trait</h3>
<p>If I may editorialize a bit, in re-watching the video, I had a few thoughts:</p>
<p>First, I don&rsquo;t want to block a stable <code>Stream</code> on generic associated
types. I do think we should prioritize shipping GATs and I would
expect to see progress next year, but I think we need <em>some</em> form of
<code>Stream</code> sooner than that.</p>
<p>Second, the existing <code>Stream</code> is very analogous to
<code>Iterator</code>. Moreover, there has been a long-standing desire for
attached iterators. Therefore, it seems reasonable to move forward
with stabilizing stream today, and then expect to revisit both traits
in a consistent fashion once generic associated types are available.</p>
<h3 id="detached-streams-can-be-converted-into-attached-ones">&ldquo;Detached&rdquo; streams can be converted into &ldquo;attached&rdquo; ones</h3>
<p>Let&rsquo;s assume then that we choose to stabilize <code>Stream</code> as it exists
today. Then we may want to add an <code>AttachedStream</code> later on.  In
principle, it should then be possible to add a &ldquo;conversion&rdquo; trait such
that anything which implements <code>Stream</code> also implements
<code>AttachedStream</code>:</p>
<pre tabindex="0"><code>impl&lt;S&gt; AttachedStream for S
where
    S: Stream,
{
    type Item&lt;&#39;s&gt; = S::Item;
    
    fn poll_next&lt;&#39;s&gt;(
        self: Pin&lt;&amp;&#39;s mut Self&gt;,
        cx: &amp;mut Context&lt;&#39;_&gt;,
    ) -&gt; Poll&lt;Option&lt;Self::Item&lt;&#39;s&gt;&gt;&gt; {
        Stream::poll_next(self, cx)
    }
}
</code></pre><p>The idea here is that the <code>AttachedStream</code> trait gives the
<em>possibility</em> of returning values that borrow from <code>self</code>, but it
doesn&rsquo;t <em>require</em> that the returned values do so.</p>
<p>As far as I know, the scheme above would work. In general,
interconversion traits like these sometimes are tricky around
coherence, but you can typically get away with &ldquo;one&rdquo; such impl. It
would mean that types can implement <code>AttachedStream</code> if they need to
re-use an internal buffer and <code>Stream</code> if they do not, which is a
reasonable design. (I&rsquo;d be curious to know if there are fatal flaws
here.)</p>
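<p>As a sanity check on the idea, here is the synchronous analogue of that blanket impl, which does compile with generic associated types on current stable Rust. The <code>AttachedIterator</code> trait and the <code>sum_via_attached</code> helper are hypothetical names used only for this sketch:</p>

```rust
/// Hypothetical "attached" iterator: each item may borrow from the iterator.
trait AttachedIterator {
    type Item<'s> where Self: 's;
    fn next<'s>(&'s mut self) -> Option<Self::Item<'s>>;
}

// Every ordinary ("detached") iterator is trivially an attached one:
// its items simply never make use of the `'s` lifetime.
impl<I: Iterator> AttachedIterator for I {
    type Item<'s> = I::Item where Self: 's;

    fn next<'s>(&'s mut self) -> Option<Self::Item<'s>> {
        Iterator::next(self)
    }
}

/// Drives a plain `Vec` iterator through the attached interface.
fn sum_via_attached(values: Vec<u32>) -> u32 {
    let mut iter = values.into_iter();
    let mut total = 0;
    // Fully qualified, because `iter.next()` would be ambiguous here:
    // both `Iterator` and `AttachedIterator` provide a `next` method.
    while let Some(value) = AttachedIterator::next(&mut iter) {
        total += value;
    }
    total
}
```

<p>The method-name ambiguity noted in the comment is one small ergonomic cost of this design when both traits are in scope for a concrete type.</p>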
<h3 id="things-that-consume-streams-would-typically-want-an-attached-stream">Things that consume streams would typically want an attached stream</h3>
<p>One downside of adding <code>Stream</code> now and <code>AttachedStream</code> later is that
functions which <em>consume</em> streams would at first all be written to work with <code>Stream</code>,
when in fact they probably would later want to be rewritten to take <code>AttachedStream</code>.
In other words, given some code like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume_stream</span><span class="p">(</span><span class="n">s</span>: <span class="nc">impl</span><span class="w"> </span><span class="n">Stream</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>it is quite likely that the signature should be <code>impl AttachedStream</code>. The idea is that you only want to require a (detached) <code>Stream</code>
if you need to have two items from the stream existing at the same
time. Otherwise, if you&rsquo;re just going to iterate over the stream one
element at a time, the attached stream is the more general variant.</p>
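<p>The following synchronous sketch illustrates why the attached bound is the more general one for consumers. All names here (<code>AttachedIterator</code>, <code>Countdown</code>, <code>count_items</code>) are hypothetical. A consumer written against the attached trait accepts both ordinary iterators, via the blanket impl, and buffer-reusing sources that could never implement <code>Iterator</code>:</p>

```rust
/// Hypothetical "attached" iterator: each item may borrow from the iterator.
trait AttachedIterator {
    type Item<'s> where Self: 's;
    fn next<'s>(&'s mut self) -> Option<Self::Item<'s>>;
}

// Blanket impl: every detached iterator is also an attached one.
impl<I: Iterator> AttachedIterator for I {
    type Item<'s> = I::Item where Self: 's;
    fn next<'s>(&'s mut self) -> Option<Self::Item<'s>> {
        Iterator::next(self)
    }
}

/// A source that formats each number into one reusable buffer, so its items
/// borrow from `self` and it can only be an attached iterator.
struct Countdown {
    remaining: u32,
    buffer: String,
}

impl Countdown {
    fn new(start: u32) -> Self {
        Countdown { remaining: start, buffer: String::new() }
    }
}

impl AttachedIterator for Countdown {
    type Item<'s> = &'s str where Self: 's;
    fn next<'s>(&'s mut self) -> Option<Self::Item<'s>> {
        if self.remaining == 0 {
            return None;
        }
        self.buffer.clear();
        self.buffer.push_str(&self.remaining.to_string());
        self.remaining -= 1;
        Some(self.buffer.as_str())
    }
}

/// A consumer that only looks at one item at a time, so it can ask for the
/// weaker (attached) bound and thereby accept more kinds of sources.
fn count_items<S: AttachedIterator>(mut source: S) -> usize {
    let mut count = 0;
    while let Some(_item) = source.next() {
        count += 1;
    }
    count
}
```

<p>Calling <code>count_items(vec![10, 20, 30].into_iter())</code> and <code>count_items(Countdown::new(4))</code> both type-check, whereas a version bounded by <code>Iterator</code> would reject <code>Countdown</code>.</p>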
<h3 id="syntactic-support-for-streams-and-iterators">Syntactic support for streams and iterators</h3>
<p>cramertj and I didn&rsquo;t talk <em>too</em> much about it directly, but there
have been discussions about adding two forms of syntactic support for
streams/iterators. The first would be to extend the for loop so that
it works over streams as well, as boats covers in their blog post on
<a href="https://boats.gitlab.io/blog/post/for-await-i/">for await loops</a>.</p>
<p>The second would be to add a new form of &ldquo;generator&rdquo;, as found in many
other languages. The idea would be to introduce a new form of
function, written <code>gen fn</code> in synchronous code and <code>async gen fn</code> in
asynchronous code, that can contain <code>yield</code> statements. Calling such a
function would yield an <code>impl Iterator</code> or <code>impl Stream</code>, for sync and
async respectively.</p>
<p>One point that cramertj made is that we should hold off on adding
syntactic support until we have some form of &ldquo;attached&rdquo; stream trait
&ndash; or at least until we have a fairly clear idea what its design will
be. The idea is that we would likely want (e.g.) a for-await sugar to
operate over both detached and attached streams, and similarly we may
want <code>gen fn</code> to generate attached streams, or to have the ability to
do so.</p>
<p>In fact, generators give a nice way to get an intuitive understanding
of the difference between &ldquo;attached&rdquo; and &ldquo;detached&rdquo; streams: given
attached streams, a generator yield could return references to local
variables.  But if we only have detached streams, as today, then you
could only yield things that you own or things that were borrowed from
your caller (i.e., references derived from other references that you
got as parameters). In other words, yield would have the same
limitations as return does today.</p>
<h3 id="the-asyncread-and-asyncwrite-traits">The <code>AsyncRead</code> and <code>AsyncWrite</code> traits</h3>
<p>Next cramertj and I discussed the <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a>
traits.  As currently defined in <a href="https://crates.io/crates/futures-io"><code>futures-io</code></a>, these traits are the
&ldquo;async analog&rdquo; of the corresponding synchronous traits <a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a> and
<a href="https://doc.rust-lang.org/std/io/trait.Write.html"><code>Write</code></a>. For example, somewhat simplified, <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> looks like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">AsyncRead</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">poll_read</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span>: <span class="nc">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">buf</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">u8</span><span class="p">],</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Poll</span><span class="o">&lt;</span><span class="nb">Result</span><span class="o">&lt;</span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="n">Error</span><span class="o">&gt;&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>These have been a topic of recent discussion because the tokio crate
has been <a href="https://github.com/tokio-rs/tokio/pull/1744">considering adopting a new definition of
<code>AsyncRead</code>/<code>AsyncWrite</code></a>. The primary concern has to do
with the <code>buf: &amp;mut [u8]</code> parameter. This parameter supplies a buffer
where the data should be written. Therefore, typically, it doesn&rsquo;t
really matter what the contents of that buffer are when the function is
called, as it will simply be overwritten with the data
generated. <em>However,</em> it is of course <em>possible</em> to write an
<code>AsyncRead</code> implementation that does read from that buffer. This means
that you can&rsquo;t supply a buffer of uninitialized bytes, since reading
from uninitialized memory is undefined behavior and can cause LLVM to
perform mis-optimizations.</p>
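<p>The same issue can be shown with the synchronous <code>Read</code> trait, which keeps the sketch free of <code>Pin</code> and <code>Context</code>. The <code>PeekingReader</code> type and the <code>read_with_zeroed_buffer</code> helper below are made up for illustration. The point is that nothing in the signature stops an implementation from inspecting the buffer it was handed:</p>

```rust
use std::io::{self, Read};

/// A legal but unusual reader: it inspects the caller's buffer before
/// overwriting it.
struct PeekingReader {
    /// Sum of the bytes found in the buffer on the most recent call.
    seen: u64,
}

impl Read for PeekingReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Reading `buf` is permitted by the signature. It would be undefined
        // behavior if the caller had passed uninitialized memory.
        self.seen = buf.iter().map(|&byte| u64::from(byte)).sum();
        buf.fill(0xAB);
        Ok(buf.len())
    }
}

/// Because of readers like the one above, a caller has to pay for
/// initializing the buffer before handing it over.
fn read_with_zeroed_buffer(reader: &mut impl Read, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    let n = reader.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}
```

<p>The zero-fill in <code>read_with_zeroed_buffer</code> is exactly the cost that the proposals discussed here are trying to avoid.</p>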
<p>cramertj and I didn&rsquo;t go too far into discussing the alternatives here
so I won&rsquo;t either (this blog post is already long enough). I hope to
dig into it in future interviews. The main point that cramertj made is
that the same issue affects the standard <code>Read</code> trait and that it
would make sense to address the design in the same way in both traits.
(Indeed, there have been attempts to modify the trait to deal with this
(e.g., the <a href="https://doc.rust-lang.org/std/io/trait.Read.html#method.initializer"><code>initializer</code></a> method, which also has an
<a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html#method.initializer">analogue in the <code>AsyncRead</code> trait</a>).)</p>
<p>cramertj&rsquo;s preferred solution to the problem would be to have some
&ldquo;freeze&rdquo; function that can take uninitialized memory and &ldquo;bless&rdquo; it
such that it can be accessed without UB, though it would contain
&ldquo;random&rdquo; bytes (this is basically what people intuitively expected
from uninitialized memory, though in fact it is <a href="https://www.ralfj.de/blog/2019/07/14/uninit.html">not an accurate
model</a>). Unfortunately, figuring out how to implement such a
thing in LLVM is a pretty open question, and there are also other
problems (such as Linux&rsquo;s <code>MADV_FREE</code> feature) that may make this
infeasible.</p>
<p><strong>EDIT:</strong> An earlier draft of this post mistakenly said that we would
want some &ldquo;poison&rdquo; function, but really the proper term is &ldquo;freeze&rdquo;.
In other words, some function that &ndash; given a bit of uninitialized
data &ndash; makes it initialized but with some arbitrary value.</p>
<h3 id="conclusion">Conclusion</h3>
<p>This was part two of my conversation with cramertj. Stay tuned for
part 3, where we talk about async closures!</p>
<h2 id="comments">Comments?</h2>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167/">thread on the Rust users forum</a> for this series.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Async Interview #2: cramertj</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/12/09/async-interview-2-cramertj/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/12/09/async-interview-2-cramertj/</id><published>2019-12-09T00:00:00+00:00</published><updated>2019-12-09T00:00:00+00:00</updated><content type="html"><![CDATA[<p>For the second <a href="http://smallcultfollowing.com/babysteps/blog/2019/12/09/async-interview-2-cramertj/">async interview</a>, I spoke with Taylor Cramer &ndash; or
cramertj, as I&rsquo;ll refer to him. cramertj is a member of the compiler
and lang teams and was &ndash; until recently &ndash; working on Fuchsia at
Google. He&rsquo;s been a key player in Rust&rsquo;s Async I/O design and in the
discussions around it. He was also responsible for a lot of the
implementation work to make <code>async fn</code> a reality.</p>
<h3 id="video">Video</h3>
<p>You can watch the <a href="https://youtu.be/NF_qyiypnOs">video</a> on YouTube. I&rsquo;ve also embedded a copy here
for your convenience:</p>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/NF_qyiypnOs" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>
<h3 id="spreading-this-out-over-a-few-posts">Spreading this out over a few posts</h3>
<p>So, cramertj and I had a long conversation, with a lot of technical
detail. I was trying to get this blog post finished by last Friday but
it took a lot of time! I decided it&rsquo;s probably too much material to
post in one go, so I&rsquo;m going to break up the blog post into a few
pieces (I&rsquo;ll post the whole video though).</p>
<p>The blog post is mostly covering what cramertj had to say, though in
some cases I&rsquo;m also adding in various bits of background information
or my own editorialization. I&rsquo;m trying to mark it when I do that. =)</p>
<h3 id="on-fuchsia">On Fuchsia</h3>
<p>We kicked off the discussion talking a bit about the particulars of
the Fuchsia project. Fuchsia is a microkernel architecture and thus a
lot of the services one finds in a typical kernel are implemented as
independent Fuchsia processes. These processes are implemented in Rust
and use Async I/O.</p>
<h3 id="fuchsia-uses-its-own-unique-executor-and-runtime">Fuchsia uses its own unique executor and runtime</h3>
<p>Because Fuchsia is not a unix system, its kernel primitives, like
sockets and events, work quite differently. Fuchsia therefore uses its
own custom executor and runtime, rather than building on a separate
stack like tokio or async-std.</p>
<h3 id="fuchsia-benefits-from-interoperability">Fuchsia benefits from interoperability</h3>
<p>Even though Fuchsia uses its own executor, it is able to reuse a lot
of libraries from the ecosystem. For example, Fuchsia uses Hyper for
its HTTP parsing. This is possible because Hyper offers a generic
interface based on traits that Fuchsia can implement.</p>
<p>In general, cramertj feels that the best way to achieve interop is to
offer trait-based interfaces. There are other projects, for example,
that offer feature flags (e.g., to enable &ldquo;tokio&rdquo; compatibility etc),
but this tends to be a suboptimal way of managing things, at least for
libraries.</p>
<p>For one thing, offering features means that support for systems like Fuchsia
must be &ldquo;upstreamed&rdquo; into the project, whereas offering traits means that
downstream systems can implement the traits themselves.</p>
<p>In addition, using features to choose between alternatives can cause
problems across larger dependency graphs. Features are always meant to
be &ldquo;additive&rdquo; &ndash; i.e., you can add any number of them &ndash; but features
that choose between backends tend to be exclusive &ndash; i.e., you must
choose at most one. This is a problem because cargo likes to take the
union of all features across a dependency graph, and so having
exclusive features can lead to miscompilations when things are
combined.</p>
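<p>For contrast, here is a small sketch of the trait-based approach described above. It is deliberately synchronous and uses invented names (<code>Transport</code>, <code>send_greeting</code>, <code>InMemoryTransport</code>); it is not how Hyper or any particular library is structured. The library is written once against a trait, and a downstream system plugs in its own implementation without anything being upstreamed:</p>

```rust
/// Hypothetical trait a library might expose instead of a backend feature flag.
pub trait Transport {
    /// Sends the bytes and returns how many were accepted.
    fn send(&mut self, bytes: &[u8]) -> usize;
}

/// Library code: generic over the trait, unaware of any particular runtime.
pub fn send_greeting<T: Transport>(transport: &mut T) -> usize {
    transport.send(b"hello")
}

/// Downstream code: a system with its own primitives implements the trait
/// itself, inside its own crate.
struct InMemoryTransport {
    sent: Vec<u8>,
}

impl Transport for InMemoryTransport {
    fn send(&mut self, bytes: &[u8]) -> usize {
        self.sent.extend_from_slice(bytes);
        bytes.len()
    }
}
```

<p>Because any number of crates in a dependency graph can each implement <code>Transport</code> for their own types, this composes additively, which is where mutually exclusive feature flags run into trouble.</p>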
<h3 id="background-topic-futures-crate">Background topic: futures crate</h3>
<p>cramertj and I next talked some about the futures crate. Before going
much further into that, I want to give a bit of background on the
futures crate itself and how it&rsquo;s set up.</p>
<p>The futures crate has been very carefully set up to permit its
components to evolve with minimal breakage and incompatibility across
the ecosystem.  However, my experience from talking to people has been
that there is a lot of confusion as to how the futures crate is set up
and why, and just how much they can rely on things not to change. So I
want to spend a bit of time documenting <em>my</em> understanding of the setup
and its motivations.</p>
<p>Historically, the <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> crate has served as a kind of
experimental &ldquo;proving ground&rdquo; for various aspects of the future
design, including the <code>Future</code> trait itself (which is now in std).</p>
<p>Currently, the futures crate is at version 0.3, and it offers a number of
different categories of functionality:</p>
<ul>
<li>key traits like <a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a>, <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a>, and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a></li>
<li>key primitives like &ldquo;async-aware&rdquo; locks
<ul>
<li>traditional locks</li>
</ul>
</li>
<li>&ldquo;extension&rdquo; traits like <a href="https://docs.rs/futures/0.3.1/futures/future/trait.FutureExt.html"><code>FutureExt</code></a>, <a href="https://docs.rs/futures/0.3.1/futures/stream/trait.StreamExt.html"><code>StreamExt</code></a>, <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncReadExt.html"><code>AsyncReadExt</code></a>, and so forth
<ul>
<li>these traits offer convenient combinator methods like <code>map</code> that
are not part of the corresponding base traits</li>
</ul>
</li>
<li>useful macros like <a href="https://docs.rs/futures/0.3.1/futures/macro.join.html"><code>join!</code></a> or <a href="https://docs.rs/futures/0.3.1/futures/macro.select.html"><code>select!</code></a></li>
<li>useful bits of code such as a <a href="https://docs.rs/futures/0.3.1/futures/executor/struct.ThreadPool.html"><code>ThreadPool</code></a> for &ldquo;off-loading&rdquo; heavy computations</li>
</ul>
<p>In fact, the first item in that list (&ldquo;key traits&rdquo;) is quite distinct
from the remaining items. In particular, if you are writing a library,
those key traits are things that you might well like to have in your
public interface. For example, if you are writing a parser that
operates on a stream of data, it might take an <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> as its
data source (just as a synchronous parser would take a <a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a>).</p>
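<p>Sticking with the synchronous case for simplicity, this is roughly what &ldquo;a key trait in the public interface&rdquo; looks like. The function <code>count_nonempty_lines</code> is a made-up example; the only foreign item that callers see in its signature is the <code>Read</code> trait:</p>

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Public API: callers only need to agree with us on the `Read` trait.
pub fn count_nonempty_lines<R: Read>(source: R) -> io::Result<usize> {
    // `BufReader` and `lines` are implementation details. Swapping them for
    // something else would not change the signature above.
    let mut count = 0;
    for line in BufReader::new(source).lines() {
        if !line?.trim().is_empty() {
            count += 1;
        }
    }
    Ok(count)
}
```

<p>An async parser would have the same shape, with <code>AsyncRead</code> in place of <code>Read</code> in the signature.</p>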
<p>The remaining items on the list fall <em>generally</em> into the category of
&ldquo;implementation details&rdquo;. They ought to be &ldquo;private&rdquo; dependencies of
your crate.  For example, you may use methods from <a href="https://docs.rs/futures/0.3.1/futures/future/trait.FutureExt.html"><code>FutureExt</code></a>
internally, but you don&rsquo;t require other crates to use them; similarly
you may <a href="https://docs.rs/futures/0.3.1/futures/macro.join.html"><code>join!</code></a> futures internally, but that is not something that
would show up in a function signature.</p>
<h3 id="the-futures-crate-is-really-a-facade">the futures crate is really a facade</h3>
<p>One thing you&rsquo;ll notice if you look more closely at the <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a>
crate is that it is in fact composed of a number of smaller crates.
The <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> crate itself simply &rsquo;re-exports&rsquo; items from these
other crates:</p>
<ul>
<li><a href="https://crates.io/crates/futures-core"><code>futures-core</code></a> &ndash; defines the <a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a> trait (also the <code>Future</code>
trait, but that is an alias for std)</li>
<li><a href="https://crates.io/crates/futures-io"><code>futures-io</code></a> &ndash; defines the <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a> traits</li>
<li><a href="https://crates.io/crates/futures-util"><code>futures-util</code></a> &ndash; defines extension traits like <a href="https://docs.rs/futures/0.3.1/futures/future/trait.FutureExt.html"><code>FutureExt</code></a></li>
<li>&hellip;</li>
</ul>
<p>The goal of this facade is to permit things to evolve without forcing
semver-incompatible changes. For example, if the <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> trait
should evolve, we might be forced to issue a new major version of
<a href="https://crates.io/crates/futures-io"><code>futures-io</code></a> and thus ultimately issue a new <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> release
(say, 0.4). However, the version number of <a href="https://crates.io/crates/futures-core"><code>futures-core</code></a> remains
unchanged. This means that if your crate only depends on the
<a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a> trait, it will be interoperable across both <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> 0.3
and 0.4, since both of those versions are in fact re-exporting the
same <a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a> trait (from <a href="https://crates.io/crates/futures-core"><code>futures-core</code></a>, whose version has not
changed).</p>
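<p>The facade mechanism can be imitated inside a single file by using modules as stand-ins for crates. Everything below is a simplified model with invented names (<code>core_crate</code>, <code>facade_v3</code>, <code>facade_v4</code>, <code>Ticker</code>), and the trait has a placeholder method rather than the real <code>Stream</code> interface. It only demonstrates that a re-export is the <em>same</em> trait, whichever path you reach it by:</p>

```rust
/// Stand-in for `futures-core`: the one place where the trait is defined.
mod core_crate {
    pub trait Stream {
        fn describe(&self) -> &'static str;
    }
}

/// Stand-ins for two facade releases. Both re-export the same trait.
mod facade_v3 {
    pub use super::core_crate::Stream;
}
mod facade_v4 {
    pub use super::core_crate::Stream;
}

struct Ticker;

// A library implements the trait through the 0.3-style path...
impl facade_v3::Stream for Ticker {
    fn describe(&self) -> &'static str {
        "ticker"
    }
}

// ...and code written against the 0.4-style path accepts it unchanged,
// because both paths name the single trait defined in `core_crate`.
fn consume(stream: &impl facade_v4::Stream) -> &'static str {
    stream.describe()
}
```

<p>If the two facades each defined their own copy of the trait instead of re-exporting one, <code>consume(&amp;Ticker)</code> would fail to type-check.</p>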
<p>In fact, if you are a library crate, it probably behooves you to avoid
depending on the <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> crate at all, and instead to declare
finer-grained dependencies; this will make it very clear when you need
to declare a new semver release yourself.</p>
<h3 id="cramertj-the-best-place-for-standard-traits-is-in-std">cramertj: the best place for &ldquo;standard&rdquo; traits is in std</h3>
<p>So, background aside, let me return to my discussion with
cramertj. One of the points that cramertj made is that the only &ldquo;truly
standard&rdquo; place for a trait to live is libstd. Therefore, cramertj
feels like the next logical step for traits like <a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a> or
<a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> is to start moving them into the standard library.  Once
they are there, this would be the strongest possible signal that
people can rely on them not to change.</p>
<h3 id="we-can-move-to-libstd-without-breakage">we can move to libstd without breakage</h3>
<p>You may be wondering what it would mean if we moved one of the traits
from the <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> crate into libstd &ndash; would things in the
ecosystem that are currently using <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> have to update? The
answer is no, not necessarily.</p>
<p>Presuming that some trait from <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> is moved wholesale into
libstd (i.e., without <em>any</em> modification), then it is possible for us
to simply issue a new <em>minor version</em> of the <a href="https://github.com/rust-lang-nursery/futures-rs/"><code>futures</code></a> crate (and
the appropriate subcrate). This new minor version would change from
defining a trait (say, <code>Stream</code>) to re-exporting the version from std.</p>
<p>As a concrete example, if we moved <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> from <a href="https://crates.io/crates/futures-io"><code>futures-io</code></a>
to libstd (as cramertj advocates for later on), then we would issue a
<code>0.3.2</code> release of <a href="https://crates.io/crates/futures-io"><code>futures-io</code></a>. This release would replace <code>trait AsyncRead</code> with a <code>pub use</code> that re-exports <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> from std. Now,
any crate in the ecosystem that previously depended on <code>0.3.1</code> can be
transparently upgraded to <code>0.3.2</code> (it&rsquo;s a semver-compatible change,
after all)<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, and suddenly all references to
<a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> would be referencing the version from std. (This is, in
fact, exactly what happened with the <code>Future</code> trait; in 0.3.1, it is
simply <a href="https://docs.rs/futures-core/0.3.1/src/futures_core/future.rs.html#7">re-exported from libcore</a>.)</p>
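<p>To make the re-export idea concrete, here is a minimal single-file sketch. The module names <code>fake_std</code> and <code>futures_io</code> and the one-method <code>AsyncRead</code> trait are stand-ins invented for illustration (the real trait has a different, poll-based signature); the only point is that a <code>pub use</code> makes both paths name the same trait.</p>

```rust
// Hypothetical stand-in for the standard library after the trait has moved.
mod fake_std {
    pub mod io {
        // Simplified trait for illustration; NOT the real AsyncRead signature.
        pub trait AsyncRead {
            fn describe(&self) -> &'static str;
        }
    }
}

// Hypothetical stand-in for a later futures-io release: instead of
// defining the trait, it simply re-exports the "std" definition.
mod futures_io {
    pub use super::fake_std::io::AsyncRead;
}

struct MyReader;

// A downstream crate keeps implementing the trait via the old path...
impl futures_io::AsyncRead for MyReader {
    fn describe(&self) -> &'static str {
        "MyReader"
    }
}

// ...while other code can name the trait via the "std" path.
// Both paths refer to one and the same trait, so this type-checks.
fn use_std_trait<R: fake_std::io::AsyncRead>(reader: &R) -> &'static str {
    reader.describe()
}

fn main() {
    println!("{}", use_std_trait(&MyReader));
}
```

<p>If <code>futures_io</code> defined its own copy of the trait instead of re-exporting it, the call in <code>main</code> would fail to compile, because the two traits would be unrelated.</p>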
<h3 id="on-the-extension-traits">on the extension traits</h3>
<p>One of the interesting points that cramertj made, though not until
later in the interview, is that when it comes to futures there are a
number of &ldquo;smaller design decisions&rdquo; one might make when it comes to
combinators. For example, consider a function like <a href="https://docs.rs/futures/0.3.1/futures/stream/trait.StreamExt.html#method.filter"><code>Stream::filter</code></a>.
As defined in the <code>futures</code> crate, this function returns a &ldquo;future to a
boolean&rdquo;, so it has a signature like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">bool</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>This is effectively an async closure; I&rsquo;ll summarize what cramertj had
to say about async closures in one of the upcoming blog
posts. However, you might plausibly wish instead to have a signature
that just returns a boolean directly, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">bool</span>
</span></span></code></pre></div><p>For this reason, cramertj felt that it may make sense not to add these
sorts of utilities into the standard library (or at least not yet),
and instead to leave those extension traits in &ldquo;user space&rdquo;. Maybe when we have
more experience we&rsquo;ll be able to say what the best definition would be for
the standard library.</p>
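<p>As a rough illustration of the two signature shapes, the following std-only sketch filters a <code>Vec</code> rather than a real <code>Stream</code>. The helpers <code>filter_async</code>, <code>filter_sync</code>, and the tiny <code>block_on</code> are hypothetical names written for this example; they are not the <code>futures</code> crate API, and the executor is only adequate for futures that are ready immediately.</p>

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for futures that never return Pending.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Toy executor (assumption: the future completes without real I/O).
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

// Shape 1: the predicate returns a "future to a boolean".
async fn filter_async<T, F, Fut>(items: Vec<T>, mut pred: F) -> Vec<T>
where
    F: FnMut(&T) -> Fut,
    Fut: Future<Output = bool>,
{
    let mut kept = Vec::new();
    for item in items {
        if pred(&item).await {
            kept.push(item);
        }
    }
    kept
}

// Shape 2: the predicate returns a boolean directly.
fn filter_sync<T, F>(items: Vec<T>, mut pred: F) -> Vec<T>
where
    F: FnMut(&T) -> bool,
{
    items.into_iter().filter(|item| pred(item)).collect()
}

fn main() {
    let via_async = block_on(filter_async(vec![1, 2, 3, 4], |x: &i32| {
        std::future::ready(*x % 2 == 0)
    }));
    let via_sync = filter_sync(vec![1, 2, 3, 4], |x: &i32| *x % 2 == 0);
    assert_eq!(via_async, via_sync);
    println!("{:?}", via_sync);
}
```

<p>Both versions keep the same items here; the difference is that the first lets the predicate do asynchronous work, at the cost of a more involved signature for callers who only need a simple test.</p>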
<p>(If I may editorialize, I do think it&rsquo;s important that we add these
sorts of helper methods to std eventually; even if there&rsquo;s no single
best choice, we should make some decisions, because it&rsquo;ll be quite
annoying to force everything to pull in utility crates for simple
things.)</p>
<h3 id="upcoming-posts">upcoming posts</h3>
<p>OK, that wraps it up for the first post. I have two more coming. In
the next post, we&rsquo;ll discuss the design of the <a href="https://docs.rs/futures-core/0.3.1/futures_core/stream/trait.Stream.html"><code>Stream</code></a>, <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a>,
and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a> traits, and what we might want to change there. In
the final post, we&rsquo;ll discuss async closures.</p>
<h2 id="comments">Comments?</h2>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167/">thread on the Rust users forum</a> for this series.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>This change relies on the fact that cargo will generally not compile two distinct minor versions of a crate; so all crates that depend on <code>0.3.1</code> would be compiled against <code>0.3.2</code>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">AiC: Improving the pre-RFC process</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/12/03/aic-improving-the-pre-rfc-process/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/12/03/aic-improving-the-pre-rfc-process/</id><published>2019-12-03T00:00:00+00:00</published><updated>2019-12-03T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I want to write about an idea that Josh Triplett and I have been
iterating on to revamp the lang team RFC process. I have written a
<a href="https://github.com/nikomatsakis/project-staged-rfcs/blob/master/rfcs/0001-shepherded-rfcs.md">draft</a> of an RFC already, but this blog post aims to introduce the
idea and some of the motivations. The key idea of the RFC is to formalize
the steps leading <em>up</em> to an RFC, as well as to capture the lang team
operations around <strong>project groups</strong>. The hope is that, if this
process works well, it can apply to teams beyond the lang team as
well.</p>
<h3 id="tldr">TL;DR</h3>
<p>In a nutshell, the <a href="https://github.com/nikomatsakis/project-staged-rfcs/blob/master/rfcs/0001-shepherded-rfcs.md">proposal</a> is this:</p>
<ul>
<li>When you see a problem you think we should try to solve, you open an
issue on the <a href="https://github.com/rust-lang/lang-team/">lang-team</a> repository. This is called a <strong><a href="https://github.com/nikomatsakis/project-staged-rfcs/blob/master/rfcs/0001-shepherded-rfcs.md#proposal-issues">proposal issue</a></strong>.</li>
<li>In the <strong><a href="https://github.com/nikomatsakis/project-staged-rfcs/blob/master/rfcs/0001-shepherded-rfcs.md#proposal-issues">proposal issue</a></strong>, you include a description of the problem
and a link to a thread on <a href="https://internals.rust-lang.org/">internals</a> where the problem is being
discussed.
<ul>
<li>You might have a sketch of a solution in mind, but that&rsquo;s not
required. Even if there is a possible solution, we would always
expect to start by looking at different alternatives as well, to
make sure we&rsquo;re headed in the right overall direction.</li>
<li>Proposals would not be expected to use the full RFC
template. The idea is to be lightweight.</li>
<li>It is important that discussion does <strong>not</strong> take place on the issue.</li>
</ul>
</li>
<li>The lang-team <a href="https://github.com/nikomatsakis/project-staged-rfcs/blob/master/rfcs/0001-shepherded-rfcs.md#reviewing-proposals">periodically reviews those issues</a>. If someone on the
team likes the idea, we will create a &ldquo;project group&rdquo; around the
design. Each project group has a repository, a <a href="https://github.com/nikomatsakis/project-staged-rfcs/blob/master/rfcs/0001-shepherded-rfcs.md#lang-team-liason">lang team liaison</a>, and
one or more <a href="https://github.com/nikomatsakis/project-staged-rfcs/blob/master/rfcs/0001-shepherded-rfcs.md#shepherds">shepherds</a>. The repository houses the draft RFC and
potentially other documents, such as design notes.</li>
<li>The project group will continue working on the idea until it is
complete, meaning that the design has been implemented and become
stable. For smaller ideas, this could go quite quickly; for larger
ideas, it might take longer. (Of course, we may also decide to
cancel the idea at some point.)</li>
</ul>
<p>Note that I did not say anything yet about the main RFCs repository.
The idea is that, when a project group feels the design is ready, they
will open the RFC on the main repository. At that point, the RFC
represents a design that has already undergone a fair amount of
iteration. Moreover, the shepherds and lang team liaison should ensure
that the lang team is getting regular updates on the
progress. <strong>Therefore, the RFC process itself should go significantly
faster.</strong></p>
<p>One of my hopes is that a lighter and faster RFC process will also
mean that we can use RFCs for smaller decisions, and not just the
final design. For example, I think it&rsquo;d be useful to write an RFC
documenting a major choice in the direction, and then have follow-up
RFCs that work out some of the details. (This is somewhat similar to
the eRFC idea that <a href="https://github.com/rust-lang/rfcs/blob/master/text/2033-experimental-coroutines.md">we used for coroutines</a> but never
formalized.)</p>
<h3 id="goal-increased-transparency">Goal: Increased transparency</h3>
<p>One of the goals here is to increase our <strong>transparency</strong> &ndash;
specifically, I want it to be easier to follow along with the design
that is taking place. I also want you to be able to control how
&ldquo;deeply&rdquo; you follow along. I think that this proposal helps in two
ways:</p>
<ul>
<li>First, the lang team will have an active list of <strong>project groups</strong>
which represent the work that is being monitored by the team. This alone
gives a good overview of what we&rsquo;re doing.</li>
<li>Each project group should also have a repository documenting their
meetings and communication channels. A well-run group will also have
links to blog posts, discussion articles, or other documents. So if
you want to dig deeper into a design, or get involved, you can do it
that way.</li>
<li>Finally, the RFC repo itself is a good way to get an overview of
&ldquo;major&rdquo; decisions that are taking place. Monitoring this repo would
be a good way for you to raise a red flag if you see something that
has been overlooked. However, since RFCs will often be the result
of a lot more iteration and design, it wouldn&rsquo;t be the best place
for smaller bikeshedding.</li>
</ul>
<p>One thing that is worth emphasizing is that RFCs in this model will
not be &rsquo;early stage&rsquo; ideas. They will be the result of a lot more
iteration. This will frequently mean that we are not looking for
&ldquo;general feedback&rdquo; so much as specific, useful criticism.</p>
<h3 id="goal-clearer-on-ramp">Goal: Clearer on-ramp</h3>
<p>Another goal is to make a clearer &ldquo;on-ramp&rdquo; for getting the lang
team&rsquo;s attention. Right now, there isn&rsquo;t really a good way to
&ldquo;propose&rdquo; an idea and bring it to the lang team&rsquo;s attention. You can
create a thread on internals, but that is not guaranteed to be
seen. You can open an RFC, but if the idea is half-baked, you will get
pushback, and if it&rsquo;s highly developed, you might find that you&rsquo;ve
been going down the wrong road.</p>
<p>I feel like this procedure offers a clearer &ldquo;invitation&rdquo; for bringing
an idea forward. I think it&rsquo;s important though that we couple it with
lang-team procedures that help us ensure that we stay on top of
meeting proposals.</p>
<h3 id="putting-this-idea-into-practice">Putting this idea into practice</h3>
<p>One question that arises with this idea is what to do with the
existing RFC PRs on the repository. If we adopt this proposal, my plan
is to encourage authors to migrate those PRs to proposal issues
instead. After some period of time, we will close the RFC PRs (except
for those that have an active project group behind them). We could
also consider an automatic migration, but I think it might be useful
to be a bit more selective.</p>
<h3 id="lang-team-practice-and-serendipity">Lang team practice and serendipity</h3>
<p>Although it is not part of the RFC proper, I think that it is also
important for the lang-team to restructure how we operate a bit. I
would like us to use project groups to expose and declare the things
we are actively working on, and I think we should devote <em>most</em> of our
time to those things. But I also think we should reserve some time for
ideas that are not on that list.</p>
<p>I have two goals here. First, sometimes there are just smaller ideas
that will never be a kind of &ldquo;top priority&rdquo; but are nonetheless nice
to have. A prime example might be a syntactic addition like <code>if let</code>.</p>
<p>Second, sometimes there are nice ideas like <a href="https://github.com/rust-lang/rfcs/pull/2580">RFC 2580</a>. These ideas
have been well developed, and it might be good to move forward, but
it&rsquo;s hard to find the time to discuss them. As a result, the RFCs hang
about in a sort of &ldquo;limbo&rdquo;, where it&rsquo;s totally unclear whether
anything will ever happen.</p>
<p>I also expect that as part of this we will impose certain limits.  For
example, I don&rsquo;t think any one person should be shepherding or serving
as a liaison for more than a few things at a time &ndash; possibly just one
if the proposal is big enough. That will put an overall cap on how
much the lang team can try to do at one time, but that seems like a
good limit. The <a href="http://smallcultfollowing.com/babysteps/blog/2019/09/11/aic-shepherds-3-0/">Shepherding 3.0</a> blog post had more notes on
this topic.</p>
<p>I am hoping that if we have a clearer meeting queue, we can put ideas
like that on the list, and at least there will be a clear time to
discuss and decide definitively whether we can indeed move forward or
not.</p>
<h3 id="conclusion">Conclusion</h3>
<p>In general, you can think of the RFC process as a kind of &ldquo;funnel&rdquo;
with a number of stages. We&rsquo;ve traditionally thought of the process as
beginning at the point where an RFC with a complete design is opened,
but of course the design process <strong>really</strong> begins much
earlier. Moreover, a single bit of design can often span multiple
RFCs, at least for complex features &ndash; and, at least in our
current process, we often have changes to the design that occur during
the implementation stage as well. This can sometimes be difficult to
keep up with, even for lang-team members.</p>
<p>This post describes a revision to the process that aims to &ldquo;intercept&rdquo;
proposals at an earlier stage. It also proposes to create &ldquo;project
groups&rdquo; for design work and a dedicated repository that can house
documents. For smaller designs, these groups and repositories might be
small and simple. But for larger designs, they offer a space to
include a lot more in the way of design notes and other documents.</p>
<p>Assuming we adopt this process, one of the things I think we should be
working on is developing &ldquo;best practices&rdquo; around these
repositories. For example, I think that for every non-trivial design
decision, we should be creating a <a href="http://smallcultfollowing.com/babysteps/blog/2019/04/22/aic-collaborative-summary-documents/">summary document</a> that describes
the pros/cons and the eventual decision (along with, potentially,
comments from people who disagreed with that decision outlining their
reasoning).</p>
<p>We are already starting to experiment with this sort of process.  The
<a href="https://github.com/rust-lang/project-ffi-unwind">FFI-unwind project group</a>, for example, is pursuing an attempt to
decide on the rules regarding unwinding across FFI boundaries. And, as
I noted in <a href="http://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/">my post announcing the Async Interviews</a>, I&rsquo;d like
to see us collecting design notes for new traits and features that we
propose in the async space.</p>
<p>As always, I&rsquo;d love to hear your feedback. Please leave any comments
in the <a href="https://internals.rust-lang.org/t/aic-adventures-in-consensus/9843">internals thread devoted to the &ldquo;Adventures in Consensus&rdquo;
series</a>.</p>
<h3 id="thanks">Thanks</h3>
<p>I just wanted to add a &ldquo;Thank you!&rdquo; to Josh Triplett, who co-developed
a lot of these specific ideas with me, but also Withoutboats, Yoshua
Wuyts, Centril, Steve Klabnik, and the many others that have been
discussing variants of this proposal with me over time.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/consensus" term="consensus" label="Consensus"/></entry><entry><title type="html">Rust 2020</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/12/02/rust-2020/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/12/02/rust-2020/</id><published>2019-12-02T00:00:00+00:00</published><updated>2019-12-02T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Technically speaking, it&rsquo;s past the deadline for #rust2020 posts, but
I&rsquo;m running late this year, and I&rsquo;m going to post something anyway.
In this post, I am focusing on what I see as the &ldquo;largest scale&rdquo;
issues, and not on technical initiatives. If I have time, I will try
to post a follow-up talking about some of the key technical
initiatives that I think we should focus on as well.</p>
<h1 id="tldr">TL;DR</h1>
<ul>
<li>We should do an edition, and we should plan for it now</li>
<li>The time is ripe to talk about encouraging investment from companies</li>
<li>A foundation is perhaps part of the solution, but not the whole
solution; we should encourage active participation from stakeholders</li>
<li>Organizational improvements can also encourage investment</li>
<li>Organizationally, we&rsquo;ve done a lot in 2019, and we can do more in 2020</li>
</ul>
<h1 id="we-should-think-on-longer-timescales">We should think on longer timescales</h1>
<p>One of the questions we asked this year was whether we should plan for
a Rust 2021 edition. I feel pretty strongly that the answer is yes.
There are a few reasons for this.</p>
<p>One of the biggest reasons, for me, is that I think it is very
healthy for us to be planning on a 3-year timescale. The fact is that
many of our projects these days take years to bring to completion. It
is good for us to talk about roadmaps, but it is also good for us to
look to a slightly longer horizon.</p>
<p>I don&rsquo;t necessarily think this kind of &ldquo;long range&rdquo; planning should be
about specific goals and features, but more about areas of
focus. Moreover, I think we should be careful to control our ambitions
&ndash; I think for example that, in thinking about 2019, we outlined a
number of features that are far more realistic on a multi-year
timescale.</p>
<h1 id="plan-for-edition-changes-early">Plan for edition changes early</h1>
<p>Editions also, of course, give us the option to make changes we
couldn&rsquo;t otherwise make. For those who aren&rsquo;t familiar, editions let
us make &ldquo;backwards incompatible&rdquo; changes to Rust &ndash; but in a way that
keeps old code working. These changes might be something as small as
adding a keyword, or as large as the module reform we made in
Rust 2018. The beauty of editions is that, since they are opt-in at a
crate granularity, we are able to keep supporting older crates
seamlessly. This means we can improve the language gradually without
forcing the entire ecosystem to upgrade in a coordinated fashion.</p>
<p>In Rust 2018, we made a number of these sorts of &ldquo;migrations&rdquo;:</p>
<ul>
<li>We modified <code>use</code> statements to introduce the <code>use crate::foo</code> notation</li>
<li>We transitioned to the <code>dyn Trait</code> syntax</li>
<li>We introduced a few keywords</li>
</ul>
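<p>For readers who have not seen the Rust 2018 forms, here is a small sketch that uses the first two items from the list above. The <code>shapes</code> and <code>report</code> modules and the <code>Shape</code> trait are made-up names for illustration; the example assumes the file is the crate root.</p>

```rust
mod shapes {
    pub trait Shape {
        fn area(&self) -> f64;
    }

    pub struct Square(pub f64);

    impl Shape for Square {
        fn area(&self) -> f64 {
            self.0 * self.0
        }
    }
}

mod report {
    // Rust 2018 path syntax: `crate::` names the crate root explicitly.
    use crate::shapes::Shape;

    // Rust 2018 trait-object syntax: `dyn Shape` rather than a bare `Shape`.
    pub fn total_area(items: &[Box<dyn Shape>]) -> f64 {
        items.iter().map(|shape| shape.area()).sum()
    }
}

fn main() {
    let items: Vec<Box<dyn shapes::Shape>> = vec![
        Box::new(shapes::Square(2.0)),
        Box::new(shapes::Square(3.0)),
    ];
    println!("{}", report::total_area(&items));
}
```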
<p>Crucially, we also provided <strong>tooling</strong> to automate these migrations.
This is what made changes like the first change possible at all, since
that change affected almost every crate ever written.</p>
<p>I don&rsquo;t expect us to do anything as dramatic as changing <code>use</code>
statements in Rust 2021, but I am confident we are going to want to
make a few backwards incompatible changes. I don&rsquo;t know exactly what
they will be yet, but I do know that <strong>now</strong> is the time to start
planning them &ndash; we want to be front-loading that kind of work so that
we can have time to work on the documentation, migration tooling, and
other things that we will need.</p>
<p><a href="https://blog.yoshuawuyts.com/rust-2020/">Yosh&rsquo;s #rust2020 post</a> covered this topic quite well, I think. In
the <a href="https://blog.yoshuawuyts.com/rust-2020/#timeline">timeline section</a>, he breaks down the time available, concluding with:</p>
<blockquote>
<p>All together that leaves us with about 12 months total to plan and
prepare the next edition release, starting January 2020. This should
be enough time to successfully plan and draft a new edition, with
some slack to work with.</p>
</blockquote>
<h1 id="we-are-seeing-increased-investment-from-many-companies">We are seeing increased investment from many companies</h1>
<p>2019 marked a real turning point when it comes to companies using and
supporting Rust. I remember the time when everybody I met who used
Rust was a hobbyist. Then we started to see startups and <a href="https://www.rust-lang.org/static/pdfs/Rust-Tilde-Whitepaper.pdf">smaller
companies experimenting with Rust</a>, looking for a way to boost
their productivity when writing low-level systems code. And now we
have major companies like Microsoft, Amazon, Facebook, and Google
adopting Rust for major projects. Somewhat unexpectedly, to me anyway,
Rust has become the language of choice for a lot of Blockchain
companies.</p>
<p>This increasing adoption has also begun to translate to increased
investment in Rust itself. <a href="https://internals.rust-lang.org/t/update-on-the-ci-investigation/10056/9?u=nikomatsakis">Microsoft</a> and <a href="https://aws.amazon.com/blogs/opensource/aws-sponsorship-of-the-rust-project/">Amazon</a>, for example, are
now sponsoring the majority of Rust&rsquo;s CI costs. A big part of the
async-await development was done by developers on Google&rsquo;s Fuchsia
team. And so forth.</p>
<h1 id="but-we-need-more-people-paid-to-work-on-rust">But we need more people paid to work on Rust</h1>
<p>Nonetheless, for Rust to really thrive, we need to see more people
paid for their work on Rust teams. As Erin put it in <a href="https://xampprocky.github.io/public/blog/rust-2021/">her #rust2020
post</a>,</p>
<blockquote>
<p>When 1.0 launched there was ~30 members of The Rust Programming
Language, now in 2019 we have ~200 members. This is nearly 7x the
amount of members, yet we&rsquo;ve changed very little to be able to adapt
to this growth. No where is this more evident than out of the now
200 or so members, the number that are paid for their time on Rust
is still in the single digits, and this doesn&rsquo;t look to change any
time soon.</p>
</blockquote>
<p>One thing I&rsquo;ve observed time and time again is that bigger, complex
projects really require dedicated leadership and organization &ndash; and
this often takes a vast amount of time. I talked some about this in my
post on <a href="http://smallcultfollowing.com/babysteps/blog/2019/04/15/more-than-coders/">&ldquo;More than coders&rdquo;</a>. The plain fact is that this kind of
time is often unavailable on a volunteer basis.</p>
<h1 id="shifting-the-focus-from-adoption-to-investment">Shifting the focus from <em>adoption</em> to <em>investment</em></h1>
<p>In years past, when we thought about companies and Rust, a big part
of the focus was on encouraging <strong>adoption</strong>. But I think at this
point it&rsquo;s time for us to start encouraging <strong>investment</strong>. There are
a lot of companies using Rust now, and the time is ripe to ask
ourselves <strong>how we can help those companies to help Rust</strong>.</p>
<p>But when we ask those questions, I want us to be careful in our
thinking. I don&rsquo;t think there&rsquo;s a single, simple answer for how to
increase investment in Rust. In fact, I don&rsquo;t even think there&rsquo;s a
single answer to <strong>what investment is</strong>. I would love, of course, to
see more people hired to work 100% on Rust. But there are so many
other ways to invest in Rust:</p>
<ul>
<li>Sponsoring Rust conferences, meetups, or other social events</li>
<li>Sponsoring employees to attend the Rust All Hands</li>
<li>Encouraging employees to spend work time working on Rust as a sort
of &ldquo;20% project&rdquo;</li>
<li>Building ecosystem libraries that everyone can use</li>
<li>Sponsoring Rust&rsquo;s CI or other infrastructure</li>
<li>Sponsoring the Rust All Hands, Increasing Rust&rsquo;s Reach, or other
Rust org initiatives directly</li>
<li>Using contracting or grants to support the maintainers of
the Rust project or key figures in the Rust ecosystem</li>
</ul>
<h1 id="many-are-stronger-than-one">Many are stronger than one</h1>
<p>Even though Rust was started by Mozilla, Mozilla never wanted to &ldquo;own&rdquo;
Rust. <strong>We&rsquo;ve always wanted Rust to have its own identity and to be
supported by many companies and groups, big and small.</strong>
Fundamentally, this is because having many stakeholders makes for a
better, more robust language. Part of what accounts for Rust&rsquo;s success
is that we&rsquo;ve attracted a diverse set of contributors, who were able
to push us to improve the design in any number of ways.</p>
<p>I mention this because, as we talk about money, I think we will also
need to address the question of whether to form a Rust foundation. I
am increasingly thinking that this is a good idea. I think that having
a central legal entity that represents Rust could solve some
challenges for us, for example, and I also think having a central bank
account could help for &ldquo;group funding&rdquo; of infrastructure, events like
the Rust All Hands, or programs like Increasing Rust&rsquo;s Reach. But I
<strong>don&rsquo;t</strong> expect this Rust foundation to directly &ldquo;solve the problem&rdquo;
of paying people to work on Rust, nor do I think it should. I would
expect it rather to be one piece of a larger puzzle.</p>
<p>Whatever we wind up with, I think it&rsquo;s important to encourage
companies that use Rust to employ key figures that actively
participate in the Rust organization (whether that be full- or
part-time). We don&rsquo;t want a setup where the Rust organization is the
foundation, supported financially by others. We want a setup where the
Rust organization is directly composed, as much as possible, of its
users and stakeholders, all working together.</p>
<h1 id="improving-our-organization-can-lead-to-increased-investment">Improving our organization can lead to increased investment</h1>
<p>I think a lot of the changes we need may be more <strong>organizational</strong>
than specifically to do with money. For example, I enjoyed reading
<a href="https://www.parity.io/rust-2020/">Parity&rsquo;s #rust2020 post</a>, and I was particularly struck by
this paragraph (emphasis mine):</p>
<blockquote>
<p>For many of the issues raised above, we are also happy to jump in
and help out–and on other issues as well. We are a Rust company
after all—we believe in the language, its ecosystem and the
community, and want to be a valuable participant in it. &hellip; However,
it’s often unclear whether the work is worthwhile. <strong>To a business, it
is hard to argue that one might spend a month or two working on a
new feature without any assurance that the approach taken would be
accepted.</strong></p>
</blockquote>
<p>This is something that&rsquo;s been on my mind quite a bit lately as well.
<strong>If you are a company or organization that would like to help make
changes to Rust, how do you go about it?</strong> I&rsquo;ve been getting this
question more and more lately as I go and talk to
companies. Sometimes, the question pertains to a single feature, like
custom test frameworks, or custom allocators.  Other times, the
question is about a broader initiative &ndash; think of the <a href="https://ferrous-systems.com/blog/sealed-rust-the-pitch/">Sealed Rust</a>
pitch that Ferrous Systems posted some time back.</p>
<p>In principle, the RFC process is supposed to help serve these needs,
but I don&rsquo;t think in practice it&rsquo;s working very well. I think though
that we can tweak and improve our system to overcome some of those
shortcomings. What&rsquo;s more, if we do, I think that same system won&rsquo;t be
specific to <strong>companies</strong>. After all, if you&rsquo;re a volunteer
contributor interested in pushing on a specific feature, you face the
same problem. (This, for example, is precisely the problem that
<a href="http://smallcultfollowing.com/babysteps/blog/2019/09/11/aic-shepherds-3-0/">shepherding</a> is taking aim at.)</p>
<h1 id="2019-saw-a-lot-of-progress-in-organizational-matters">2019 saw a lot of progress in organizational matters</h1>
<p>Organizationally, I&rsquo;m quite proud of all the work that we did during
2019, even though I still think we&rsquo;ve got a lot of room to go. Just
looking at the compiler team, for example, we really refined the
concept of working groups, we clarified the concept of <a href="https://rust-lang.github.io/rfcs/2689-compiler-team-contributors.html">compiler team
contributors</a>, and we introduced other innovations like the
<a href="https://rust-lang.github.io/compiler-team/about/steering-meeting/">weekly design meetings</a>. These meetings have meant not only that we
just have a lot more communication as a team, they&rsquo;re also great for
people looking to eavesdrop and learn more about how the compiler
works. The lang team is <a href="https://github.com/rust-lang/lang-team/tree/master/minutes">publishing its minutes</a> and
(frequently) recordings of our meetings, which are also open for
anyone to attend; the core team is also <a href="https://github.com/rust-lang/core-team/blob/master/minutes/core-team/meetings.md">publishing recordings</a> on
a best-effort basis. The infrastructure team has made great strides in
documenting their procedures on <a href="https://github.com/rust-lang/core-team/tree/master/minutes/project-leadership-sync">forge</a>, as have other
teams. At the project level, we introduced the <a href="https://github.com/rust-lang/core-team/tree/master/minutes/project-leadership-sync">leadership sync
meeting</a>, the <a href="https://blog.rust-lang.org/inside-rust/">Inside Rust blog</a>, and we&rsquo;ve been trying to
get a <a href="https://blog.rust-lang.org/inside-rust/2019/11/13/goverance-wg-cfp.html">governance-focused WG</a> off the ground.</p>
<h1 id="we-can-do-even-more-in-2020">We can do even more in 2020</h1>
<p>Over the next year, I&rsquo;d like to see more progress on how the project
operates. Some of the goals I think we should be working towards:</p>
<ul>
<li>active mentorship to help leads formulate roadmaps and plans, as was
discussed in the <a href="https://rust-lang.github.io/compiler-team/minutes/design-meeting/2019-11-16-Working-Group-Retrospective/">recent compiler-team retrospective</a></li>
<li>documenting all of our governance procedures and other details on <a href="https://github.com/rust-lang/core-team/tree/master/minutes/project-leadership-sync">forge</a></li>
<li>more transparency about our priorities, and a clearer process for
requesting that something be <em>made</em> into a priority</li>
<li>improving <em>followthrough</em> and avoiding <a href="http://smallcultfollowing.com/babysteps/blog/2019/07/10/aic-unbounded-queues-and-lang-design/">unbounded queues</a>; when
we start designing a feature, we should see that effort through to
the end before we pick up new things</li>
<li>extending our governance to cover &ldquo;cross-cutting projects&rdquo;, which
draw on the expertise from many teams; right now, for example, the
&ldquo;handoff&rdquo; between the lang team doing the design for a feature and
the compiler team starting to implement is informal and often just
fails to happen</li>
</ul>
<h1 id="conclusion">Conclusion</h1>
<p>As I wrote in the beginning, I&rsquo;ve not tried to address technical
initiatives in this post. I have thoughts on those too, and I think I
will try to do some follow-ups there. In summary, for Rust 2020, I
believe:</p>
<ul>
<li>We should do a 2021 Edition, and we should start the planning now.</li>
<li>We&rsquo;ve succeeded at encouraging Rust <em>adoption</em>, and we should start thinking about encouraging <em>investment</em>.</li>
<li>Improving how the Rust organization operates continues to be a
pressing need, and will help everything, including investment.</li>
</ul>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Async Interview #1: Alex and Nick talk about async I/O and WebAssembly</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/11/28/async-interview-1-alex-and-nick-talk-about-async-i-o-and-webassembly/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/11/28/async-interview-1-alex-and-nick-talk-about-async-i-o-and-webassembly/</id><published>2019-11-28T00:00:00+00:00</published><updated>2019-11-28T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello from Iceland! (I&rsquo;m on vacation.) I&rsquo;ve just uploaded <a href="https://youtu.be/vR0Ry830Hw8">the first
of the Async Interviews</a> to YouTube. It is a conversation with Alex
Crichton (<a href="https://github.com/alexcrichton">alexcrichton</a>) and Nick Fitzgerald (<a href="https://github.com/fitzgen">fitzgen</a>) about how
WebAssembly and Rust&rsquo;s Async I/O system interact. When you watch it,
you will probably notice two things:</p>
<ul>
<li>First, I spent a lot of time looking off to the side! This is
because I had the joint Dropbox paper document open on my side
monitor and I forgot how strange that would look. I&rsquo;ll have to
remember that for the future. =)</li>
<li>Second, we recorded this on October 3rd<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, which was before
async-await had landed on stable. So at various points we talk about
async-await being on beta or not yet being stable. Don&rsquo;t be
confused. =)</li>
</ul>
<h3 id="video">Video</h3>
<p>You can view the <a href="https://youtu.be/vR0Ry830Hw8">video</a> on YouTube, but it is also embedded
here if that&rsquo;s easier for you.</p>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/vR0Ry830Hw8" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>
<h3 id="rust-futures-meet-javascript-promises">Rust futures, meet JavaScript promises</h3>
<p>The first part of the chat focused on the interaction of Rust futures
with JavaScript&rsquo;s promises. As fitzgen points out early on, on the web
platform, there is no notion of synchronous I/O &ndash; only async<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>.
So if you want to execute Rust in the browser, you need to be using
async I/O. Fortunately, the <a href="https://rustwasm.github.io/wasm-bindgen/reference/js-promises-and-rust-futures.html">current tooling makes this pretty easy</a>!</p>
<p>The <a href="https://github.com/rustwasm/wasm-bindgen/blob/df34cf843eca7478e3879562670e52c889e32fdf/examples/fetch/src/lib.rs">fetch example</a> from the <a href="https://github.com/rustwasm/wasm-bindgen">wasm-bindgen</a> site kind of shows off
some of what is possible. The example shows a Rust function that
downloads content from the web by invoking the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API">fetch</a> function. To
start, it shows you can <a href="https://github.com/rustwasm/wasm-bindgen/blob/df34cf843eca7478e3879562670e52c889e32fdf/examples/fetch/src/lib.rs#L35-L36">write a Rust <code>async fn</code> that gets exported to
JavaScript</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[wasm_bindgen]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">run</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="n">JsValue</span><span class="p">,</span><span class="w"> </span><span class="n">JsValue</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When <a href="https://github.com/rustwasm/wasm-bindgen/blob/df34cf843eca7478e3879562670e52c889e32fdf/examples/fetch/index.js#L5">users invoke this <code>run</code> function from
JS</a>,
it acts just like a JavaScript asynchronous function, which means that
it returns a JS promise. This is possible because wasm-bindgen
includes the ability to interconvert between JS promises and Rust futures. You can see
this at play also within the <code>run</code> function, which <a href="https://github.com/rustwasm/wasm-bindgen/blob/df34cf843eca7478e3879562670e52c889e32fdf/examples/fetch/src/lib.rs#L50-L51">invokes <code>fetch</code> and then
converts the result into a Rust future</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">window</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">web_sys</span>::<span class="n">window</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">resp_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">JsFuture</span>::<span class="n">from</span><span class="p">(</span><span class="n">window</span><span class="p">.</span><span class="n">fetch_with_request</span><span class="p">(</span><span class="o">&amp;</span><span class="n">request</span><span class="p">)).</span><span class="k">await</span><span class="o">?</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>Note the <code>JsFuture::from</code> call in particular, which converts the
JavaScript promise into a Rust future that can then be awaited.  (You
can go the other way using <a href="https://docs.rs/wasm-bindgen-futures/0.4.1/wasm_bindgen_futures/fn.future_to_promise.html"><code>future_to_promise</code></a>, which converts a
Rust future into a JavaScript promise.)</p>
<p>As you can see from this example, a lot of the basic tooling for
interoperating with JavaScript promises exists, but it&rsquo;s also still at
a fairly low level. The <a href="https://crates.io/crates/web-sys">web-sys</a> crate used in the <a href="https://github.com/rustwasm/wasm-bindgen/blob/df34cf843eca7478e3879562670e52c889e32fdf/examples/fetch/src/lib.rs">fetch example</a>
exports all the basic APIs, but does so in an untyped fashion, which
means that using them from Rust can be error-prone. The <a href="https://crates.io/crates/gloo">gloo</a> crate
is an attempt to build a more idiomatic layer atop, but that is still
fairly close to the JS APIs. Crates like <a href="https://github.com/http-rs/surf">surf</a> offer a higher-level
alternative. <a href="https://github.com/http-rs/surf">surf</a> is an interface for fetching things off the web
(think <a href="https://curl.haxx.se/libcurl/"><code>libcurl</code></a>), and it <a href="https://docs.rs/surf/1.0.3/surf/#features">can be compiled</a> to use a number of
backends, including the JS <code>fetch</code> API (which only works when
compiling to WebAssembly, of course).</p>
<p>If you&rsquo;re interested to learn more, here are some links to get you started:</p>
<ul>
<li><a href="https://docs.rs/wasm-bindgen-futures/0.4.1/wasm_bindgen_futures/">wasm-bindgen-futures API docs</a></li>
<li><a href="https://rustwasm.github.io/wasm-bindgen/reference/js-promises-and-rust-futures.html">JS Promises and Rust futures from the wasm-bindgen reference</a></li>
<li>the <a href="https://github.com/rustwasm/wasm-bindgen">wasm-bindgen</a> repository</li>
</ul>
<h3 id="webassembly-outside-the-browser">WebAssembly outside the browser</h3>
<p>Next we discussed what it means to use Async I/O <em>outside</em> of the
browser. This part of the space is much less developed. When you are
running outside the browser, there aren&rsquo;t standard APIs like <a href="https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API">fetch</a>
to build off of. This is where <a href="https://wasi.dev/">WASI</a> comes in. It is an effort to
build up a standardized set of APIs that WebAssembly apps can run
against which will work both inside and outside the browser. (To learn
more about <a href="https://wasi.dev/">WASI</a>, I recommend Lin Clark&rsquo;s excellent <a href="https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webassembly-system-interface/">Mozilla Hacks
blog
post</a>.)</p>
<p>At this point, the conversation turned much more speculative. WASI
doesn&rsquo;t yet include any asynchronous APIs, but it seems clear that
they will be needed. Alex and Nick felt that some of the things we&rsquo;ve
learned in the Rust Async I/O effort would likely inform the design
going forward. For example, WASI might export something vaguely epoll
or mio-like, and let the languages supply the higher-level APIs that
build on that. Still, there has also been talk of including direct
support in WASI for protocols like HTTP, which might necessitate a
different sort of interface altogether.</p>
<p>One of the interesting things we discussed is that WASI is very
<em>capability driven</em>. Typically in programming languages, we aim to
expose a small set of powerful, flexible primitives that can be used
to do almost anything &ndash; but this has downsides, too. Offering
higher-level options means you can build a more effective sandbox. For
example, it might be useful to have some way to hand-off an open HTTP
socket to a WASM<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> program, which would mean that it can read and write
through that connection, but that it cannot necessarily create new
HTTP connections to other servers, or speak other protocols.</p>
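<p>To make the capability idea a bit more concrete, here is a minimal Rust sketch. It is only an illustration, not anything WASI defines: <code>handle_connection</code> and <code>run_with_fake_socket</code> are hypothetical names, and an in-memory <code>Cursor</code> stands in for a real socket. The point is that the &ldquo;guest&rdquo; function is handed an already-open connection and nothing else, so its signature gives it no way to connect anywhere new.</p>

```rust
use std::io::{self, Cursor, Read, Write};

/// Hypothetical "guest" code. Its only capability is the connection it is
/// handed: it can read and write through `conn`, but nothing in its
/// signature lets it open a new connection or speak another protocol.
fn handle_connection<C: Read + Write>(conn: &mut C) -> io::Result<usize> {
    let mut request = [0u8; 64];
    let n = conn.read(&mut request)?; // read (part of) the request
    conn.write_all(b"ok")?; // reply over the same connection
    Ok(n)
}

/// Hypothetical "host" code: it decides what to hand over. An in-memory
/// `Cursor` plays the role of the open socket in this sketch.
fn run_with_fake_socket() -> io::Result<(usize, Vec<u8>)> {
    let mut conn = Cursor::new(b"ping".to_vec());
    let n = handle_connection(&mut conn)?;
    // Return the number of bytes read and everything that passed through.
    Ok((n, conn.into_inner()))
}
```

<p>A real sandbox would enforce this at the WebAssembly boundary rather than through Rust&rsquo;s type system, but the shape is the same: the host chooses which handles exist, and the guest can only use what it was given.</p>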
<h3 id="what-does-this-mean-for-rust-async">What does this mean for Rust async?</h3>
<p>In the last few minutes, we turned our discussion to Rust async
itself. What developments in Rust async would be most helpful to WASM?
Since there are still so many unknowns, especially when it comes to
WASI, this is a bit hard to say for sure, but Nick and Alex had a few
things to say:</p>
<ul>
<li>First, in terms of JS interop, building more crates like <a href="https://github.com/http-rs/surf">surf</a>,
which can be configured to target web APIs, so that people can write
Rust programs that work equally well on the web or running natively.</li>
<li>Second, to support that effort, better support for async fn in
traits (and perhaps other language primitives). The basic idea is
that, on the web, async fn is just the only way to do I/O &ndash; this
means we need it to work seamlessly across all of Rust&rsquo;s language
features. (As usual, I&rsquo;ll refer folks to dtolnay&rsquo;s <a href="https://github.com/dtolnay/async-trait">async-trait</a>
crate, which is presently the best way to write crates that use
async fn, for a <a href="http://smallcultfollowing.com/babysteps/blog/2019/10/26/async-fn-in-traits-are-hard/">host of reasons</a>.)</li>
<li>Finally, support for async streams (and async generator functions),
so as to interoperate with their JavaScript equivalents.</li>
</ul>
<h3 id="comments">Comments?</h3>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167">thread on the Rust users forum</a> for questions and
discussion.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Yeah&hellip; I&rsquo;ve been planning to start these async interviews for a while! Just been busy.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>We do not speak of <a href="https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Synchronous_and_Asynchronous_Requests#Synchronous_request">synchronous XHR</a> on this blog. Or ever.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://webassembly.github.io/threads/intro/introduction.html#wasm">According to the spec</a>, &ldquo;WASM&rdquo; is a contraction and hence should not be capitalized. I however maintain that it plainly fits <a href="https://www.collinsdictionary.com/dictionary/english/acronym">the definition of an acronym</a>. Moreover, if it were a contraction, it would be written w&rsquo;asm&rsquo;, and nobody wants that. So I&rsquo;m going with WASM. Sue me.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">Announcing the Async Interviews</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/11/22/announcing-the-async-interviews/</id><published>2019-11-22T00:00:00+00:00</published><updated>2019-11-22T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello all! I&rsquo;m going to be trying something new, which I call the
<strong>&ldquo;Async Interviews&rdquo;</strong>. These interviews are going to be a series of
recorded video calls with various &ldquo;luminaries&rdquo; from Rust&rsquo;s Async I/O
effort. In each one, I&rsquo;m going to be asking roughly the same question:
<strong>Now that the async-await MVP is stable, what should we be doing
next?</strong> After each call, I&rsquo;ll post the recording from the interview,
along with a blog post that gives a brief summary.</p>
<p>My intention in these interviews is to really get into details. That
is, I want to talk about what our big picture goals should be, but
also what the <em>specific concerns</em> are around stabilizing particular
traits or macros. What sorts of libraries do they enable? And so
forth. (You can view my rough <a href="https://gist.github.com/nikomatsakis/ae2ede32c4c7d49cbda088a1539724d9">interview script</a>, but I plan to tailor
the meetings as I go.)</p>
<p>I view these interviews as serving a few purposes:</p>
<ul>
<li>Help to survey what different folks are thinking and transmit that
thinking out to the community.</li>
<li>Help me to understand better what some of the tradeoffs are,
especially around discussions that occurred before I was following
closely.</li>
<li>Experiment with a new form of Rust discussion, where we substitute
1-on-1 exploration and discussion for bigger discussion threads.</li>
</ul>
<h3 id="first-video-rust-and-webassembly">First video: Rust and WebAssembly</h3>
<p>The first video in this series, which I expect to post next week, will
be me chatting with <strong>Alex Crichton</strong> and <strong>Nick Fitzgerald</strong> about
<strong>Async I/O and WebAssembly</strong>. This video is a bit different from the
others, since it&rsquo;s still early days in that area &ndash; as a result, we
talked more about what role Async I/O (and Rust!) might eventually
play, and less about immediate priorities for Rust. Along with the
video, I&rsquo;ll post a blog post summarizing the main points that came up
in the conversation, so you don&rsquo;t necessarily have to watch the video
itself.</p>
<h3 id="what-videos-will-come-after-that">What videos will come after that?</h3>
<p>My plan is to be posting a fresh async interview roughly once a week.
I&rsquo;m not sure how long I&rsquo;ll keep doing this &ndash; I guess as long as it
seems like I&rsquo;m still learning things. I&rsquo;ll announce the people I plan
to speak to as I go, but I&rsquo;m also very open to suggestions!</p>
<p>I&rsquo;d like to talk to folks who are working on projects at all levels of
the &ldquo;async stack&rdquo;, such as runtimes, web frameworks, protocols, and
consumers thereof. If you can think of a project or a person that you
think would provide a useful perspective, I&rsquo;d love to hear about
it. Drop me a line via e-mail or on Zulip or Discord.</p>
<h3 id="creating-design-notes">Creating design notes</h3>
<p>One thing that I have found in trying to get up to speed on the design
of Async I/O is that the discussions are often quite distributed,
spread amongst issues, RFCs, and the like. I&rsquo;d like to do a better job
of organizing this information.</p>
<p>Therefore, as part of this effort to talk to folks, one of the things
I plan to be doing is to collect and catalog the concerns, issues, and
unknowns that are being brought up. <strong>I&rsquo;d love to find people to help
in this effort!</strong> If that is something that interests you, come join
the <a href="https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-async-foundations">#wg-async-foundations stream</a> on <a href="https://rust-lang.zulipchat.com">the rust-lang Zulip</a> and say
hi!</p>
<h3 id="so-what-are-the-things-we-might-do-now-that-async-await-is-stable">So what <em>are</em> the things we might do now that async-await is stable?</h3>
<p>If you take a look at my rough <a href="https://gist.github.com/nikomatsakis/ae2ede32c4c7d49cbda088a1539724d9">interview script</a>, you&rsquo;ll see a long
list of possibilities. But I think they break down into two big
categories:</p>
<ul>
<li>improving interoperability</li>
<li>extending expressive power, convenience, and ergonomics</li>
</ul>
<p>Let&rsquo;s look a bit more at those choices.</p>
<h3 id="improving-interoperability">Improving interoperability</h3>
<p>A long time back, Rust actually had a built-in green-threading
library.  It was removed in <a href="https://gist.github.com/nikomatsakis/ef21d903717ef20b8bbf4ae5c1c03ba0">RFC #230</a>, and a big part of the
motivation was that we knew we were unlikely to find a <em>single runtime
design</em> that was useful for all tasks. And, even if we could, we
certainly knew we hadn&rsquo;t found it <em>yet</em>. Therefore, we opted to pare
back the stdlib to just expose the primitives that the O/S had to
offer.</p>
<p>Learning from this, our current design is intentionally much more
&ldquo;open-ended&rdquo; and permits runtimes to be added as simple crates on
crates.io. Right now, to my knowledge, we have at least five distinct
async runtimes for Rust, and I wouldn&rsquo;t be surprised if I&rsquo;ve forgotten
a few:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<ul>
<li><a href="https://fuchsia.googlesource.com/">Fuchsia&rsquo;s runtime</a>, used for the Fuchsia work at Google;</li>
<li><a href="https://tokio.rs/">tokio</a>, a venerable, efficient runtime with a rich feature set;</li>
<li><a href="https://async.rs/">async-std</a>, a newer contender which aims to couple libstd-like APIs
with highly efficient primitives;</li>
<li><a href="https://bastion.rs/">bastion</a>, exploring a resilient, Erlang-like model<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>;</li>
<li><a href="https://github.com/Nemo157/embrio-rs">embrio-rs</a>, exploring the embedded space.</li>
</ul>
<p>I think this is great: I love to see people experimenting with
different tradeoffs and priorities. Not only do I think we&rsquo;ll wind up
with better APIs and more efficient implementations, this also means
we can target &lsquo;exotic&rsquo; environments like the Fuchsia operating system
or smaller embedded platforms. Very cool stuff.</p>
<p>However, that flexibility does come with some real risks. Most
notably, I want us to be sure that it is possible to &ldquo;mix and match&rdquo;
libraries from the ecosystem. No matter what base runtime you are
using, it should be possible to take a protocol implementation like
<a href="https://github.com/djc/quinn">quinn</a>, combine it with &ldquo;middleware&rdquo; crates like <a href="https://github.com/Nemo157/async-compression">async-compression</a>,
and start sending payloads.</p>
<p>In my mind, the best way to ensure interoperability is to ensure that
we offer standard traits that define the interfaces between
libraries. Adding the <code>std::future::Future</code> trait was a huge step in this
direction &ndash; it means that you can create all kinds of combinators and
things that are fully portable between runtimes. But what are the
<em>next</em> steps we can take to help improve things further?</p>
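<p>As a rough illustration of what &ldquo;fully portable&rdquo; means here, the sketch below defines a small combinator purely in terms of <code>std::future::Future</code>. The names <code>both</code>, <code>block_on</code>, and <code>ThreadWaker</code> are made up for this example, and the tiny executor assumes a recent stable toolchain (it uses <code>std::task::Wake</code> and <code>std::pin::pin!</code>); any real runtime could drive <code>both</code> just as well.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// A combinator written only against `std::future::Future`: it awaits two
/// futures one after the other and returns both outputs. Nothing in it
/// names a particular runtime.
async fn both<A, B>(a: A, b: B) -> (A::Output, B::Output)
where
    A: Future,
    B: Future,
{
    let x = a.await;
    let y = b.await;
    (x, y)
}

/// Waker for the toy executor below: waking unparks the blocked thread.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// A deliberately tiny, hypothetical executor: poll the future on the
/// current thread and park until it is woken again.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

<p>Calling <code>block_on(both(async { 1 }, async { "two" }))</code> drives both futures to completion. Swapping <code>block_on</code> for a production runtime requires no change to <code>both</code>, which is the kind of portability a shared trait buys.</p>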
<p>One obvious set of things we can do to improve interop is to try and
stabilize additional traits. Currently, the futures crate contains a
number of interfaces that have been evolving over time, such as
<a href="https://docs.rs/futures/0.3.1/futures/stream/trait.Stream.html"><code>Stream</code></a>, <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a>, and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a>. Maybe some of these
traits are good candidates to be moved to the standard library next?</p>
<p>Here are some of the main things I&rsquo;d like to discuss around interop:</p>
<ul>
<li>As a meta-point, should we be moving these traits to the standard
library, or should we try to promote the futures crate (or,
more likely, some of its subcrates, such as
<a href="https://docs.rs/futures-io/0.3.1/futures_io/">futures-io</a>) as the
standard for interop? I&rsquo;ve found from talking to folks that there is
a fair amount of confusion on &ldquo;how standard&rdquo; the futures crates are
and what the plan is there.</li>
<li>Regardless of how we signal stability, I also want to talk about the
specific traits or other things we might stabilize. For each such item,
there are two things I&rsquo;d like to drill into:
<ul>
<li>What kinds of interop would be enabled by stabilizing this
item? What are some examples of the sorts of libraries that
could now exist independently of a runtime because of the
existence of this item?</li>
<li>What are the specific concerns that remain about the design of
this item? The <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncRead.html"><code>AsyncRead</code></a> and <a href="https://docs.rs/futures/0.3.1/futures/io/trait.AsyncWrite.html"><code>AsyncWrite</code></a> traits, for
example, presently align quite closely with their synchronous
counterparts <a href="https://doc.rust-lang.org/std/io/trait.Read.html"><code>Read</code></a> and <a href="https://doc.rust-lang.org/std/io/trait.Write.html"><code>Write</code></a>. However, this interface
does require that the buffer used to store data must be
zeroed. The <a href="https://tokio.rs/">tokio</a> crate is <a href="https://github.com/tokio-rs/tokio/pull/1744">considering altering its own local
definition of <code>AsyncRead</code></a> for this reason; is that
something we should consider as well? If so, how?</li>
</ul>
</li>
<li>On a broader note, what are the sorts of things crates need to truly
operate that are <em>not</em> covered by the existing traits? For example,
the <a href="https://boats.gitlab.io/blog/post/global-executors/">global executors</a> that boats recently proposed would give
people the ability to &ldquo;spawn tasks&rdquo; into some ambient context&hellip; is
that a capability that would enable more interop? Perhaps access to
task-local data? Inquiring minds want to know.</li>
</ul>
<h3 id="improving-expressive-power-convenience-and-ergonomics">Improving expressive power, convenience, and ergonomics</h3>
<p>Interoperability isn&rsquo;t the only thing that we might try to improve.
We might also focus on language extensions that either grow our
expressive power or add convenience and ergonomics. Something like
supporting async fn in traits or async closures, for example, could be
a huge enabler, even if there are some real difficulties to making
them work.</p>
<p>Here are some of the specific features we might discuss:</p>
<ul>
<li><strong>Async destructors.</strong> As boats described <a href="https://boats.gitlab.io/blog/post/poll-drop/">in this blog post</a>,
there is sometimes a need to &ldquo;await&rdquo; things when running
destructors, and our current system can&rsquo;t support that.</li>
<li><strong>Async fn in traits.</strong> We support <code>async fn</code> in free functions and
inherent methods, but not in traits. As I explained in <a href="http://smallcultfollowing.com/babysteps/blog/2019/10/26/async-fn-in-traits-are-hard/">this blog
post</a>, there are a lot of challenges to supporting async fn in
traits properly (but consider using <a href="https://crates.io/crates/async-trait">the <code>async-trait</code> crate</a>).</li>
<li><strong>Async closures.</strong> Currently, we support async <em>blocks</em> (<code>async move { .. }</code>), which evaluate to a future, and async <em>functions</em>
(<code>async fn foo()</code>), which are functions that return a future. But,
at least on stable, we have no way to make a <em>closure</em> that returns
a future. Presumably this would be something like <code>async || { ... }</code>. (In fact, on nightly, we do have support for async
closures, but there are some issues in the design that we need to
work out.)</li>
<li><strong>Combinator methods like <a href="https://docs.rs/futures/0.3.1/futures/future/trait.FutureExt.html#method.map"><code>map</code></a>, or macros like <a href="https://docs.rs/futures/0.3.1/futures/macro.join.html"><code>join!</code></a> and
<a href="https://docs.rs/futures/0.3.1/futures/macro.select.html"><code>select!</code></a>.</strong> The futures crate offers a number of useful combinators
and macros. Maybe we should move some of those to the standard
library?</li>
</ul>
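<p>To ground the async closures item, here is a small sketch of the three forms side by side: an async block, an async function, and the common workaround of an ordinary closure that returns an async block. The helper names (<code>double</code>, <code>demo</code>, <code>block_on</code>, <code>ThreadWaker</code>) are invented for the example, and the toy executor assumes a recent stable toolchain.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker for the toy executor: waking unparks the blocked thread.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical minimal executor, only here so the example can run.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// An async function: calling it returns a future.
async fn double(x: u32) -> u32 {
    x * 2
}

fn demo() -> (u32, u32, u32) {
    // An async block evaluates directly to a future.
    let from_block = block_on(async move { 1 + 1 });

    // An async fn is a function that returns a future.
    let from_fn = block_on(double(21));

    // Workaround for the missing async closure: a plain closure whose
    // body is an async block, so each call produces a fresh future.
    let n = 10;
    let add_n = move |x: u32| async move { x + n };
    let from_closure = block_on(add_n(5));

    (from_block, from_fn, from_closure)
}
```

<p>The workaround is fine when the closure only captures values it can copy or move into the future. It gets awkward once the returned future needs to borrow from the closure&rsquo;s captured state, which is one of the reasons a dedicated <code>async ||</code> form is interesting.</p>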
<h3 id="conclusion">Conclusion</h3>
<p>I think these interviews are going to be a lot of fun, and I expect to
learn a lot. Stay tuned for the first blog post, coming next week,
about <strong>Async I/O and WebAssembly</strong>.</p>
<h3 id="comments">Comments?</h3>
<p>There is a <a href="https://users.rust-lang.org/t/async-interviews/35167">thread on the Rust users forum</a> for questions and
discussion.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Indeed, shortly after I published this post, I <a href="https://twitter.com/redtwitdown/status/1198001288648069120">was directed</a> to the <a href="https://www.drone-os.com/">drone-os</a> project.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Woohoo! I just want to say that I&rsquo;ve been hoping to see something like OTP for Rust for&hellip;quite some time.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncinterviews" term="asyncinterviews" label="AsyncInterviews"/></entry><entry><title type="html">why async fn in traits are hard</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/10/26/async-fn-in-traits-are-hard/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/10/26/async-fn-in-traits-are-hard/</id><published>2019-10-26T00:00:00+00:00</published><updated>2019-10-26T00:00:00+00:00</updated><content type="html"><![CDATA[<p>After reading <a href="https://boats.gitlab.io/blog/post/poll-drop/">boats&rsquo; excellent post on asynchronous destructors</a>,
I thought it might be a good idea to write a bit about <code>async fn</code> in
traits. Support for <code>async fn</code> in traits is probably the single most
common feature request that I hear about. It&rsquo;s also one of the more
complex topics. So I thought it&rsquo;d be nice to do a blog post kind of
giving the &ldquo;lay of the land&rdquo; on that feature &ndash; what makes it
complicated?  What questions remain open?</p>
<p>I&rsquo;m not making any concrete proposals in this post, just laying out
the problems. But do not lose hope! In a future post, I&rsquo;ll lay out a
specific roadmap for how I think we can make incremental progress
towards supporting async fn in traits in a useful way. And, in the
meantime, you can use the <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> crate (but I get ahead of
myself&hellip;).</p>
<h2 id="the-goal">The goal</h2>
<p>In some sense, the goal is simple. We would like to enable you to
write traits that include <code>async fn</code>. For example, imagine we have
some <code>Database</code> trait that lets you do various operations against a
database, asynchronously:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Database</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">get_user</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">User</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h2 id="today-you-should-use-async-trait">Today, you should use async-trait</h2>
<p>Today, of course, the answer is that you should use dtolnay&rsquo;s
excellent <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> crate. This allows you to write
almost what we wanted:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[async_trait]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Database</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">get_user</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">User</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But what is really happening under the hood? As the <a href="https://github.com/dtolnay/async-trait#explanation">crate&rsquo;s
documentation explains</a>, this declaration is getting transformed to
the following. Notice the return type.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Database</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_user</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Pin</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">User</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;_</span><span class="o">&gt;&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So basically you are returning a boxed <code>dyn Future</code> &ndash; a future
object, in other words. This desugaring is rather different from what
happens with <code>async fn</code> in other contexts &ndash; but why is that? The rest
of this post is going to explain some of the problems that <code>async fn</code>
in traits is trying to solve, which may help explain why we have a
need for the <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> crate to begin with!</p>
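<p>To make the boxed form concrete, here is a minimal hand-written sketch of a trait with that shape. The <code>User</code> and <code>MyDatabase</code> definitions, the <code>user_name</code> field, and the tiny <code>block_on</code> helper are all assumptions made up for illustration; this is not the code that <code>async-trait</code> actually generates, just the same signature written out manually.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for the types used in the post.
#[derive(Debug, PartialEq)]
struct User {
    name: String,
}

struct MyDatabase {
    user_name: String,
}

// Hand-written trait with the boxed return type shown above.
trait Database {
    fn get_user(&amp;self) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = User&gt; + Send + &#39;_&gt;&gt;;
}

impl Database for MyDatabase {
    fn get_user(&amp;self) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = User&gt; + Send + &#39;_&gt;&gt; {
        // The async block borrows `self`, which is why the `&#39;_` bound is needed.
        Box::pin(async move {
            User { name: self.user_name.clone() }
        })
    }
}

// Tiny busy-polling executor (an assumption for this sketch); it is only
// suitable for futures that complete without waiting on anything.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc&lt;Self&gt;) {}
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&amp;waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&amp;mut cx) {
            return value;
        }
    }
}

fn main() {
    let db = MyDatabase { user_name: &#34;ferris&#34;.to_string() };
    // Calling through `dyn Database` works because every impl returns the same boxed type.
    let dyn_db: &amp;dyn Database = &amp;db;
    let user = block_on(dyn_db.get_user());
    assert_eq!(user, User { name: &#34;ferris&#34;.to_string() });
}
</code></pre></div><p>The price of that uniform return type is one heap allocation per call, plus a <code>Send</code> bound that is fixed by the trait rather than inferred from the body.</p>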
<h2 id="async-fn-normally-returns-an-impl-future">Async fn normally returns an impl Future</h2>
<p>We saw that the <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> crate converts an <code>async fn</code> to something
that returns a <code>dyn Future</code>. This is in contrast to the <code>async fn</code> desugaring
that the Rust compiler uses, which produces an <code>impl Future</code>. For example,
imagine that we have an inherent method <code>async fn get_user()</code> defined on
some particular service type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyDatabase</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">get_user</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">User</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This would get desugared to something similar to:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyDatabase</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_user</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">User</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;_</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So why does <code>async-trait</code> do something different? Well, it&rsquo;s
because of &ldquo;Complication #1&rdquo;&hellip;</p>
<h2 id="complication-1-returning-impl-trait-in-traits-is-not-supported">Complication #1: returning <code>impl Trait</code> in traits is not supported</h2>
<p>Currently, we don&rsquo;t support <code>-&gt; impl Trait</code> return types in traits.
Logically, though, we basically know what the semantics of such a
construct should be: it is equivalent to a kind of associated type.
That is, the trait is promising that invoking <code>get_user</code> will return
<em>some</em> kind of future, but the precise type will be determined by the
details of the impl (and perhaps inferred by the compiler). So, if
we know <em>logically</em> how <code>impl Trait</code> in traits should behave, what stops
us from implementing it? Well, let&rsquo;s see&hellip;</p>
<h3 id="complication-1a-impl-trait-in-traits-requires-gats">Complication #1a. <code>impl Trait</code> in traits requires GATs</h3>
<p>Let&rsquo;s return to our <code>Database</code> example. Imagine that we permitted
<code>async fn</code> in traits. We would therefore desugar</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Database</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">get_user</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">User</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>into something that returns an <code>impl Future</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Database</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_user</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">User</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">&#39;_</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and then we would in turn desugar <em>that</em> into something that uses
an associated type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Database</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">GetUser</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span>: <span class="nc">Future</span><span class="o">&lt;</span><span class="n">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">User</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;s</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_user</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">GetUser</span><span class="o">&lt;</span><span class="nb">&#39;_</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Hmm, did you notice that I wrote <code>type GetUser&lt;'s&gt;</code>, and not <code>type GetUser</code>?  Yes, that&rsquo;s right, this is not just an associated type,
it&rsquo;s actually a <a href="https://github.com/rust-lang/rfcs/blob/master/text/1598-generic_associated_types.md"><strong>generic</strong> associated type</a>. The reason for
this is that an <code>async fn</code> always captures all of its arguments &ndash; so
whatever type we return will include the <code>&amp;self</code> as part of it, and
therefore it has to include the lifetime <code>'s</code>. So that&rsquo;s one
complication: we have to figure out generic associated types.</p>
<p>Now, in some sense that&rsquo;s not so bad. Conceptually, GATs are fairly
simple. Implementation wise, though, we&rsquo;re still working on how to
support them in rustc &ndash; this may require porting rustc to use
<a href="https://github.com/rust-lang/chalk">chalk</a>, though that&rsquo;s not entirely clear. In any case, this work is
definitely underway, but it&rsquo;s going to take more time.</p>
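<p>As a rough sketch of what an impl of the desugared trait above could look like, assuming a toolchain where generic associated types are available: the future is written by hand as a struct that holds the <code>&amp;self</code> borrow, which is exactly why the associated type needs the <code>'s</code> parameter. The names <code>GetUserFuture</code>, <code>MyDatabase</code>, <code>user_name</code>, and <code>block_on</code> are hypothetical, and the <code>where Self: 's</code> clauses are an assumption about what such a toolchain requires.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for the types used in the post.
#[derive(Debug, PartialEq)]
struct User {
    name: String,
}

struct MyDatabase {
    user_name: String,
}

trait Database {
    // Generic over `&#39;s` because the returned future borrows from `&amp;&#39;s self`.
    type GetUser&lt;&#39;s&gt;: Future&lt;Output = User&gt; + &#39;s
    where
        Self: &#39;s;

    fn get_user(&amp;self) -&gt; Self::GetUser&lt;&#39;_&gt;;
}

// Hand-written future type that stores the borrow of the database.
struct GetUserFuture&lt;&#39;s&gt; {
    db: &amp;&#39;s MyDatabase,
}

impl&lt;&#39;s&gt; Future for GetUserFuture&lt;&#39;s&gt; {
    type Output = User;

    fn poll(self: Pin&lt;&amp;mut Self&gt;, _cx: &amp;mut Context&lt;&#39;_&gt;) -&gt; Poll&lt;User&gt; {
        // Completes immediately; a real future would wait on some I/O here.
        Poll::Ready(User { name: self.db.user_name.clone() })
    }
}

impl Database for MyDatabase {
    type GetUser&lt;&#39;s&gt; = GetUserFuture&lt;&#39;s&gt;
    where
        Self: &#39;s;

    fn get_user(&amp;self) -&gt; Self::GetUser&lt;&#39;_&gt; {
        GetUserFuture { db: self }
    }
}

// Tiny busy-polling executor, only suitable for futures that never wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc&lt;Self&gt;) {}
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&amp;waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&amp;mut cx) {
            return value;
        }
    }
}

fn main() {
    let db = MyDatabase { user_name: &#34;ferris&#34;.to_string() };
    let user = block_on(db.get_user());
    assert_eq!(user, User { name: &#34;ferris&#34;.to_string() });
}
</code></pre></div>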
<p>Unfortunately for us, GATs are only the beginning of the complications
around <code>async fn</code> (and <code>impl Trait</code>) in traits!</p>
<h2 id="complication-2-send-bounds-and-other-bounds">Complication #2: send bounds (and other bounds)</h2>
<p>Right now, when you write an <code>async fn</code>, the resulting future may or
may not implement <code>Send</code> &ndash; the result depends on what state it
captures. The compiler infers this automatically, basically, in
typical auto trait fashion.</p>
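<p>A small sketch of that inference at work: two async fns with the same signature, where only the captured state differs. The function names and the <code>assert_send</code> helper are made up for this example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::ready;
use std::rc::Rc;
use std::sync::Arc;

// Compiles only if `T: Send`, so it acts as a compile-time check.
fn assert_send&lt;T: Send&gt;(_value: &amp;T) {}

async fn uses_arc() -&gt; u32 {
    let shared = Arc::new(1);
    // `shared` is still live across this await, so it is stored in the future.
    ready(()).await;
    *shared
}

async fn uses_rc() -&gt; u32 {
    let local = Rc::new(1);
    // `local` is still live across this await, so it is stored in the future.
    ready(()).await;
    *local
}

fn main() {
    let sendable = uses_arc();
    assert_send(&amp;sendable);

    let not_sendable = uses_rc();
    // Uncommenting the next line is a compile error, because `Rc&lt;u32&gt;` is not `Send`:
    // assert_send(&amp;not_sendable);
    drop(not_sendable);
}
</code></pre></div><p>Nothing in either signature mentions <code>Send</code>; the answer falls out of the function bodies.</p>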
<p>But if you are writing generic code, you may well need to
require that the resulting future is <code>Send</code>. For example, imagine we
are writing a <code>finagle_database</code> thing that, as part of its inner
working, happens to spawn off a parallel thread to get the current
user. Since we&rsquo;re going to be spawning a thread with the result from
<code>d.get_user()</code>, that result is going to have to be <code>Send</code>, which means
we&rsquo;re going to want to write a function that looks <em>something</em> like
this<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">finagle_database</span><span class="o">&lt;</span><span class="n">D</span>: <span class="nc">Database</span><span class="o">&gt;</span><span class="p">(</span><span class="n">d</span>: <span class="kp">&amp;</span><span class="nc">D</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="w"> </span><span class="n">D</span>::<span class="n">GetUser</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span>: <span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">spawn</span><span class="p">(</span><span class="n">d</span><span class="p">.</span><span class="n">get_user</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This example seems &ldquo;ok&rdquo;, but there are four complications:</p>
<ul>
<li>First, we wrote the name <code>GetUser</code>, but that is something we
introduced as part of &ldquo;manually&rdquo; desugaring <code>async fn get_user</code>. What name would the user <em>actually</em> use?</li>
<li>Second, writing <code>for&lt;'s&gt; D::GetUser&lt;'s&gt;</code> is kind of grody; we&rsquo;re obviously
going to want more compact syntax (this is really an issue around generic
associated types in general).</li>
<li>Third, our example <code>Database</code> trait has only one async fn, but
obviously there might be many more. Probably we will want to make
<em>all</em> of them <code>Send</code> or none of them &ndash; so you can expect a lot more
grody bounds in a real function!</li>
<li>Finally, forcing the user to specify which exact async fns have to
return <code>Send</code> futures is a semver hazard.</li>
</ul>
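<p>Setting those complications aside for a moment, here is a sketch of how the function above could be written out in full against the manually desugared trait, assuming a toolchain with generic associated types and using the standard library&rsquo;s <code>std::thread::scope</code> as the scoping mechanism. The <code>MyDatabase</code> type, its <code>user_name</code> field, the <code>block_on</code> helper, and the choice to return the <code>User</code> are assumptions for illustration only.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::{ready, Future, Ready};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for the types used in the post.
struct User {
    name: String,
}

struct MyDatabase {
    user_name: String,
}

// The manually desugared trait, with its generic associated type.
trait Database {
    type GetUser&lt;&#39;s&gt;: Future&lt;Output = User&gt; + &#39;s
    where
        Self: &#39;s;

    fn get_user(&amp;self) -&gt; Self::GetUser&lt;&#39;_&gt;;
}

impl Database for MyDatabase {
    // An already-completed future keeps the example short.
    type GetUser&lt;&#39;s&gt; = Ready&lt;User&gt;
    where
        Self: &#39;s;

    fn get_user(&amp;self) -&gt; Self::GetUser&lt;&#39;_&gt; {
        ready(User { name: self.user_name.clone() })
    }
}

// Tiny busy-polling executor, only suitable for futures that never wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc&lt;Self&gt;) {}
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&amp;waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&amp;mut cx) {
            return value;
        }
    }
}

fn finagle_database&lt;D: Database&gt;(d: &amp;D) -&gt; User
where
    // The bound under discussion: every `GetUser` future must be `Send`.
    for&lt;&#39;s&gt; D::GetUser&lt;&#39;s&gt;: Send,
{
    // Create the future here, then move it to another thread to run it.
    let fut = d.get_user();
    std::thread::scope(|scope| {
        scope
            .spawn(move || block_on(fut))
            .join()
            .expect(&#34;worker thread panicked&#34;)
    })
}

fn main() {
    let db = MyDatabase { user_name: &#34;ferris&#34;.to_string() };
    let user = finagle_database(&amp;db);
    assert_eq!(user.name, &#34;ferris&#34;);
}
</code></pre></div><p>Removing the <code>where</code> clause should make the <code>spawn</code> call fail to compile, since nothing else says the future can cross threads.</p>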
<p>Let me dig into those a bit.</p>
<h3 id="complication-2a-how-to-name-the-associated-type">Complication #2a. How to name the associated type?</h3>
<p>So we saw that, in a trait, returning an <code>impl Trait</code> value is
equivalent to introducing a (possibly generic) associated type. But
how should we <em>name</em> this associated type? In my example, I introduced
a <code>GetUser</code> associated type as the result of the <code>get_user</code>
function. Certainly, you could imagine a rule like &ldquo;take the name of
the function and convert it to camel case&rdquo;, but it feels a bit hokey
(although I suspect that, in practice, it would work out just
fine). There have been other proposals too, such as <code>typeof</code>
expressions and the like.</p>
<h3 id="complication-2b-grody-complex-bounds-especially-around-gats">Complication #2b. Grody, complex bounds, especially around GATs.</h3>
<p>In my example, I used the strawman syntax <code>for&lt;'s&gt; D::GetUser&lt;'s&gt;: Send</code>.  In real life, unfortunately, the bounds you need may well get
more complex still. Consider the case where an <code>async fn</code> has generic
parameters itself:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">bar</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="n">B</span><span class="o">&gt;</span><span class="p">(</span><span class="n">a</span>: <span class="nc">A</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="nc">B</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, the future that results from <code>bar</code> is only going to be <code>Send</code> if <code>A: Send</code> and <code>B: Send</code>. This suggests a bound like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="o">&lt;</span><span class="n">A</span>: <span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">S</span>::<span class="n">bar</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="n">B</span><span class="o">&gt;</span>: <span class="nb">Send</span> <span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>From a conceptual point-of-view, bounds like these are no problem.
Chalk can handle them just fine, for example. But I think this is
pretty clearly an ergonomic problem, and not something that ordinary users are
going to want to write on a regular basis.</p>
<h3 id="complication-2c-listing-specific-associated-types-reveals-implementation-details">Complication #2c. Listing specific associated types reveals implementation details</h3>
<p>If we require functions to specify the <em>exact</em> futures that are
<code>Send</code>, that is not only tedious, it could be a semver
hazard. Consider our <code>finagle_database</code> function &ndash; from its where
clause, we can see that it spawns out <code>get_user</code> into a scoped
thread. But what if we wanted to modify it in the future to spawn off
more database operations? That would require us to modify the
where-clauses, which might in turn break our callers. Seems like a
problem, and it suggests that we might want some way to say &ldquo;all
possible futures are send&rdquo;.</p>
<h3 id="conclusion-we-might-want-a-new-syntax-for-propagating-auto-traits-to-async-fns">Conclusion: We might want a new syntax for propagating auto traits to async fns</h3>
<p>All of this suggests that we might want some way to propagate auto
traits through to the results of async fns explicitly. For example,
you could imagine supporting <code>async</code> bounds, so that we might write
<code>async Send</code> instead of just <code>Send</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">finagle_database</span><span class="o">&lt;</span><span class="no">DB</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">DB</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">DB</span>: <span class="nc">Database</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="k">async</span><span class="w"> </span><span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This syntax would be some kind of &ldquo;default&rdquo; that expands to explicit
<code>Send</code> bounds on both <code>DB</code> and all the futures potentially returned by
<code>DB</code>.</p>
<p>Or perhaps we&rsquo;d even want to avoid <em>any</em> syntax, and somehow
&ldquo;rejigger&rdquo; how <code>Send</code> works when applied to traits that contain async
fns? I&rsquo;m not sure about how that would work.</p>
<p>It&rsquo;s worth pointing out this same problem can occur with <code>impl Trait</code>
in return position<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, or indeed any associated
types. Therefore, we might prefer a syntax that is more general and
not tied to <code>async</code>.</p>
<h2 id="complication-3-supporting-dyn-traits-that-have-async-fns">Complication #3: supporting dyn traits that have async fns</h2>
<p>Now imagine that we had our <code>trait Database</code>, containing an <code>async fn get_user</code>. We might like to write functions that operate over <code>dyn Database</code>
values. There are many reasons to prefer <code>dyn Database</code> values:</p>
<ul>
<li>We don&rsquo;t want to generate many copies of the same function, one per database type;</li>
<li>We want to have collections of different sorts of databases, such as a
<code>Vec&lt;Box&lt;dyn Database&gt;&gt;</code> or something like that.</li>
</ul>
<p>In practice, a desire to support <code>dyn Trait</code> comes up in a lot of examples
where you would want to use <code>async fn</code> in traits.</p>
<h3 id="complication-3a-dyn-trait-have-to-specify-their-associated-type-values">Complication #3a: <code>dyn Trait</code> have to specify their associated type values</h3>
<p>We&rsquo;ve seen that <code>async fn</code> in traits effectively desugars to a
(generic) associated type. And, under the current Rust rules, when you
have a <code>dyn Trait</code> value, the type must specify the values for all
associated types.  If we consider our desugared <code>Database</code> trait, then,
it would have to be written <code>dyn Database&lt;GetUser&lt;'s&gt; = XXX&gt;</code>. This is
obviously no good, for two reasons:</p>
<ol>
<li>It would require us to write out the full type for the <code>GetUser</code>,
which might be super complicated.</li>
<li>And anyway, each <code>dyn Database</code> is going to have a <em>distinct</em>
<code>GetUser</code> type. If we have to specify <code>GetUser</code>, then, that kind of
defeats the point of using <code>dyn Database</code> in the first place, as
the type is going to be specific to some particular service, rather
than being a single type that applies to all services.</li>
</ol>
<h3 id="complication-3b-no-right-choice-for-x-in-dyn-databasegetusers--x">Complication #3b: no &ldquo;right choice&rdquo; for <code>X</code> in <code>dyn Database&lt;GetUser&lt;'s&gt; = X&gt;</code></h3>
<p>When we&rsquo;re using <code>dyn Database</code>, what we actually want is a type where
<code>GetUser</code> is <strong>not specified</strong>. In other words, we just want to write
<code>dyn Database</code>, full stop, and we want that to be expanded to
something that is perhaps &ldquo;morally equivalent&rdquo; to this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">dyn</span><span class="w"> </span><span class="n">Database</span><span class="o">&lt;</span><span class="n">GetUser</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">dyn</span><span class="w"> </span><span class="n">Future</span><span class="o">&lt;..&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>In other words, all the caller really wants to know when it calls
<code>get_user</code> is that it gets back <em>some</em> future which it can poll. It
doesn&rsquo;t want to know exactly which one.</p>
<p>Unfortunately, actually using <code>dyn Future&lt;..&gt;</code> as the type there is
not a viable choice. We probably want a <code>Sized</code> type, so that the
future can be stored, moved into a box, etc. We could imagine then
that <code>dyn Database</code> defaults its &ldquo;futures&rdquo; to <code>Box&lt;dyn Future&lt;..&gt;&gt;</code>
instead &ndash; well, actually, <code>Pin&lt;Box&lt;dyn Future&gt;&gt;</code> would be a more
ergonomic choice &ndash; but there are a few concerns with <em>that</em>.</p>
<p>First, using <code>Box</code> seems rather arbitrary. We don&rsquo;t usually make <code>Box</code>
this &ldquo;special&rdquo; in other parts of the language.</p>
<p>Second, where would this box get allocated? The actual trait impl for
our service isn&rsquo;t using a box, it&rsquo;s creating a future type and
returning it inline. So we&rsquo;d need to generate some kind of &ldquo;shim impl&rdquo;
that applies whenever something is used as a <code>dyn Database</code> &ndash; this
shim impl would invoke the main function, box the result, and return
<em>that</em>.</p>
<p>Third, because a <code>dyn Future</code> type hides the underlying
future (that is, indeed, its entire purpose), it also blocks the auto
trait mechanism from figuring out if the result is <code>Send</code>. Therefore,
when we make e.g. a <code>dyn Database</code> type, we need to specify not only
the allocation mechanism we&rsquo;ll use to manipulate the future (i.e., do
we use <code>Box</code>?) but also whether the future is <code>Send</code> or not.</p>
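<p>To illustrate the &ldquo;shim impl&rdquo; idea, here is a sketch of how it can be written by hand, assuming a toolchain with generic associated types. The companion trait <code>DynDatabase</code>, its method <code>get_user_boxed</code>, the two database types, and the <code>block_on</code> helper are all hypothetical names invented for this example. Note that the sketch hardcodes both choices discussed above: it uses <code>Box</code>, and it does not promise <code>Send</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct User {
    name: String,
}

// The statically dispatched trait, as desugared earlier in the post.
trait Database {
    type GetUser&lt;&#39;s&gt;: Future&lt;Output = User&gt; + &#39;s
    where
        Self: &#39;s;

    fn get_user(&amp;self) -&gt; Self::GetUser&lt;&#39;_&gt;;
}

// Hypothetical companion trait that is usable as `dyn`: no associated types,
// and every future is boxed and not required to be `Send`.
trait DynDatabase {
    fn get_user_boxed(&amp;self) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = User&gt; + &#39;_&gt;&gt;;
}

// The shim: invoke the real method, box the result, and return that.
impl&lt;T: Database&gt; DynDatabase for T {
    fn get_user_boxed(&amp;self) -&gt; Pin&lt;Box&lt;dyn Future&lt;Output = User&gt; + &#39;_&gt;&gt; {
        Box::pin(self.get_user())
    }
}

// Two made-up database types, each using an already-completed future.
struct InMemoryDatabase {
    user_name: String,
}

struct GuestDatabase;

impl Database for InMemoryDatabase {
    type GetUser&lt;&#39;s&gt; = Ready&lt;User&gt;
    where
        Self: &#39;s;

    fn get_user(&amp;self) -&gt; Self::GetUser&lt;&#39;_&gt; {
        ready(User { name: self.user_name.clone() })
    }
}

impl Database for GuestDatabase {
    type GetUser&lt;&#39;s&gt; = Ready&lt;User&gt;
    where
        Self: &#39;s;

    fn get_user(&amp;self) -&gt; Self::GetUser&lt;&#39;_&gt; {
        ready(User { name: &#34;guest&#34;.to_string() })
    }
}

// Tiny busy-polling executor, only suitable for futures that never wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc&lt;Self&gt;) {}
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&amp;waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&amp;mut cx) {
            return value;
        }
    }
}

fn main() {
    // A heterogeneous collection, which the `Database` trait alone cannot express.
    let databases: Vec&lt;Box&lt;dyn DynDatabase&gt;&gt; = vec![
        Box::new(InMemoryDatabase { user_name: &#34;ferris&#34;.to_string() }),
        Box::new(GuestDatabase),
    ];
    let names: Vec&lt;String&gt; = databases
        .iter()
        .map(|db| block_on(db.get_user_boxed()).name)
        .collect();
    assert_eq!(names, vec![&#34;ferris&#34;.to_string(), &#34;guest&#34;.to_string()]);
}
</code></pre></div>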
<h2 id="now-you-see-why-async-trait-desugars-the-way-it-does">Now you see why async-trait desugars the way it does</h2>
<p>After reviewing all these problems, we now start to see where the
design of the <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> crate comes from:</p>
<ul>
<li>To avoid Complications #1 and #2, <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> desugars <code>async fn</code>
to return a <code>dyn Future</code> instead of an <code>impl Future</code>.</li>
<li>To avoid Complication #3, <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a> chooses for you to use
a <code>Pin&lt;Box&lt;dyn Future + Send&gt;&gt;</code> (you can <a href="https://github.com/dtolnay/async-trait#non-threadsafe-futures">opt-out</a> from the <code>Send</code> part).
This is almost always the correct default.</li>
</ul>
<p>All in all, it&rsquo;s a very nice solution.</p>
<p>The only real drawback here is that there is some performance hit from
boxing the futures &ndash; but I suspect it is negligible in almost all
applications. I don&rsquo;t think this would be true if we boxed the results
of <strong>all</strong> async fns; there are many cases where async fns are used to
create small combinators, and there the boxing costs <em>might</em> start to
add up.  But only boxing async fns that go through trait boundaries is
very different.  And of course it&rsquo;s worth highlighting that most
languages box all their futures, all of the time. =)</p>
<h2 id="summary">Summary</h2>
<p>So to sum it all up, here are some of the observations from this article:</p>
<ul>
<li><code>async fn</code> desugars to a fn returning <code>impl Trait</code>, so if we want to
support <code>async fn</code> in traits, we should also support fns that
return <code>impl Trait</code> in traits.
<ul>
<li>It&rsquo;s worth pointing out also that sometimes you have to manually
desugar an <code>async fn</code> to a <code>fn</code> that returns <code>impl Future</code> to avoid
capturing all your arguments, so the two go hand in hand.</li>
</ul>
</li>
<li>Returning <code>impl Trait</code> in a trait is equivalent to an
associated type in the trait.
<ul>
<li>This associated type does need to be nameable, but what name
should we give this associated type?</li>
<li>Also, this associated type often has to be generic, especially
for <code>async fn</code>.</li>
</ul>
</li>
<li>Applying <code>Send</code> bounds to the futures that can be generated is
tedious, grody, and reveals semver details. We probably want some way to
make that more ergonomic.
<ul>
<li>This quite likely applies to the general <code>impl Trait</code> case too,
but it may come up somewhat less frequently.</li>
</ul>
</li>
<li>We do want the ability to have <code>dyn Trait</code> versions of traits that contain associated
functions and/or <code>impl Trait</code> return types.
<ul>
<li>But currently we have no way to have a <code>dyn Trait</code> without fully specifying
all of its associated types; in our case, those associated types have a 1-to-1
relationship with the <code>Self</code> type, so that defeats the whole point of <code>dyn Trait</code>.</li>
<li>Therefore, in the case of <code>dyn Trait</code>, we would want to have the
<code>async fn</code> within return <em>some</em> form of <code>dyn Future</code>. But we would have to effectively
&ldquo;hardcode&rdquo; two choices:
<ul>
<li>What form of pointer to use (e.g., <code>Box</code>)</li>
<li>Is the resulting future <code>Send</code>, <code>Sync</code>, etc</li>
</ul>
</li>
<li>This applies to the general <code>impl Trait</code> case too.</li>
</ul>
</li>
</ul>
<p>The goal of this post was just to lay out <strong>the problems</strong>. I hope to
write some follow-up posts digging a bit into the solutions &ndash; though
for the time being, the solution is clear: use the <a href="https://crates.io/crates/async-trait"><code>async-trait</code></a>
crate.</p>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Astute readers might note that I&rsquo;m eliding a further challenge, which is that you need a scoping mechanism here to handle the lifetimes. Let&rsquo;s assume we have something like <a href="https://docs.rs/rayon/1.2.0/rayon/fn.scope.html">Rayon&rsquo;s scope</a> or <a href="https://docs.rs/crossbeam/0.7.2/crossbeam/thread/struct.Scope.html">crossbeam&rsquo;s scope</a> available.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Still, consider a trait <code>IteratorX</code> that is like <code>Iterator</code>, where the adapters return <code>impl Trait</code>. In such a case, you probably want a way to say not only &ldquo;I take a <code>T: IteratorX + Send</code>&rdquo; but also that the <code>IteratorX</code> values returned by calls to <code>map</code> and the like are <code>Send</code>. Presently you would have to list out the specific associated types you want, which also winds up revealing implementation details.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncawait" term="asyncawait" label="AsyncAwait"/></entry><entry><title type="html">AiC: Shepherds 3.0</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/09/11/aic-shepherds-3-0/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/09/11/aic-shepherds-3-0/</id><published>2019-09-11T00:00:00+00:00</published><updated>2019-09-11T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I would like to describe an idea that&rsquo;s been kicking around in my
head.  I&rsquo;m calling this idea &ldquo;shepherds 3.0&rdquo; &ndash; the 3.0 is to
distinguish it from the other places we&rsquo;ve used the term in the past.
This proposal actually supplants both of the previous uses of the
term, replacing them with what I believe to be a preferred alternative
(more on that later).</p>
<h2 id="caveat">Caveat</h2>
<p>This is an idea that has been kicking around in my head for a while.
It is not a polished plan and certainly not an accepted one. I&rsquo;ve not
talked it over with the rest of the lang team, for example. However, I
wanted to put it out there for discussion, and I do think we should be
taking some step in this direction soon-ish.</p>
<h2 id="tldr">TL;DR</h2>
<p>What I&rsquo;m proposing, at its heart, is very simple. I want to better
document the &ldquo;agenda&rdquo; of the lang-team. Specifically, if we are going
to be moving a feature forward<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, then it should have a <strong>shepherd</strong> (or
multiple) who is in charge of doing that.</p>
<p>In order to avoid <a href="http://smallcultfollowing.com/babysteps/blog/2019/07/10/aic-unbounded-queues-and-lang-design/">unbounded queues</a>, the <strong>number of things that any
individual can shepherd should be limited</strong>. Ideally, each person
should only shepherd one thing at a time, though I don&rsquo;t think we need
to make a firm rule about it.</p>
<p><strong>Becoming a shepherd is a commitment on the part of the shepherd.</strong>
The first part of the lang team meeting should be to review the items
that are being actively shepherded and get any updates. If we haven&rsquo;t
seen any movement in a while, we should consider changing the
shepherd, or officially acknowledging that something is stalled and
removing the shepherd altogether.</p>
<p><strong>Assigning a shepherd is a commitment on the part of the rest of the
lang-team as well.</strong> Before assigning a shepherd, we should discuss if
this agenda item is a priority.  In particular, if someone is
shepherding something, that means we all agree to help that item move
towards some kind of completion. This means giving feedback, when
feedback is requested.  It means doing the work to resolve concerns
and conflicts. And, sometimes, it will mean giving way. I&rsquo;ll talk more
about this in a bit.</p>
<h2 id="what-was-shepherds-10-and-how-is-this-different">What was shepherds 1.0 and how is this different?</h2>
<p>The initial use of the term shepherd, as I remember it, was actually
quite close to the way I am using it here. The idea was that we would
assign each RFC to a shepherd, who would drive it either to acceptance or
to closure. This policy was, by and large, a failure &ndash; RFCs got
assigned, but people didn&rsquo;t put in the time. (To be clear, sometimes
they did, and in those cases the system worked reasonably well.)</p>
<p>My proposal here differs in a few key respects that I hope will make it
more successful:</p>
<ul>
<li>We limit how many things you can shepherd at once.</li>
<li>Assigning a shepherd is also a commitment from the lang team as a
whole to review progress, resolve conflicts, and devote some time to
the issue.</li>
<li>We don&rsquo;t try to shepherd everything &ndash; in contrast, shepherding marks
the things we are moving forward.</li>
<li>The shepherd is not something specific to an RFC, it refers to all
kinds of &ldquo;larger decisions&rdquo;. For example, stabilization would be a
shepherd activity as well.</li>
</ul>
<h2 id="what-was-shepherds-20-and-how-is-this-different">What was shepherds 2.0 and how is this different?</h2>
<p>We&rsquo;ve also used the term shepherd to refer to a role that is moving
towards full lang team membership. That&rsquo;s different from this proposal
in that it is not tied to a specific topic area. But there is also
some interaction &ndash; for example, <strong>it&rsquo;s not clear that shepherds need
to be active lang team members</strong>.</p>
<p>I think it&rsquo;d be great to allow shepherds to be any person who is
sufficiently committed to help see something through. The main
requirement for a shepherd should be that they are able to give us
regular updates on the progress. Ideally, this would be done by
attending the lang team meeting. But that doesn&rsquo;t work for everyone &ndash;
whether it be because of time zones, scheduling, or language barriers &ndash;
and so I think that any form of regular, asynchronous report would
work just fine.</p>
<p><strong>I think I would prefer for this proposal &ndash; and this kind of
&ldquo;role-specific shepherding&rdquo; &ndash; to entirely replace the &ldquo;provisional
member&rdquo; role on the lang team.</strong> It seems strictly better to me. Among
other things, it&rsquo;s naturally time-limited. Once the work item
completes, that gives us a chance to decide whether it makes sense for
someone to become a full member of the lang team, or perhaps try
shepherding another idea, or perhaps just part ways. I expect there
are a lot of people who have interest in working through a specific
feature but for whom there is little desire to be long-term members of
the lang team.</p>
<h2 id="how-do-i-get-a-shepherd-assigned-to-my-work-item">How do I get a shepherd assigned to my work item?</h2>
<p>Ultimately, I think this question is ill-posed: there is no way to
&ldquo;get&rdquo; a shepherd assigned to your work. Having the expectation that a
shepherd will be assigned runs smack into the problems of <a href="http://smallcultfollowing.com/babysteps/blog/2019/07/10/aic-unbounded-queues-and-lang-design/">unbounded
queues</a> and was, I think, a crucial flaw in the Shepherds 1.0 system.</p>
<p>Basically, the way a shepherd gets assigned in this scheme is roughly
the same as the way things &ldquo;get done&rdquo; today. You convince someone in
the lang team that the item is a priority, and they become the
shepherd. That convincing takes place through the existing channels:
nominated issues, discord or zulip, etc. It&rsquo;s not that I don&rsquo;t think
we should be thinking about this as well; it&rsquo;s just that
it&rsquo;s something of an orthogonal problem.</p>
<p>My model is that shepherds are how we <em>quantify and manage the things
we are doing</em>. The question of &ldquo;what happens to all the existing
things&rdquo; is more a question of <em>how we select which things to do</em> &ndash;
and that&rsquo;s ultimately a priority call.</p>
<h2 id="ok-so-what-happens-to-all-the-existing-things">OK, so, what happens to all the existing things?</h2>
<p>That&rsquo;s a very good question. And one I don&rsquo;t intend to answer here, at
least not in full. That said, I do think this is an important problem
that we should think about.  I would like to be exposing more
&ldquo;insight&rdquo; into our overall priorities.</p>
<p>In my ideal world, we&rsquo;d have a list of projects that we are <strong>not</strong>
working on, grouped somewhat by how likely we are to work on them in
the future. This might then indicate ideas that we do <em>not</em> want to
pursue; ideas that we have mild interest in but which have a lot of
unknowns; ideas that we started working on but got blocked at some
point (hopefully with a report of what&rsquo;s blocking them); and so
forth. But that&rsquo;s all a topic for another post.</p>
<p>One other idea that I like is documenting on the website the &ldquo;areas of
interest&rdquo; for each of the lang team members (and possibly other folks)
who might be willing shepherds. This would help people figure out who
to reach out to.</p>
<h2 id="isnt-there-anything-i-can-do-to-help-move-topic-x-along">Isn&rsquo;t there anything I can do to help move Topic X along?</h2>
<p>This proposal does offer one additional option that hadn&rsquo;t formally
existed before. <strong>If you want to see something happen, you can offer
to shepherd it yourself &ndash; or in conjunction with a member of the lang
team.</strong> You could do this by pinging folks on discord, attending a
lang team meeting, or nominating an issue to bring it to the lang
team&rsquo;s attention.</p>
<h2 id="how-many-active-shepherds-can-we-have-then">How many active shepherds can we have then?</h2>
<p>It is important to emphasize that <strong>having a willing shepherd is not
necessarily enough to unblock a project</strong>. This is because, as I noted
above, assigning a shepherd is also a commitment on the part of the
lang-team &ndash; a commitment to review progress, resolve conflicts, and
keep up with things. That puts a kind of informal cap on how many
active things can be occurring, even if there are shepherds to
spare. This is particularly true for subtle things. This cap is
probably somewhat fundamental &ndash; even increasing the size of the lang
team wouldn&rsquo;t necessarily change it that much.</p>
<p>I don&rsquo;t know how many shepherds we should have at a time; I think
we&rsquo;ll have to work that out by experience, but I do think we should be
starting small, with a handful of items at a time. I&rsquo;d much rather we
are consistently making progress on a few things than spreading
ourselves too thin.</p>
<h2 id="expectations-for-a-shepherd">Expectations for a shepherd</h2>
<p>I think the expectations for a shepherd are as follows.</p>
<p>First, they should <strong>prepare updates for the lang team meeting</strong> on a
weekly basis. This doesn&rsquo;t have to be a
long detailed write-up &ndash; even a &ldquo;no update&rdquo; suffices.</p>
<p>Second, when a design concern or conflict arises, they should help to
see it resolved. This means a few things. First and foremost, they
have to work to <strong>understand and document the considerations at
play</strong>, and be prepared to summarize those. (Note: they don&rsquo;t
necessarily have to do all this work themselves! I would like to see
us making more use of <a href="http://smallcultfollowing.com/babysteps/blog/2019/04/22/aic-collaborative-summary-documents/">collaborative summary documents</a>, which allow
us to share the work of documenting concerns.)</p>
<p>They should also work to help resolve the conflict, possibly by
scheduling one-off meetings or through other means. I won&rsquo;t go into
too much detail here because I think looking into how best to resolve
design conflicts is worthy of a separate post.</p>
<p>Finally, while this is not a firm requirement, I expect that
shepherds will become experts in their area, and would thus be able to
give useful advice about similar topics in the future (even if they
are not actively shepherding that area anymore).</p>
<h2 id="expectations-from-the-lang-team">Expectations from the lang team</h2>
<p>I want to emphasize this part of things. I think the lang team suffers
from the problem of doing too many things at once. Part of agreeing
that someone should shepherd topic X, I think, is agreeing that we
should be making progress on topic X.</p>
<p>This implies that the team agrees to follow along with the status
updates and give moderate amounts of feedback when requested.</p>
<p>Of course, as the design progresses, it is natural that lang team
members will have concerns about various aspects. Just as today, we
operate on a consensus basis, so resolving those concerns is needed to
make progress. When an item has an active shepherd, though, that means
it is a priority, and this implies then that lang team members with
blocking concerns should make time to work with the shepherd and get
them resolved. (And, as is always the case, this may mean accepting an
outcome that you don&rsquo;t personally agree with, if the rest of the team
is leaning the other way.)</p>
<h2 id="conclusion">Conclusion</h2>
<p>So, that&rsquo;s it! In the end, the specifics of what I propose are the following:</p>
<ul>
<li>We&rsquo;ll post on the <a href="https://github.com/rust-lang/lang-team/">lang team repository</a> the list of active shepherds and their assigned areas.</li>
<li>In order for a formal decision to be made (e.g., stabilization
proposal accepted, RFC accepted, etc), a shepherd must be assigned.
<ul>
<li>This happens at the lang team meeting. We should prepare a list of
factors to take into account when making this decision, but one of
the key ones is whether we agree as a team that this is something
that is high enough priority that we can devote the required
energy to seeing it progress.</li>
</ul>
</li>
<li>Shepherds will keep the lang-team updated on major developments and help to resolve
conflicts that arise, with the cooperation of the lang-team, as described above.
<ul>
<li>If a shepherd seems inactive for a long time, we&rsquo;ll discuss if
that&rsquo;s a problem.</li>
</ul>
</li>
</ul>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I could not find any single page that documents Rust&rsquo;s feature process from beginning to end. Seems like something we should fix. But what I mean by moving a feature forward is basically things like &ldquo;accepting an RFC&rdquo; or &ldquo;stabilizing an unstable feature&rdquo; &ndash; basically the formal decisions governed by the lang team.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/consensus" term="consensus" label="Consensus"/></entry><entry><title type="html">AiC: Unbounded queues and lang design</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/07/10/aic-unbounded-queues-and-lang-design/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/07/10/aic-unbounded-queues-and-lang-design/</id><published>2019-07-10T00:00:00+00:00</published><updated>2019-07-10T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I have been thinking about how language feature development works in
Rust<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. I wanted to write a post about what I see as one of the
key problems: too much concurrency in our design process, without any
kind of &ldquo;back-pressure&rdquo; to help keep the number of &ldquo;open efforts&rdquo;
under control. This setup does enable us to get a lot of things done sometimes,
but I believe it also leads to a number of problems.</p>
<p>Although I don&rsquo;t make any proposals in this post, I am basically
advocating for changes to our process that can help us to stay focused
on a few active things at a time. Basically, incorporating a notion of
<strong>capacity</strong> such that, if we want to start something new, we either
have to finish up with something or else find a way to grow our
capacity.</p>
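<p>In channel terms, this is roughly the difference between an unbounded queue and a bounded one. The sketch below is only an analogy, built on the standard library&rsquo;s <code>sync_channel</code>; the <code>admit</code> function and the idea of treating each channel slot as one &ldquo;active effort&rdquo; are hypothetical, made up for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::sync::mpsc::{sync_channel, TrySendError};

// Hypothetical model: each slot in the bounded channel is one active effort.
// Returns (efforts that were admitted, efforts that were turned away).
fn admit(capacity: usize, proposals: &amp;[&amp;str]) -&gt; (Vec&lt;String&gt;, Vec&lt;String&gt;) {
    let (tx, rx) = sync_channel::&lt;String&gt;(capacity);
    let mut deferred = Vec::new();
    for name in proposals {
        match tx.try_send(name.to_string()) {
            Ok(()) =&gt; {}
            // Back-pressure: at capacity, the sender hears "not now"
            // instead of silently joining an ever-growing queue.
            Err(TrySendError::Full(name)) =&gt; deferred.push(name),
            Err(TrySendError::Disconnected(_)) =&gt; unreachable!("receiver is still alive"),
        }
    }
    // Close the sending side so that draining the receiver terminates.
    drop(tx);
    let active: Vec&lt;String&gt; = rx.iter().collect();
    (active, deferred)
}
</code></pre></div>
<p>With a capacity of two, a third proposal is deferred rather than queued. The only ways to admit it are to receive (finish) one of the active items first or to construct the channel with a larger capacity.</p>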
<h3 id="the-feature-pipeline">The feature pipeline</h3>
<p>Consider how a typical language feature gets introduced today:</p>
<ul>
<li><strong>Initial design</strong> in the form of an RFC. This is done by the <strong>lang team</strong>.</li>
<li><strong>Initial implementation</strong> is done. This work is overseen by the
<strong>compiler team</strong>, but often it is done by a volunteer contributor
who is not themselves affiliated.</li>
<li><strong>Documentation</strong> work is done, again often by a contributor,
overseen by the docs team.</li>
<li><strong>Experimentation in nightly</strong> takes place, often leading to
changes in the design.  (These changes have their own FCP periods.)</li>
<li>Finally, at some point, we <strong>stabilize</strong> the feature. This involves
a stabilization report that summarizes what has changed, known bugs,
what tests exist, and other details. This decision is made by the
<strong>lang team</strong>.</li>
</ul>
<p>At any given time, therefore, we have a number of features at each
point in the pipeline &ndash; some are being designed, some are waiting for
an implementor to show up, etc.</p>
<h3 id="today-we-have-unbounded-queues">Today we have unbounded queues</h3>
<p>One of the challenges is that the &ldquo;links&rdquo; between these pipeline stages are
effectively <strong>unbounded queues</strong>. It&rsquo;s not uncommon that we get an RFC
for a piece of design that &ldquo;seems good&rdquo;. The RFC gets accepted. But
nobody is really <strong>driving</strong> that work &ndash; as a result, it simply
languishes.  To me, the poster child for this is <a href="https://rust-lang.github.io/rfcs/0066-better-temporary-lifetimes.html">RFC 66</a> &ndash; a modest
change to our rules around the lifetime of temporary values. I still
think the RFC is a good idea (although its wording is very imprecise
and it needs to be rewritten to be made precise). But it&rsquo;s been
<a href="https://github.com/rust-lang/rust/issues/15023">sitting around unimplemented</a> since <strong>June of 2014</strong>. At this
point, is the original decision approving the RFC even still valid? (I
sort of think no, but we don&rsquo;t have a formal rule about that.)</p>
<h3 id="how-can-an-rfc-sit-around-for-5-years">How can an RFC sit around for 5 years?</h3>
<p>Why did this happen? I think the reason is pretty clear: the idea was
good, but it didn&rsquo;t align with any particular priority. We didn&rsquo;t have
resources lined up behind it. It needed somebody from the lang team
(probably me) to rewrite its text to be actionable and
precise<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. It needed somebody from the compiler team (maybe me
again) to either write a PR or mentor somebody through it. And all
those people were busy doing other things. So why did we accept the RFC
in the first place? Well, <strong>why wouldn&rsquo;t we?</strong> Nothing in the process
states that we should consider available resources when making an RFC
decision.</p>
<h3 id="unbounded-queues-lead-to-confusion-for-users">Unbounded queues lead to confusion for users</h3>
<p>So why does it matter when things sit around? I think it has a number
of negative effects. The most obvious is that it sends really
confusing signals to people trying to follow along with Rust&rsquo;s
development. It&rsquo;s really hard to tell what the current priorities are;
it&rsquo;s hard to tell when a given feature might actually appear. Some of
this we can help resolve just by better labeling and documentation.</p>
<h3 id="unbounded-queues-make-it-harder-for-teams">Unbounded queues make it harder for teams</h3>
<p>But there are other, more subtle effects. Overall, it makes it much
harder for the team itself to stay organized and focused and that in
turn can create a lot of stress. Stress in turn magnifies all other
problems.</p>
<p>How does it make it harder to stay organized? Under the current setup,
people can add new entries into any of these queues at basically any
time. This can come in many forms, such as new RFCs (new design work
and discussion), proposed changes to an existing design (new design or
implementation work), etc.</p>
<p>Just having a large number of existing issues means that, in a very
practical sense, it becomes challenging to follow GitHub notifications
or stay on top of all the things going on. I&rsquo;ve lost count of the
number of attempts I&rsquo;ve made at this personally.</p>
<p>Finally, the fact that design work stretches over such long periods
(frequently years!) makes it harder to form stable communities of
people that can dig deeply into an issue, develop a rapport, and reach
a consensus.</p>
<h3 id="leaving-room-for-serendipity">Leaving room for serendipity?</h3>
<p>Still, there&rsquo;s a reason that we set up the system the way we did. This
setup can really be a great fit for an open source project. After all,
in an open source project, it can be <strong>really hard for us to figure
out how many resources we actually have</strong>. It&rsquo;s certainly more than
the number of folks on the teams. It happens pretty regularly that
people appear out of the blue with an amazing PR implementing some
feature or other &ndash; and we had no idea they were working on it!</p>
<p>In the <a href="https://www.youtube.com/watch?v=J9OFQm8Qf1I">2018 RustConf keynote</a>, we talked about the contrast between
<strong>OSS by serendipity</strong> and <strong>OSS on purpose</strong>. We were highlighting
exactly this tension: on the one hand, Rust is a product, and like any
product it needs direction. But at the same time, we want to enable
people to contribute as much as we can.</p>
<h3 id="reviewing-as-the-limited-resource">Reviewing as the limited resource</h3>
<p>Still, while the existing setup helps ensure that there are many
opportunities for people to get involved, it also means that people
who come with a new idea, PR, or whatever may wind up waiting a long
time to get a response. Often the people who are supposed to answer
are just busy doing other things. Sometimes, there is an (often
unspoken) understanding that a given issue is just not high enough
priority to worry about.</p>
<p>In an OSS project, therefore, I think that the right way to measure
capacity is in terms of <strong>reviewer bandwidth</strong>. Here I mean &ldquo;reviewer&rdquo;
in a pretty general way. It might be someone who reviews a PR, but it
might also be a lang team member who is helping to drive a particular
design forward.</p>
<h3 id="leaving-room-for-new-ideas">Leaving room for new ideas?</h3>
<p>One other thing I&rsquo;ve noticed that&rsquo;s worth highlighting is that,
sometimes, hard ideas just need time to bake. Trying to rush something
through the design process can be a bad idea.</p>
<p>Consider specialization: On the one hand, this feature was <a href="https://github.com/rust-lang/rfcs/pull/1210">first
proposed</a> in July of <strong>2015</strong>. We had a lot of really
important debate at the time about the importance of parametricity and
so forth. We have an initial implementation. But there was one key
issue that never got satisfactorily resolved, a technical soundness
concern around lifetimes and traits. As such, the issue has sat around
&ndash; it would get periodically discussed but we never came to a
satisfactory conclusion. Then, in Feb of <strong>2018</strong>, <a href="http://smallcultfollowing.com/babysteps/blog/2018/02/09/maximally-minimal-specialization-always-applicable-impls/">I had an idea</a>
which <a href="http://aturon.github.io/tech/2018/04/05/sound-specialization/">aturon then extended</a> in April. It <em>seems</em> like these ideas
have basically solved the problem, but we&rsquo;ve been busy in the meantime
and haven&rsquo;t had time to follow up.</p>
<p>This is a tricky case: maybe if we had tried to push specialization
all the way to stabilization, we would have had these same ideas. But
maybe we wouldn&rsquo;t have. Overall, I think that deciding to wait has
worked out reasonably well for us, but probably not <em>optimally</em>. I
think in an ideal world we would have found some <em>useful subset</em> of
specialization that we could stabilize, while deferring the tricky
questions.</p>
<h3 id="tabling-as-an-explicit-action">Tabling as an explicit action</h3>
<p>Thinking about specialization leads to an observation: one of the
things we&rsquo;re going to have to figure out is how to draw good
boundaries so that we can push out a useful subset of a feature (an
&ldquo;MVP&rdquo;, if you will) and then leave the rest for later. Unlike today,
though, I think this should be an explicit process, where we take the time
to document the problems we still see and our current understanding of
the space, and then explicitly &ldquo;table&rdquo; the remainder of the work for
another time.</p>
<h3 id="people-need-help-to-set-limits">People need help to set limits</h3>
<p>One of the things I think we should put into our system is some kind
of <strong>hard cap</strong> on the number of things you can do at any given time.
I&rsquo;d like this cap to be pretty small, like one or two. This will be
frustrating. It will be tempting to say &ldquo;sure I&rsquo;m working on X, but I
can make a little time for Y too&rdquo;. It will also slow us down a bit.</p>
<p>But I think that&rsquo;s ok. We can afford to do a few fewer things. Or, if
it seems like we can&rsquo;t, that&rsquo;s probably a sign that we need to grow
that capacity: find more people we trust to do the reviews and lead
the process. If we can&rsquo;t do that, then we have to adjust our ambitions.</p>
<p>In other words, in the absence of a cap, it is very easy to &ldquo;stretch&rdquo;
to achieve our goals. That&rsquo;s what we&rsquo;ve done often in the past. But
you can only stretch so far and for so long.</p>
<h3 id="conclusion">Conclusion</h3>
<p>As I wrote in the beginning, I&rsquo;m not making any proposals in this
post, just sharing my current thoughts. I&rsquo;d like to hear if you think
I&rsquo;m onto something here, or heading in the wrong direction. Here is a
link to the <a href="https://internals.rust-lang.org/t/aic-adventures-in-consensus/9843">Adventures in Consensus thread on internals</a>.</p>
<p>One thing that has been pointed out to me is that these ideas resemble
a number of management philosophies, most notably <a href="https://en.wikipedia.org/wiki/Kanban">kanban</a>. I don&rsquo;t
have much experience with that personally but it makes sense to me
that others would have tried to tackle similar issues.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I&rsquo;m coming at this from the perspective of the lang team, but I think a lot of this applies more generally.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>For that matter, it would be helpful if there were a spec of the current behavior for it to build off of.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/consensus" term="consensus" label="Consensus"/></entry><entry><title type="html">Async-await status report #2</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/07/08/async-await-status-report-2/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/07/08/async-await-status-report-2/</id><published>2019-07-08T00:00:00+00:00</published><updated>2019-07-08T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I wanted to give an update on the status of the &ldquo;async-await
foundations&rdquo; working group. This post aims to cover three things:</p>
<ul>
<li>the &ldquo;async await MVP&rdquo; that we are currently targeting;</li>
<li>how that fits into the bigger picture;</li>
<li>and how you can help, if you&rsquo;re so inclined.</li>
</ul>
<h2 id="current-target-async-await-mvp">Current target: async-await MVP</h2>
<p>We are currently working on stabilizing what we call the <strong>async-await
MVP</strong> &ndash; as in, &ldquo;minimal viable product&rdquo;. As the name suggests, the
work we&rsquo;re doing now is basically the minimum that is needed to
&ldquo;unlock&rdquo; async-await. After this work is done, it will be easier to
build async I/O based applications in Rust, though a number of rough
edges remain.</p>
<p>The MVP consists of the following pieces:</p>
<ul>
<li>the <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code>Future</code></a> trait, which defines the core future protocol (stabilized in <a href="https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html">1.36.0</a>!);</li>
<li>basic async-await syntax;</li>
<li>a &ldquo;first edition&rdquo; of <a href="https://rust-lang.github.io/async-book/index.html">the &ldquo;async Rust&rdquo; book</a>.</li>
</ul>
<h3 id="the-future-trait">The future trait</h3>
<p>The first of these bullets, the future trait, was stabilized in the
<a href="https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html">1.36.0</a> release. This is important because the <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code>Future</code></a> trait is the
core building block for the whole Async I/O ecosystem. Having a stable
future trait means that we can begin the process of consolidating the
ecosystem around it.</p>
<h3 id="basic-async-await-syntax">Basic async-await syntax</h3>
<p>Now that the future trait is stable, the next step is to stabilize the
basic &ldquo;async-await&rdquo; syntax. We are presently shooting to stabilize
this in 1.38. We&rsquo;ve finished the largest work items, but there are
still a number of things left to get done before that date &ndash; if
you&rsquo;re interested in helping out, see the &ldquo;how you can help&rdquo; section
at the end of this post!</p>
<p>The current support we are aiming to stabilize permits <code>async fn</code>, but
only outside of traits and trait implementations. This means that you
can write free functions like this one:<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// When invoked, returns a future that (once awaited) will yield back a result:
</span></span></span><span class="line"><span class="cl"><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span>: <span class="nc">TcpStream</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Error</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="mi">0</span><span class="k">u8</span><span class="p">;</span><span class="w"> </span><span class="mi">1024</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Await data from the stream:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="p">).</span><span class="k">await</span><span class="o">?</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>or inherent methods:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyType</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Same as above, but defined as a method on `MyType`:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">async</span><span class="w"> </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="n">data</span>: <span class="nc">TcpStream</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span><span class="w"> </span><span class="n">Error</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You can also write async blocks, which generate a future &ldquo;in place&rdquo;
without defining a separate function. These are particularly useful to
pass as arguments to helpers like <a href="https://docs.rs/runtime/0.3.0-alpha.5/runtime/fn.spawn.html"><code>runtime::spawn</code></a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span>: <span class="nc">TcpStream</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">runtime</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">async</span><span class="w"> </span><span class="k">move</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="mi">0</span><span class="k">u8</span><span class="p">;</span><span class="w"> </span><span class="mi">1024</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">buf</span><span class="p">).</span><span class="k">await</span><span class="o">?</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">})</span><span class="w">
</span></span></span></code></pre></div><p>Eventually, we plan to permit <code>async fn</code> in other places, but there
are some complications to be resolved first, as will be discussed
shortly.</p>
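<p>To make the two forms concrete, here is a minimal sketch that runs with only the standard library. Everything specific in it is an assumption made for illustration: <code>FakeStream</code> is a hypothetical in-memory stand-in for <code>TcpStream</code>, and <code>block_on</code> is a toy busy-polling executor, not what <code>runtime</code> or any real executor does.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for futures that never actually suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical toy executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical in-memory reader standing in for the post's `TcpStream`.
struct FakeStream {
    bytes: Vec<u8>,
}

impl FakeStream {
    // Copies as many bytes as fit into `buf` and reports how many were copied.
    async fn read(&mut self, buf: &mut [u8]) -> Result<usize, String> {
        let n = self.bytes.len().min(buf.len());
        let rest = self.bytes.split_off(n);
        buf[..n].copy_from_slice(&self.bytes);
        self.bytes = rest;
        Ok(n)
    }
}

// A free `async fn`: calling it only builds a future; nothing runs until
// that future is awaited (or handed to an executor).
async fn process(mut data: FakeStream) -> Result<usize, String> {
    let mut buf = vec![0u8; 1024];
    let len = data.read(&mut buf).await?;
    Ok(len)
}

// An async block builds the same kind of future "in place".
fn process_in_block(mut data: FakeStream) -> impl Future<Output = Result<usize, String>> {
    async move {
        let mut buf = vec![0u8; 4];
        let len = data.read(&mut buf).await?;
        Ok::<usize, String>(len)
    }
}
```

<p>With a five-byte stream, <code>block_on(process(stream))</code> yields <code>Ok(5)</code>, while the async-block version, which reads into a four-byte buffer, yields <code>Ok(4)</code>.</p>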
<h3 id="the-async-book">The async book</h3>
<p>One of the goals of this stabilization is that, once async-await
syntax becomes available, there should be <strong>really strong
documentation to help people get started</strong>. To that end, we&rsquo;re
rejuvenating <a href="https://rust-lang.github.io/async-book/index.html">the &ldquo;async Rust&rdquo; book</a>. This book covers the nuts
and bolts of Async I/O in Rust, ranging from simple examples with
<code>async fn</code> all the way down to the details of how the future trait
works, writing your own executors, and so forth. Take a look!</p>
<p>(Eventually, I expect some of this material may make its way into more
standard books like <a href="https://doc.rust-lang.org/book/">The Rust Programming Language</a>, but in the
meantime we&rsquo;re evolving it separately.)</p>
<h2 id="future-work-the-bigger-picture">Future work: the bigger picture</h2>
<p>The current stabilization push, as I mentioned above, is aimed at
getting an MVP stabilized &ndash; just enough to enable people to run off
and start to build things. So you&rsquo;re probably wondering, what are some
of the things that come next? Here is an (incomplete) list of possible
future work:</p>
<ul>
<li><strong>A core set of async traits and combinators.</strong> Basically a 1.0
version of the <a href="https://github.com/rust-lang-nursery/futures-rs">futures-rs repository</a>, offering key interfaces
like <code>AsyncRead</code>.</li>
<li><strong>Better stream support.</strong> The <a href="https://github.com/rust-lang-nursery/futures-rs">futures-rs repository</a> contains
a <code>Stream</code> trait, but some &ldquo;support work&rdquo; remains to make it
easier to use. This may include <a href="https://boats.gitlab.io/blog/post/for-await-i/">some form of for-await
syntax</a> (although that is not a given).</li>
<li><strong>Generators and async generators.</strong> The same core compiler
transform that enables async await should enable us to support
Python- or JS-like generators as a way to write iterators. Those
same generators can then be made asynchronous to produce streams of
data.</li>
<li><strong>Async fn in traits and trait impls.</strong> Writing generic crates and
interfaces that work with <code>async fn</code> is possible in the MVP, but not
as clean or elegant as it could be. Supporting <code>async fn</code> in traits
is an obvious extension to make that nicer, though we have to figure
out all of the interactions with the rest of the trait system.</li>
<li><strong>Async closures.</strong> We would like to support the obvious <code>async ||</code>
syntax that would generate a closure. This may require tinkering
with the <code>Fn</code> trait hierarchy.</li>
</ul>
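<p>To give a feel for what &ldquo;possible, but not as clean&rdquo; means for the last two items, here is a rough sketch of the usual workarounds: a trait method that returns a boxed future in place of an <code>async fn</code>, and an ordinary closure that returns an async block in place of <code>async ||</code>. The names <code>Fetch</code>, <code>LengthFetcher</code>, <code>total_len</code>, and <code>doubled</code> are hypothetical, and <code>block_on</code> is a toy busy-polling executor included only so the example is self-contained.</p>

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for futures that never actually suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical toy executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical trait: instead of `async fn fetch`, the method returns a
// boxed, type-erased future that borrows from `self` and `key`.
trait Fetch {
    fn fetch<'a>(&'a self, key: &'a str) -> Pin<Box<dyn Future<Output = usize> + 'a>>;
}

struct LengthFetcher;

impl Fetch for LengthFetcher {
    fn fetch<'a>(&'a self, key: &'a str) -> Pin<Box<dyn Future<Output = usize> + 'a>> {
        // The async block is the body an `async fn` would have had.
        Box::pin(async move { key.len() })
    }
}

// Generic code can await the trait method like any other future.
async fn total_len<F: Fetch>(fetcher: &F, keys: &[&str]) -> usize {
    let mut total = 0;
    for &key in keys {
        total += fetcher.fetch(key).await;
    }
    total
}

// Stand-in for an async closure: a plain closure that returns an async block.
async fn doubled(x: u32) -> u32 {
    let double = |n: u32| async move { n * 2 };
    double(x).await
}
```

<p>The cost of the trait workaround is a heap allocation and dynamic dispatch per call, which is part of why first-class support would be nicer.</p>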
<h2 id="how-you-can-get-involved">How you can get involved</h2>
<p>There&rsquo;s been a lot of great work on the <code>async fn</code> implementation
since my first post &ndash; we&rsquo;ve closed over <a href="https://github.com/rust-lang/rust/issues?q=is%3Aissue+label%3AAsyncAwait-Blocking+is%3Aclosed">40 blocker issues</a>!  I want
to give a special shout out to the folks who worked on those
issues:<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<ul>
<li><strong>davidtwco</strong> reworked the desugaring so that the drop order for
parameters in an <code>async fn</code> and <code>fn</code> is analogous, and then
heroically fixed a number of minor bugs that were filed as fallout
from this change.</li>
<li><strong>tmandry</strong> dramatically reduced the size of futures at runtime.</li>
<li><strong>gilescope</strong> improved a number of error messages and helped to reduce
errors.</li>
<li><strong>matthewjasper</strong> reworked some details of the compiler transform to
solve a large number of ICEs.</li>
<li><strong>doctorn</strong> fixed an ICE when <code>await</code> was used in inappropriate places.</li>
<li><strong>centril</strong> has been helping to enumerate tests and with
triage in general.</li>
<li><strong>cramertj</strong> implemented the <code>await</code> syntax, wrote a bunch of tests,
and, of course, did all of the initial implementation work.</li>
<li>and hey, I extended the region inferencer to support multiple
lifetime parameters. I guess I get some credit too. =)</li>
</ul>
<p>If you&rsquo;d like to help push <code>async fn</code> over the finish line, take a
look at our <a href="https://github.com/rust-lang/rust/labels/AsyncAwait-Blocking">list of blocking issues</a>. Anything that is not
assigned is fair game! Just find an issue you like that is not
assigned and use <a href="https://github.com/rust-lang/triagebot/wiki/Assignment"><code>@rustbot claim</code></a> to claim it. You can find
out more about how our working group works on <a href="https://github.com/rust-lang/compiler-team/tree/master/working-groups/async-await">the async-await working
group page</a>. In particular, that page includes a link to the
<a href="https://calendar.google.com/calendar/r/eventedit/copy/NjQzdWExaDF2OGlqM3QwN2hncWI5Y2o1dm5fMjAxOTA2MTFUMTcwMDAwWiA2dTVycnRjZTZscnR2MDdwZmkzZGFtZ2p1c0Bn/bmlrb21hdHNha2lzQGdtYWlsLmNvbQ?scp=ALL&amp;pli=1&amp;sf=true">calendar event</a> for our weekly meeting, which takes place in the
<a href="https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-async-foundations"><code>#wg-async-foundations</code> channel on the rust-lang Zulip</a> &ndash;
the next meeting is tomorrow (Tuesday)! But feel free to drop in any
time with questions.</p>
<h2 id="footnotes">Footnotes</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Sadly, it seems like <a href="https://github.com/rouge-ruby/rouge">rouge</a> hasn&rsquo;t been updated yet to highlight the async or await keywords. Or maybe I just don&rsquo;t understand how to upgrade it. =)&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I culled this list by browsing the closed issues and who they were assigned to. I&rsquo;m sorry if I forgot someone or minimized your role! Let me know and I&rsquo;ll edit the post. &lt;3&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/asyncawait" term="asyncawait" label="AsyncAwait"/></entry><entry><title type="html">AiC: Language-design team meta working group</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/04/26/aic-language-design-team-meta-working-group/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/04/26/aic-language-design-team-meta-working-group/</id><published>2019-04-26T00:00:00+00:00</published><updated>2019-04-26T00:00:00+00:00</updated><content type="html"><![CDATA[<p>On internals, I <a href="https://internals.rust-lang.org/t/announcing-lang-team-meta-working-group/9900">just announced</a> the formation of the
language-design team meta working group. The role of the meta working
group is to figure out how other language-design team working groups
should work. The plan is to begin by enumerating some of our goals &ndash;
the problems we aim to solve, the good things we aim to keep &ndash; and
then move on to draw up more detailed plans. I expect this discussion
will intersect the RFC process quite heavily (at least when it comes
to language design changes). Should be interesting! It&rsquo;s all happening
in the open, and a major goal of mine is for this to be easy to follow
along with from the outside &ndash; so if talking about talking is your
thing, you should <a href="https://internals.rust-lang.org/t/announcing-lang-team-meta-working-group/9900">check it out</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/consensus" term="consensus" label="Consensus"/></entry><entry><title type="html">AiC: Collaborative summary documents</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/04/22/aic-collaborative-summary-documents/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/04/22/aic-collaborative-summary-documents/</id><published>2019-04-22T00:00:00+00:00</published><updated>2019-04-22T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In my <a href="http://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/">previous post</a>, I talked about the idea of <em>mapping the
solution space</em>:</p>
<blockquote>
<p>When we talk about the RFC process, we always emphasize that the point
of RFC discussion <strong>is not to select the best answer</strong>; rather, the
point is to <strong>map the solution space</strong>. That is, to explore what the
possible tradeoffs are and to really look for alternatives.  This
mapping process also means exploring the ups and downs of the current
solutions on the table.</p>
</blockquote>
<p><strong>One of the challenges I see with how we often do design is that this
&ldquo;solution space&rdquo; is actually quite implicit.</strong> We are exploring it
through comments, but each comment is only tracing out one path
through the terrain. I wanted to see if we could try to represent the
solution space explicitly. This post is a kind of &ldquo;experience report&rdquo;
on one such experiment, what I am calling a <strong>collaborative summary
document</strong> (in contrast to the more standard <strong>summary comment</strong> that
we often do).</p>
<h3 id="the-idea-a-collaborative-summary-document">The idea: a collaborative summary document</h3>
<p>I&rsquo;ll get into the details below, but the basic idea was to create a
shared document that tried to present, in a neutral fashion, the
arguments for and against a particular change. I asked people to
stop commenting in the thread and instead read over the document, look
for things they disagreed with, and offer suggestions for how it could
be improved.</p>
<p>My hope was that we could not only get a thorough summary from the
process, but also do something deeper: change the <em>focus</em> of the
conversation from &ldquo;advocating for a particular point of view&rdquo; towards
&ldquo;trying to ensure a complete and fair summary&rdquo;. I figured that after
this period was done, people were likely to go back to being advocates
for their position, but at least for some time we could try to put
those feelings aside.</p>
<h3 id="so-how-did-it-go">So how did it go?</h3>
<p>Overall, I felt very positive about the experience and I am keen to
try it again. I think that something like &ldquo;collaborative summary
documents&rdquo; could become a standard part of our process. Still, I think
it&rsquo;s going to take some practice trying this a few times to figure out
the best structure. Moreover, I think it is not a silver bullet: to
realize the full potential, we&rsquo;re going to have to make other changes
too.</p>
<h3 id="what-i-did-in-depth">What I did in depth</h3>
<p>What I did more specifically was to <a href="https://paper.dropbox.com/doc/Future-proof-the-Futures-API-Summary--AbplHExNn34jm1~y2i02FYARAg-JODniiQQQcNhHD7iNZ8iM">create a Dropbox Paper
document</a>. <a href="https://paper.dropbox.com/doc/Future-proof-the-Futures-API-Summary--AbplHExNn34jm1~y2i02FYARAg-JODniiQQQcNhHD7iNZ8iM">This document</a> contained my best effort at
summarizing the issue at hand, but it was not meant to be just my
work. The idea was that we would all jointly try to produce the best
summary we could.</p>
<p>After that, I <a href="https://github.com/rust-lang/rust/pull/59119#issuecomment-473655294">made an announcement</a> on the original thread asking
people to participate in the document. Specifically, <a href="https://paper.dropbox.com/doc/Future-proof-the-Futures-API-Summary--AbpsrNFMirDHgOtZimF11AEUAg-JODniiQQQcNhHD7iNZ8iM#:uid=739200718850749032543986&amp;h2=This-is-an-experiment">as the document
states</a>, the idea was for people to do something like this:</p>
<ul>
<li>Read the document, looking for things they didn&rsquo;t agree with or felt were unfairly represented.</li>
<li>Leave a comment explaining their concern; or, better, supplying alternate wording that they <em>did</em> agree with
<ul>
<li>The intention was always to preserve what they felt was the sense
of the initial comment, but to make it more precise or less judgemental.</li>
</ul>
</li>
</ul>
<p>I was then playing the role of editor, taking these comments and
trying to incorporate them into the whole. The idea was that, as
people edited the document, we would gradually approach a <strong>fixed point</strong>,
where there was nothing left to edit.</p>
<h3 id="structure-of-the-shared-document">Structure of the shared document</h3>
<p>Initially, when I created the document, I structured it into two
sections &ndash; basically &ldquo;pro&rdquo; and &ldquo;con&rdquo;. The issue at hand was a
particular change to the Futures API (the details don&rsquo;t matter
here). In this case, the first section advocated <strong>for</strong> the change,
and the second section advocated against it. So, something like this
(for a fictional problem):</p>
<blockquote>
<p><strong>Pro:</strong></p>
<p>We should make this change because of X and Y. The options
we have now (X1, X2) aren&rsquo;t satisfying because of problem Z.</p>
<p><strong>Con:</strong></p>
<p>This change isn&rsquo;t needed. While it would make X easier, there are
already other useful ways to solve that problem (such as X1, X2).
Similarly, the goal of Y isn&rsquo;t very desirable in the first
place because of A, B, and C.</p>
</blockquote>
<p>I quickly found this structure rather limiting. It made it hard to
compare the arguments &ndash; as you can see here, there are often
&ldquo;references&rdquo; between the two sections (e.g., the con section refers to
the argument X and tries to rebut it). Trying to compare and consider
these points required a lot of jumping back and forth between the
sections.</p>
<h3 id="using-nested-bullets-to-match-up-arguments">Using nested bullets to match up arguments</h3>
<p>So I decided to restructure the document to integrate the arguments
for and against. I created nesting to show when one point was directly
in response to another. For example, it might read like this (this is
not an actual point; those were much more detailed):</p>
<ul>
<li><strong>Pro:</strong> We should make this change because of X.
<ul>
<li><strong>Con:</strong> However, there is already the option of X1 and X2 to satisfy that use-case.
<ul>
<li><strong>Pro:</strong> But X1 and X2 suffer from Z.</li>
</ul>
</li>
</ul>
</li>
<li><strong>Pro:</strong> We should make this change because of Y and Z.
<ul>
<li><strong>Con:</strong> Those goals aren&rsquo;t as important because of A, B, and C.</li>
</ul>
</li>
</ul>
<p>Furthermore, I tried to make the first bullet point a bit special &ndash;
it would be the one that encapsulated the <strong>heart</strong> of the dispute,
from my POV, with the later bullet points getting progressively more
into the weeds.</p>
<h3 id="nested-bullets-felt-better-but-we-can-do-better-still-i-bet">Nested bullets felt better, but we can do better still I bet</h3>
<p>I definitely preferred the structure of nested bullets to the original
structure, but it didn&rsquo;t feel perfect. For one thing, it required me
to summarize each argument into a single paragraph. Sometimes this
felt &ldquo;squished&rdquo;. I didn&rsquo;t love the repeated &ldquo;pro&rdquo; and &ldquo;con&rdquo;. Also,
things don&rsquo;t always fit neatly into a <em>tree</em>; sometimes I had to
&ldquo;cross-reference&rdquo; between points on the tree (e.g., referencing
another bullet that had a detailed look at the trade-offs).</p>
<p><strong>If I were to do this again,</strong> I might tinker a bit more with the
format. The most extreme option would be to try and use a &ldquo;wiki-like&rdquo;
format.  This would allow for free inter-linking, of course, and would
let us hide details into a recursive structure. But I worry it&rsquo;s <em>too
much</em> freedom.</p>
<h3 id="adding-narratives-on-top-of-the-core-facts">Adding &ldquo;narratives&rdquo; on top of the &ldquo;core facts&rdquo;</h3>
<p>One thing I found that surprised me a bit: the summary document aimed
to summarize the &ldquo;core facts&rdquo; of the discussion &ndash; in so doing, I
hoped to summarize the two sides of the argument. But I found that
<strong>facts alone cannot give a &ldquo;complete&rdquo; summary:</strong> to give a complete
summary, you also need to present those facts &ldquo;in context&rdquo;. Or, put
another way, you also need to explain the <em>weighting</em> that each side
puts on the facts.</p>
<p>In other words, the document did a good job of enumerating the various
concerns and &ldquo;facets&rdquo; of the discussion. But it didn&rsquo;t do a good job
of explaining <strong>why</strong> you might fall on one side or the other.</p>
<p>I tried to address this by <a href="https://github.com/rust-lang/rust/pull/59119#issuecomment-474444350">crafting a &ldquo;summary comment&rdquo;</a> on the main
thread. This comment had a very specific form. It began by trying to identify
the &ldquo;core tradeoff&rdquo; &ndash; the crux of the disagreement:</p>
<blockquote>
<p>So the core tradeoff here is this:</p>
<ul>
<li>By leaving the design as is, we keep it as simple and ergonomic as it can be;
<ul>
<li><strong>but</strong>, if we wish to pass <strong>implicit</strong> parameters to the future when polling, we must use TLS.</li>
</ul>
</li>
</ul>
</blockquote>
<p>It then identifies some of the &ldquo;facets&rdquo; of the space which different people weight
in different ways:</p>
<blockquote>
<p>So, which way you fall will depend on</p>
<ul>
<li>how important you think it is for <code>Future</code> to be ergonomic
<ul>
<li>and naturally how much of an ergonomic hit you believe this to be</li>
<li>how likely you think it is for us to want to add implicit parameters</li>
<li>how much of a problem you think it is to use TLS for those implicit parameters</li>
</ul>
</li>
</ul>
</blockquote>
<p><strong>And then it tried to tell a series of &ldquo;narratives&rdquo;.</strong> Basically to
tell the <strong>story</strong> of each group that was involved and <strong>why</strong> that
led them to assign different weights to those points above. Those
weights in turn led to a different opinion on the overall issue.</p>
<p>For example:</p>
<blockquote>
<p>I think a number of people feel that, by now, between Rust and other
ecosystems, we have a pretty good handle on what sort of data we
want to thread around and what the best way is to do it. Further,
they feel that TLS or passing parameters explicitly is the best
solution approach for those cases. Therefore, they prefer to leave
the design as is, and keep things simple. (More details in the doc,
of course.)</p>
</blockquote>
<p>Or, on the other side:</p>
<blockquote>
<p>Others, however, feel like there is additional data they want to
pass implicitly and they do not feel convinced that TLS is the best
choice, and that this concern outweighs the ergonomic
costs. Therefore, they would rather adopt the PR and keep our
options open.</p>
</blockquote>
<p>Finally, it&rsquo;s worth noting that there aren&rsquo;t always just two sides. In
fact, in this case I identified a third camp:</p>
<blockquote>
<p>Finally, I think there is a third position that says that this
controversy just isn&rsquo;t that important. The performance hit of TLS,
if you wind up using it, seems to be minimal. Similarly, the
clarity/ergonomics of <code>Future</code> are not as critical, as users who
write <code>async fn</code> will not implement it directly, and/or perhaps the
effect is not so large. These folks probably could go either way,
but would mostly like us to stop debating it and start building
stuff. =)</p>
</blockquote>
<p>One downside of writing the narratives in a standard summary comment
was that it was not &ldquo;part of&rdquo; the main document. In fact, it feels to
me like these narratives are a pretty key part of the whole thing.  In
fact, it was only once I added these narratives that I really felt I
started to <em>understand</em> why one might choose one way or the other when
it came to this decision.</p>
<p><strong>If I were to do this again,</strong> I would make <strong>narratives</strong> more of a
first-class entity in the document itself. I think I would also focus
on some other &ldquo;meta-level reasoning&rdquo;, such as <strong>fears and risks</strong>. I
think it&rsquo;s worth thinking, for any given decision, &ldquo;what if we make
the wrong call&rdquo; &ndash; e.g., in this case, what happens if we decide <em>not</em>
to future proof, but then we regret it; in contrast, what happens if
we decide to <em>add</em> future proofing, but we never use it.</p>
<h3 id="we-never-achieved-shared-ownership-of-the-summary">We never achieved &ldquo;shared ownership&rdquo; of the summary</h3>
<p>One of my goals was that we could, at least for a moment, disconnect
people from their particular position and turn their attention towards
the goal of achieving a shared and complete summary. I didn&rsquo;t feel
that we were very successful in this goal.</p>
<p>For one thing, most participants simply left comments on parts they
disagreed with; they didn&rsquo;t themselves suggest alternate wording. That
meant that I personally had to take their complaint and try to find
some &ldquo;middle ground&rdquo; that accommodated the concern but preserved the
original point. This was stressful for me and a lot of work. <strong>More
importantly, it meant that most people continued to interact with the
document as <em>advocates</em> for their point-of-view, rather than trying to
step back and advocate for the completeness of the summary.</strong></p>
<p>In other words: when you see a sentence you disagree with, it is easy
to say that you disagree with it. It is much harder to rephrase it in
a way that you <em>do</em> agree with &ndash; but which still preserves (what you
believe to be) the original intent. Doing so requires you to think
about what the other person likely meant, and how you can preserve
that.</p>
<p>However, one possible reason that people may have been reluctant to
offer suggestions is that, often, it was hard to make &ldquo;small edits&rdquo;
that addressed people&rsquo;s concerns. Especially early on, I found that,
in order to address some comment, I would have to make larger
restructurings. For example, taking a small sentence and expanding it
to a bullet point of its own.</p>
<p>Finally, some people who were active on the thread didn&rsquo;t participate
in the doc. Or, if they did, they did so by leaving comments on the
original GitHub thread. This is not surprising: I was asking people to
do something new and unfamiliar. Also, this whole process played out
relatively quickly, and I suspect some people just didn&rsquo;t even <em>see</em>
the document before it was done.</p>
<p><strong>If I were to do this again,</strong> I would want to start it earlier in
the process. I would also want to consider synchronous meetings, where
we could try to process edits as a group (but I think it would take
some thought to figure out how to run such a meeting).</p>
<p>In terms of functioning asynchronously, I would probably change to use
a Google Doc instead of a Dropbox Paper. Google Docs have a better
workflow for suggesting edits, I believe, as well as a richer permissions
model.</p>
<p>Finally, I would try to draw a harder line in trying to get people to
&ldquo;own&rdquo; the document and suggest edits of their own. I think the
challenge of trying to neutrally represent someone else&rsquo;s point of
view is pretty powerful.</p>
<h3 id="concluding-remarks">Concluding remarks</h3>
<p>Conducting this exercise taught me some key lessons:</p>
<ul>
<li>We should experiment with the best way to <em>describe</em> the
back-and-forth (I found it better to put closely related points
together, for example, rather than grouping the arguments into &lsquo;pro
and con&rsquo;).</li>
<li>We should include not only the &ldquo;core facts&rdquo; but also the
&ldquo;narratives&rdquo; that weave those facts together.</li>
<li>We should do this summary process earlier and we should try to find
better ways to encourage participation.</li>
</ul>
<p>Overall, I felt very good about the idea of &ldquo;collaborative summary
documents&rdquo;. I think they are a clear improvement over the &ldquo;summary
comment&rdquo;, which was the prior state of the art.</p>
<p>If nothing else, the quality of the summary itself was greatly
improved by being a collaborative document. I felt like I had a pretty
good understanding of the question when I started, but getting
feedback from others on the things they felt I misunderstood, or just
the places where my writing was unclear, was very useful.</p>
<p>But of course my aims run larger. I hope that we can change how design
work <em>feels</em>, by encouraging all of us to deeply understand the design
space (and to understand what motivates the other side). My experiment
with this summary document left me feeling pretty convinced that it
could be a part of the solution.</p>
<h3 id="feedback">Feedback</h3>
<p>I&rsquo;ve created a discussion thread on <a href="https://internals.rust-lang.org/t/aic-adventures-in-consensus/9843">the internals forum</a>
where you can leave questions or comments. I&rsquo;ll definitely read them
and I will try to respond, though I often get overwhelmed<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, so
don&rsquo;t feel offended if I fail to do so.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>So many things, so little time.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/consensus" term="consensus" label="Consensus"/></entry><entry><title type="html">AiC: Adventures in consensus</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/04/19/aic-adventures-in-consensus/</id><published>2019-04-19T00:00:00+00:00</published><updated>2019-04-19T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In the talk I gave at Rust LATAM, <a href="https://nikomatsakis.github.io/rust-latam-2019/#92">I said</a> that the Rust project has
always emphasized <strong>finding the best solution, rather than winning the
argument</strong>. I think this is one of our deepest values.  It&rsquo;s also one
of the hardest for us to uphold.</p>
<p>Let&rsquo;s face it &ndash; when you&rsquo;re having a conversation, it&rsquo;s easy to get
attached to specific proposals. It&rsquo;s easy to have those proposals
change from &ldquo;Option A&rdquo; vs &ldquo;Option B&rdquo; to &ldquo;<strong>my</strong> option&rdquo; and &ldquo;<strong>their</strong>
option&rdquo;. Once this happens, it can be very hard to let <strong>them</strong> &ldquo;win&rdquo;
&ndash; even if you know that both options are quite reasonable.</p>
<p>This is a problem I&rsquo;ve been thinking a lot about lately. So I wanted
to start an irregular series of blog posts entitled &ldquo;Adventures in
consensus&rdquo;, or AiC for short. These posts are my way of exploring the
topic, and hopefully getting some feedback from all of you while I&rsquo;m
at it.</p>
<p>This first post dives into what a phrase like &ldquo;finding the best
solution&rdquo; even means (is there a best?) as well as the mechanics of
how one might go about deciding if you really have the &ldquo;best&rdquo;
solution. Along the way, we&rsquo;ll see a few places where I think our
current process could do better.</p>
<h3 id="beyond-tradeoffs">Beyond tradeoffs</h3>
<p>Part of the challenge here, of course, is that often there is no
&ldquo;best&rdquo; solution. Different solutions are better for different things.</p>
<p>This is the point where we often talk about <em>tradeoffs</em> &ndash; and
tradeoffs are part of it. But I&rsquo;m also wary of the term. It often
brings to mind a simplistic, zero-sum approach to the problem, where
we can all too easily decide that we have to pick A or B and leave it
at that.</p>
<p>But often when we are faced with two irreconcilable options, A or B,
there is a third one waiting in the wings. This third option often
turns on some hidden assumption that &ndash; once lifted &ndash; allows us to
find a better overall approach; one that satisfies <em>both</em> A <em>and</em> B.</p>
<h3 id="example-the--operator">Example: the <code>?</code> operator</h3>
<p>I think a good example is the <code>?</code> operator. When thinking about error
handling, we seem at first to face two irreconcilable options:</p>
<ul>
<li><strong>Explicit error codes</strong>, like in C, make it easy to see where
errors can occur, but they require tedious code at the call site of
each function to check for errors, when most of the time you just
want to propagate the error anyway. This seems to favor <strong>explicit
reasoning</strong> at the expense of <strong>the happy path</strong>.
<ul>
<li>(In C specifically, there is also the problem of it being easy to
forget to check for errors in the first place, but let&rsquo;s leave
that for now.)</li>
</ul>
</li>
<li><strong>Exceptions</strong> propagate the error implicitly, making the happy path
clean, but making it very hard to see where an error occurs.</li>
</ul>
<p>By now, a number of languages have seen that there is a third way &ndash; a
kind of &ldquo;explicit exception&rdquo;, where you make it very easy and
lightweight to propagate errors. In Rust, we do this via the <code>?</code>
operator (which desugars to a match). In Swift (<a href="https://docs.swift.org/swift-book/LanguageGuide/ErrorHandling.html">if I understand
correctly</a>) invoking a method that throws an exception is done
by adding a prefix, like <code>try foo()</code>. Joe Duffy describes a similar
mechanism in the Midori language in <a href="http://joeduffyblog.com/2016/02/07/the-error-model/">his epic article dissecting
Midori error handling</a>.</p>
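<p>To make the &ldquo;explicit exception&rdquo; idea concrete, here is a minimal sketch in Rust. The function names <code>sum_explicit</code> and <code>sum_with_question_mark</code> are made up for illustration, and the hand-written <code>match</code> is only an approximation of what <code>?</code> expands to, not the exact desugaring:</p>

```rust
use std::num::ParseIntError;

// Explicit propagation: every fallible call is followed by a hand-written
// match that either unwraps the value or returns the error to the caller.
fn sum_explicit(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = match a.trim().parse::<i32>() {
        Ok(v) => v,
        // Roughly what `?` does: convert the error and return early.
        Err(e) => return Err(From::from(e)),
    };
    let y = match b.trim().parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    Ok(x + y)
}

// The same logic with `?`: the happy path stays short, but each point
// where an error can escape is still visibly marked.
fn sum_with_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}
```

<p>Both functions behave the same way for the same inputs; the second simply trades the boilerplate for a one-character marker at each fallible call.</p>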
<p>Having used <code>?</code> for a long time now, I can definitely attest that (a)
it is very nice to be able to propagate errors in a lightweight
fashion and (b) having the explicit marker is very useful. Many times
I&rsquo;ve found bugs by scrutinizing the code for <code>?</code>, uncovering
surprising control flow I wasn&rsquo;t considering.</p>
<h3 id="there-is-no-free-lunch">There is no free lunch</h3>
<p>Of course, I&rsquo;d be remiss if I didn&rsquo;t point out that the discussion
over <code>?</code> was a really difficult one for our community. It was <a href="https://github.com/rust-lang/rfcs/pull/243">one of
the longest RFC threads in
history</a>, and one in which
the same arguments seemed to rise up again and again in a kind of
cycle. Moreover, we&rsquo;re <em>still</em> wrestling with what extensions (if any)
we might want to consider to the basic mechanism (e.g., <code>try</code> blocks,
perhaps <code>try</code> fns, etc).</p>
<p>I think part of the reason for this is that &ldquo;the third option ain&rsquo;t
free&rdquo;. In other words, the <code>?</code> operator did a nice job of sidestepping
the dichotomy that seemed to be presented by previous options (clear
but tedious vs elegant but hidden), but it did so by coming into
contact with other principles. In this case, the primary debate was
over whether to consider some mechanism like Haskell&rsquo;s <code>do</code> syntax for
working with monads.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>I think this is generally true. All the examples that I can come up
with where we&rsquo;ve found a third option generally come at <em>some</em> sort of
price &ndash; but often it&rsquo;s a price we&rsquo;re content to pay. In the case of
<code>?</code>, this means that we have some syntax in the language that is
dedicated to errors, when <em>perhaps</em> it could have been more general
(but that might itself have come with limitations, or meant more
complexity elsewhere).</p>
<h3 id="rusts-origin-story">Rust&rsquo;s origin story</h3>
<p>Overcoming tradeoffs is, in my view, the <strong>core purpose</strong> of Rust.
After all, the ur-tradeoff of them all is <strong>control vs safety</strong>:</p>
<ul>
<li>Control &ndash; let the programmer decide about memory layout,
threads, runtime.</li>
<li>Safety &ndash; avoid crashes.</li>
</ul>
<p>This choice used to be embodied by having to decide between using C++
(and gaining the control, and thus often performance) or a
garbage-collected language like Java (and sacrifice control, often at
the cost of performance). Deciding <a href="https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html">whether or not to use
threads</a>
was a similar choice between peril and performance.</p>
<p>Ownership and borrowing eliminated that tradeoff &ndash; but not for free!
They come with a steep learning curve, after all, and they impose some
limitations of their own. (Flattening the slope of that learning curve
&ndash; and extending the range of patterns that we accept &ndash; was of course
a major goal of the <a href="https://rust-lang.github.io/rfcs/2094-nll.html">non-lexical lifetimes</a> effort, and I think
will continue to be an area of focus for us going forward.)</p>
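<p>As a small illustration of the kind of pattern involved, the sketch below uses a hypothetical function, <code>push_successor</code>. A purely lexical borrow checker rejected code like this, because the shared borrow held in <code>last</code> was treated as lasting until the end of the enclosing block. With non-lexical lifetimes the borrow ends at its last use, so the later <code>push</code> is accepted:</p>

```rust
// Appends `last + 1` to the vector and returns the appended value,
// or returns None (leaving the vector untouched) if it is empty.
fn push_successor(values: &mut Vec<i32>) -> Option<i32> {
    // Shared borrow of `values` starts here.
    let last: &i32 = values.last()?;
    // Last use of `last`: under NLL the shared borrow ends after this line.
    let next = *last + 1;
    // Mutable borrow of `values`; fine because `last` is no longer live.
    values.push(next);
    Some(next)
}
```

<p>Nothing about the program&rsquo;s meaning changed between the two checkers; only the set of accepted programs grew.</p>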
<h3 id="tradeoffs-after-all--but-the-right-ones">Tradeoffs after all &ndash; but the right ones</h3>
<p>So, even though I maligned tradeoffs earlier as simplistic thinking,
perhaps in the end it <em>does</em> all come down to tradeoffs. Rust is
definitely a language for people who prefer to <a href="https://en.wiktionary.org/wiki/measure_twice_and_cut_once">measure twice and cut
once</a>, and &ndash; for such folks &ndash; learning ownership and borrowing
has proven to be worth the effort (and then some!). But this clearly
isn&rsquo;t the right choice for all people and all situations.</p>
<p>I guess then that the trick is being sure that you&rsquo;re <em>trading the
right things</em>. You will probably have to trade <em>something</em>, but it may
not be the things you&rsquo;re discussing right now.</p>
<h3 id="mapping-the-solution-space">Mapping the solution space</h3>
<p>When we talk about the RFC process, we always emphasize that the point
of RFC discussion <strong>is not to select the best answer</strong>; rather, the
point is to <strong>map the solution space</strong>. That is, to explore what the
possible tradeoffs are and to really look for alternatives.  This
mapping process also means exploring the ups and downs of the current
solutions on the table.</p>
<h3 id="what-does-mapping-the-solution-space-really-mean">What does mapping the solution space really mean?</h3>
<p>When you look at it, &ldquo;mapping the solution space&rdquo; is actually a really
complex task. There are a lot of pieces to it:</p>
<ul>
<li><strong>Identifying stakeholders:</strong> figuring out who are the people
affected by this change, for good or ill.</li>
<li><strong>Clarifying motivations:</strong> what exactly are we aiming to solve with
a given proposal? It&rsquo;s interesting how often this is left unstated
(and, I suspect, not fully understood). Often we have a general idea
of the problem, but we could sharpen it quite a bit. It&rsquo;s also very
useful to figure out which parts of the problem are most important.</li>
<li><strong>Finding the pros and cons of the current proposals:</strong> what works
well with each solution and what are its costs.</li>
<li><strong>Identifying new possibilities:</strong> finding new ways to solve the motivations.
Sometimes this may not solve the <em>complete</em> problem we set out to attack,
but only the most important part &ndash; and that can be a good thing, if it avoids
some of the downsides.</li>
<li><strong>Finding the hidden assumption(s):</strong> This is in some way the same as
identifying new possibilities, but I thought it was worth pulling
out separately.  There often comes a point in the design where you
feel like you are faced with two bad options &ndash; and then you realize
that <strong>one of the design constraints you took as inviolate isn&rsquo;t,
<em>really</em>, all that essential</strong>. Once you weaken that constraint, or
drop it entirely, suddenly the whole design falls into place.</li>
</ul>
<h3 id="our-current-process-mixes-all-of-these-goals">Our current process mixes all of these goals</h3>
<p>Looking at that list of tasks, is it any wonder that some RFC threads
go wrong? The current process doesn&rsquo;t really try to separate out these
various tasks in any way or even to really <strong>highlight</strong> them. We sort of
expect people to &ldquo;figure it out&rdquo; on their own.</p>
<p>Worse, I think the current process often <em>starts with a particular
solution</em>. This encourages people to react to <em>that solution</em>. The RFC
author, then, is naturally prone to be defensive and to defend their
proposal. We are right away kicking things off with an &ldquo;A or B&rdquo;
mindset, where ideas belong to people, rather than the process. I
think &lsquo;disarming&rsquo; the attachment of people to specific ideas, and
instead trying to focus everyone&rsquo;s attention on <strong>the problem space as
a whole</strong>, is crucial.</p>
<p>Now, I am not advocating for some kind of &ldquo;waterfall&rdquo; process
here. <strong>I don&rsquo;t think it&rsquo;s possible to cleanly separate each of the
goals above and handle them one at a time.</strong> It&rsquo;s always a bit messy
&ndash; you start with a fuzzy idea of the problem (and some stakeholders)
and you try to refine it. Then you take a stab at what a solution
might look like, which helps you to understand better the problem
itself, but which also starts to bring in more stakeholders. Figuring
out the pros and cons may spark new ideas. And so forth.</p>
<p>But just because we can&rsquo;t use waterfall doesn&rsquo;t mean we can&rsquo;t give
more structure. Exploring what that might mean is one of the things
I hope to do in subsequent blog posts.</p>
<h3 id="conclusion">Conclusion</h3>
<p>Ultimately, this post is about the importance of being <strong>thorough and
deliberate</strong> in our design efforts. If we truly want to find the best
design &ndash; well, I shouldn&rsquo;t say the best design. If we want to find
the <strong>right design for Rust</strong>, it&rsquo;s often going to take time. This is
because we need to take the time to elaborate on the implications of
the decisions we are making, and to give time for a &ldquo;third way&rdquo; to be
found.</p>
<p>But &ndash; lo &ndash; even here there is a tradeoff. We are trading away
<strong>time</strong>, it seems, for <strong>optimality</strong>. And this clearly isn&rsquo;t always
the right choice.  After all, <a href="http://wiki.c2.com/?RealArtistsShip">&ldquo;real artists ship&rdquo;</a>. Often, there
comes a point where further exploration yields increasingly small
improvements (&ldquo;diminishing returns&rdquo;).</p>
<p>As we explore ways to improve the design process, then, we should try
to ensure we are covering the whole design space, but we also have to
think about knowing when to stop and move on to the next thing.</p>
<h3 id="oh-one-last-thing">Oh, one last thing&hellip;</h3>
<p>Also, by the by, if you&rsquo;ve not already read aturon&rsquo;s 3-part series on
&ldquo;<a href="http://aturon.github.io/2018/05/25/listening-part-1/">listening</a> <a href="http://aturon.github.io/2018/06/02/listening-part-2/">and</a> <a href="http://aturon.github.io/2018/06/18/listening-part-3/">trust</a>&rdquo;, you should do so.</p>
<h3 id="feedback">Feedback</h3>
<p>I&rsquo;ve created a discussion thread on <a href="https://internals.rust-lang.org/t/aic-adventures-in-consensus/9843">the internals forum</a>
where you can leave questions or comments. I&rsquo;ll definitely read them
and I will try to respond, though I often get overwhelmed<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, so
don&rsquo;t feel offended if I fail to do so.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>If you&rsquo;d like to read more about the <code>?</code> decision, this <a href="https://github.com/rust-lang/rfcs/pull/243#issuecomment-172057844">summary comment</a> tried to cover the thread and lay out the reasoning behind the ultimate decision.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>So many things, so little time.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/consensus" term="consensus" label="Consensus"/></entry><entry><title type="html">More than coders</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/04/15/more-than-coders/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/04/15/more-than-coders/</id><published>2019-04-15T00:00:00+00:00</published><updated>2019-04-15T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Lately, the compiler team has been changing up the way that we work.
Our goal is to make it easier for people to track what we are doing
and &ndash; hopefully &ndash; get involved. This is an ongoing effort, but one
thing that has become clear immediately is this: the compiler team
needs more than coders.</p>
<p>Traditionally, when we&rsquo;ve thought about how to &ldquo;get involved&rdquo; in the
compiler team, we&rsquo;ve thought about it in terms of writing PRs. But
more and more I&rsquo;m thinking about all the <em>other</em> jobs that go into
maintaining the compiler. <strong>&ldquo;What kinds of jobs are these?&rdquo;, you&rsquo;re
asking.</strong> I think there are quite a few, but let me give a few
examples:</p>
<ul>
<li><strong>Running a meeting</strong> &ndash; pinging folks, walking through the agenda.</li>
<li><strong>Design documents and other documentation</strong> &ndash; describing how the
code works, even if you didn&rsquo;t write it yourself.</li>
<li><strong>Publicity</strong> &ndash; talking about what&rsquo;s going on, tweeting about
exciting progress, or helping to circulate calls for help. Think
<a href="https://twitter.com/steveklabnik/">steveklabnik</a>, but for rustc.</li>
<li>&hellip;and more! These are just the tip of the iceberg, in my opinion.</li>
</ul>
<p><strong>I think we need to surface these jobs more prominently and try to
actively recruit people to help us with them.</strong> Hence, this blog post.</p>
<h2 id="we-need-an-open-source-whenever">&ldquo;We need an open source whenever&rdquo;</h2>
<p>In <a href="https://nikomatsakis.github.io/rust-latam-2019/#1">my keynote at Rust LATAM</a>, I quoted quite liberally from an
excellent blog post by Jessica Lord, <a href="http://jlord.us/blog/osos-talk.html">&ldquo;Privilege, Community, and Open
Source&rdquo;</a>. There&rsquo;s one passage that keeps coming back to me:</p>
<blockquote>
<p>We also need an open source <em>whenever</em>. Not enough people can or
should be able to spare all of their time for open source work, and
appearing this way really hurts us.</p>
</blockquote>
<p>This passage resonates with me, but I also know it is not as simple as
she makes it sound. Creating a structure where people can meaningfully
contribute to a project with only small amounts of time takes a lot of
work. But it seems clear that the benefits could be huge.</p>
<p>I think looking to tasks beyond coding can be a big benefit
here. Every sort of task is different in terms of what it requires to
do it well &ndash; and I think the more <em>ways</em> we can create for people to
contribute, the more people will be <em>able</em> to contribute.</p>
<h2 id="the-context-working-groups">The context: working groups</h2>
<p>Let me back up and give a bit of context. Earlier, I mentioned that
the compiler has been changing up the way that we work, with the goal
of making it much easier to get involved in developing rustc. A big
part of that work has been introducing the idea of a <strong>working
group</strong>.</p>
<p>A <strong>working group</strong> is basically an (open-ended, dynamic) set of
people working towards a particular goal. These days, whenever the
compiler team kicks off a new project, we create an associated working
group, and we list that group (and its associated Zulip stream) on
<a href="https://github.com/rust-lang/compiler-team">the compiler-team repository</a>. There is also a <a href="https://github.com/rust-lang/compiler-team#meeting-calendar">central
calendar</a> that lists all the group meetings and so forth. This
makes it pretty easy to quickly see what&rsquo;s going on.</p>
<h2 id="working-groups-as-a-way-into-the-compiler">Working groups as a way into the compiler</h2>
<p>Working groups provide an ideal vector to get involved with the
compiler. For one thing, they give people a more approachable target
&ndash; you&rsquo;re not working on &ldquo;the entire compiler&rdquo;, you&rsquo;re working towards
a particular goal. Each of your PRs can then be building on a common
part of the code, making it easier to get started. Moreover, you&rsquo;re
working with a smaller group of people, many of whom are also just
starting out. This allows people to help one another and form a
community.</p>
<h2 id="running-a-working-group-is-a-big-job">Running a working group is a big job</h2>
<p>The thing is, running a working group can be quite a big job &ndash;
particularly a working group that aims to incorporate a lot of
contributors. Traditionally, we&rsquo;ve thought of a working group as
having a <strong>lead</strong> &ndash; maybe, at best, two leads &ndash; and a bunch of
participants, most of whom are being mentored:</p>
<pre tabindex="0"><code>           +-------------+
           | Lead(s)     |
           |             |
           +-------------+

  +--+  +--+  +--+  +--+  +--+  +--+
  |  |  |  |  |  |  |  |  |  |  |  |
  |  |  |  |  |  |  |  |  |  |  |  |
  |  |  |  |  |  |  |  |  |  |  |  |
  +--+  +--+  +--+  +--+  +--+  +--+
  
  |                                |
  +--------------------------------+
   (participants)
</code></pre><p>Now, if all these participants are all being mentored to write code,
that means that the set of jobs that fall on the leads is something
like this:</p>
<ul>
<li>Running the meeting</li>
<li>Taking and posting minutes from the meeting</li>
<li>Figuring out the technical design</li>
<li>Writing the big, complex PRs that are hard to mentor</li>
<li>Writing the design documents</li>
<li>Writing mentoring instructions</li>
<li>Writing summary blog posts and trying to call attention to what&rsquo;s going on</li>
<li>Synchronizing with the team at large to give status updates etc</li>
<li>Being a &ldquo;point of contact&rdquo; for questions</li>
<li>Helping contributors debug problems</li>
<li>Triaging bugs and ensuring that the most important ones are getting fixed</li>
<li>&hellip;</li>
</ul>
<p>Is it any wonder that the vast majority of working group leads are
full-time, paid employees? Or, alternatively, is it any wonder that
often many of those tasks just don&rsquo;t get done?</p>
<p>(Consider the NLL working group &ndash; there, we had both Felix and me
working as full-time leads, essentially. Even so, we had a hard time
writing out design documents, and there were never enough summary blog
posts.)</p>
<h2 id="running-a-working-group-is-really-a-lot-of-smaller-jobs">Running a working group is really a lot of smaller jobs</h2>
<p>The more I think about it, the more I think the flaw is in the way
we&rsquo;ve talked about a &ldquo;lead&rdquo;. Really, &ldquo;lead&rdquo; for us was mostly a kind
of shorthand for &ldquo;do whatever needs doing&rdquo;. I think we should be
trying to get more precise about what those things are, and then
we should be trying to split those roles out to more people.</p>
<p>For example, how awesome would it be if major efforts had some people
who were just trying to ensure that the design was <strong>documented</strong> &ndash;
working on <a href="https://rust-lang.github.io/rustc-guide/">rustc-guide</a> chapters, for example, showing the major
components and how they communicated. This is not easy work. It
requires a pretty detailed technical understanding. It does not,
however, really require <em>writing the PRs in question</em> &ndash; in fact,
ideally, it would be done by different people, which ensures that
there are multiple people who understand how the code works.</p>
<p>There will still be a need, I suspect, for some kind of &ldquo;lead&rdquo; who is
generally overseeing the effort. But, these days, I like to think of
it in a somewhat less&hellip; hierarchical fashion. Perhaps &ldquo;organizer&rdquo; is
the right term. I&rsquo;m not sure.</p>
<h2 id="each-job-is-different">Each job is different</h2>
<p>Going back to <a href="http://jlord.us/blog/osos-talk.html">Jessica Lord&rsquo;s post</a>, she continues:</p>
<blockquote>
<p>We need everything we can get and are thankful for all that you can
contribute whether it is two hours a week, one logo a year, or a
copy-edit twice a year.</p>
</blockquote>
<p>Looking over the list of tasks that are involved in running a
working-group, it&rsquo;s interesting how many of them have distinct time
profiles. Coding, for example, is a pretty intensive activity that can
easily take a kind of &ldquo;unbounded&rdquo; amount of time, which is something
not everyone has available. But consider the job of <strong>running a weekly
sync meeting</strong>.</p>
<p>Many working groups use short, weekly sync meetings to check up on
progress and to keep everything progressing. It&rsquo;s a good place for
newcomers to find tasks, or to triage new bugs and make sure they are
being addressed. One easy, and self-contained, task in a working group
might be to <strong>run the weekly meetings</strong>.  This could be as simple as
coming onto Zulip at the right time, pinging the right people, and
trying to walk through the status updates and take some
minutes. However, it might also get more complex &ndash; e.g., it might
involve doing some pre-triage to try and shape up the agenda.</p>
<p>But note that, however you do it, this task is relatively
time-contained &ndash; it occurs at a predictable point in the week. It
might be a way for someone to get involved who has a fixed hole in
their schedule, but can&rsquo;t afford the more open-ended, coding tasks.</p>
<h2 id="just-as-important-as-code">Just as important as code</h2>
<p>In my last quote from <a href="http://jlord.us/blog/osos-talk.html">Jessica Lord&rsquo;s post</a>, I left out the
last sentence from the paragraph.  Let me give you the paragraph in
full (emphasis mine):</p>
<blockquote>
<p>We need everything we can get and are thankful for all that you can
contribute whether it is two hours a week, one logo a year, or a
copy edit twice a year. <strong>You, too, are a first class open source
citizen.</strong></p>
</blockquote>
<p>I think this is a pretty key point. I think it&rsquo;s important that we
recognize that <strong>working on the compiler is more than coding</strong> &ndash; and
that we value those tasks &ndash; whether they be organizational tasks,
writing documentation, whatever &ndash; equally.</p>
<p>I am worried that if we had working groups where some people are
writing the code and there is somebody else who is &ldquo;only&rdquo; running the
meetings, or &ldquo;only&rdquo; triaging bugs, or &ldquo;only&rdquo; writing design docs, that
those people will feel like they are not &ldquo;real&rdquo; members of the working
group. But to my mind they are equally essential, if not more
essential. <strong>After all, it&rsquo;s a lot easier to find people who will
spend their free time writing PRs than it is to find people who will
help to organize a meeting.</strong></p>
<h2 id="growing-the-compiler-team">Growing the compiler team</h2>
<p>The point of this post, in case you missed it, is that <strong>I would like to grow
our conception of the compiler team beyond coders</strong>. I think we should be actively
recruiting folks with a lot of different skill sets and making them full members
of the compiler team:</p>
<ul>
<li>organizers and project managers</li>
<li>documentation authors</li>
<li>code evangelists</li>
</ul>
<p>I&rsquo;m not really sure what this full set of roles should be, but I know
that the compiler team cannot function without them.</p>
<h2 id="beyond-the-compiler-team">Beyond the compiler team</h2>
<p>One other note: I think that when we start going down this road, we&rsquo;ll
find that there is overlap between the &ldquo;compiler team&rdquo; and other teams
in the rust-lang org.  For example, the release team already does a
great job of tracking and triaging bugs and regressions to help ensure
the overall quality of the release. But perhaps the compiler team also
wants to do its own triaging. Will this lead to a &ldquo;turf war&rdquo;?
Personally, I don&rsquo;t really see the conflict here.</p>
<p>One of the beauties of being an open-source community is that we don&rsquo;t
need to form strict managerial hierarchies. We can have the same
people be members of <em>both</em> the release team <em>and</em> the compiler
team. As part of the release team, they would presumably be doing more
general triaging and so forth; as part of the compiler team, they
would be going deeper into rustc. But still, it&rsquo;s a good thing to pay
attention to. Maybe some things don&rsquo;t belong in the compiler-team
proper.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I don&rsquo;t quite have a <strong>call to action</strong> here, at least not yet. This
is still a WIP &ndash; we don&rsquo;t know quite the right way to think about
these non-coding roles. I think we&rsquo;re going to be figuring that out,
though, as we gain more experience with working groups.</p>
<p>I guess I <strong>can</strong> say this, though: <strong>If you are a project manager or
a tech writer</strong>, and you think you&rsquo;d like to get more deeply involved
with the compiler team, now&rsquo;s a good time. =) Start attending our
<a href="https://github.com/rust-lang/compiler-team/blob/master/about/steering-meeting.md">steering meetings</a>, or perhaps the weekly meetings
of the <a href="https://github.com/rust-lang/compiler-team/tree/master/working-groups/meta">meta working group</a>, or just ping me over on <a href="https://github.com/rust-lang/compiler-team/blob/master/about/chat-platform.md">the
rust-lang Zulip</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/governance" term="governance" label="Governance"/></entry><entry><title type="html">Async-await status report</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/03/01/async-await-status-report/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/03/01/async-await-status-report/</id><published>2019-03-01T00:00:00+00:00</published><updated>2019-03-01T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I wanted to post a quick update on the status of the async-await
effort. The short version is that we&rsquo;re in the <strong>home stretch</strong> for
some kind of stabilization, but there remain some significant
questions to overcome.</p>
<h2 id="announcing-the-implementation-working-group">Announcing the implementation working group</h2>
<p>As part of this push, I&rsquo;m happy to announce we&rsquo;ve formed a
<a href="https://github.com/rust-lang/compiler-team/blob/master/README.md"><strong>async-await implementation working group</strong></a>. This working group
is part of the whole async-await effort, but focused on the
implementation, and is part of the compiler team. If you&rsquo;d like to
help get async-await over the finish line, we&rsquo;ve got a list of issues
where we&rsquo;d definitely like help (read on).</p>
<p><strong>If you are interested in taking part, we have an &ldquo;office hours&rdquo;
scheduled for Tuesday (see the <a href="https://github.com/rust-lang/compiler-team#meeting-calendar">compiler team calendar</a>)</strong> &ndash; if you
can show up then on <a href="https://github.com/rust-lang/compiler-team/blob/master/about/chat-platform.md">Zulip</a>, it&rsquo;d be ideal! (But if not, just pop in any
time.)</p>
<h2 id="who-are-we-stabilizing-for">Who are we stabilizing for?</h2>
<p>I mentioned that there remain significant questions to overcome before
stabilization. I think the most fundamental question of all is this one:
<strong>Who is the audience for this stabilization?</strong></p>
<p>The reason that question is so important is because it determines how
to weigh some of the issues that currently exist. If the point of the
stabilization is to start promoting async-await as something for
<strong>widespread use</strong>, then there are <strong>issues that we probably ought to
resolve first</strong> &ndash; most notably, the <code>await</code> syntax, but also other
things.</p>
<p>If, however, the point of stabilization is to let <strong>&lsquo;early adopters&rsquo;</strong>
start playing with it more, then <strong>we might be more tolerant of
problems</strong>, so long as there are no backwards compatibility concerns.</p>
<p>My take is that either of these is a perfectly fine answer. But <strong>if
the answer is that we are trying to unblock early adopters, then we
want to be clear in our messaging</strong>, so that people don&rsquo;t get turned
off when they encounter some of the bugs below.</p>
<p>OK, with that in place, let&rsquo;s look in a bit more detail.</p>
<h2 id="implementation-issues">Implementation issues</h2>
<p>One of the first things that we did in setting up the implementation
working group is to do a <a href="https://paper.dropbox.com/doc/Async-Await-Triage-2019.02.20--AYdZ6puVcqdJ0Jnu37FRiisiAg-ZyzRUbTENfdgFjCRja2vm">complete triage of all existing async-await
issues</a>. From this, we found that there was one very
firm blocker, <a href="https://github.com/rust-lang/rust/issues/54716">#54716</a>. This issue has to do with the timing of drops in
an async fn, specifically the drop order for parameters that are not
used in the fn body.  We want to be sure this behaves analogously with
regular functions. This is a blocker to stabilization because it would
change the semantics of stable code for us to fix it later.</p>
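<p>For reference, here is a sketch of what a regular function does with parameters that its body never touches. The <code>Noisy</code> type, the <code>Log</code> alias, and the function names are hypothetical helpers made up for this illustration. The async fn case from #54716 is not shown, only the synchronous behavior it is meant to mirror:</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical shared log used to observe when things happen.
type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical helper: records its name in the log when it is dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Neither `_first` nor `_second` is used in the body, yet both are bound
// to the function's scope and stay alive until the function returns.
fn ignores_its_arguments(_first: Noisy, _second: Noisy, log: &Log) {
    log.borrow_mut().push("body");
}

// Runs the function above and returns the order of recorded events.
fn observed_drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let a = Noisy { name: "first", log: Rc::clone(&log) };
    let b = Noisy { name: "second", log: Rc::clone(&log) };
    ignores_its_arguments(a, b, &log);
    let order = log.borrow().clone();
    order
}
```

<p>Calling <code>observed_drop_order()</code> records <code>body</code>, then <code>second</code>, then <code>first</code>: the unused parameters outlive the body and are dropped in reverse declaration order when the function exits. That is the behavior an async fn would need to reproduce.</p>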
<p>We also uncovered a number of <strong>major ergonomic problems</strong>. In a
follow-up meeting (<a href="https://youtu.be/xe2_whJWBC0">available on YouTube</a>), cramertj and I
also drew up plans for <strong>fixing these bugs</strong>, though these plans have
not yet been written up as mentoring instructions. These issues
all center on async fns that take borrowed references as
arguments &ndash; for example, the <a href="https://github.com/rust-lang/rust/issues/56238">async fn syntax today doesn&rsquo;t support
more than one lifetime in the
arguments</a>, so
something like <code>async fn foo(x: &amp;u32, y: &amp;u32)</code> doesn&rsquo;t work.</p>
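<p>As a rough sketch of the kind of workaround this implies, one option is to write an ordinary fn that ties both references to a single lifetime and returns an <code>async move</code> block. The names <code>add_refs</code>, <code>NoopWake</code>, and <code>block_on</code> are made up for this example, and the tiny executor exists only so the snippet can run without extra crates; it is not how a real runtime works. Current Rust accepts the plain <code>async fn</code> form directly, so this is purely illustrative of the limitation at the time.</p>

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Workaround sketch: one shared lifetime `'a` for both arguments and an
// explicit `impl Future` return type instead of `async fn`.
fn add_refs<'a>(x: &'a u32, y: &'a u32) -> impl Future<Output = u32> + 'a {
    async move { *x + *y }
}

// A waker that does nothing; good enough for futures that never suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, only suitable for this demonstration.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>With <code>let (a, b) = (40u32, 2u32);</code>, the call <code>block_on(add_refs(&amp;a, &amp;b))</code> evaluates to <code>42</code>.</p>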
<p>Whether these ergonomic problems are <strong>blockers</strong>, however, depends a
bit on your perspective: as @cramertj says, a number of folks at
Google are using async-await today productively despite these
limitations, but you must know the appropriate workarounds and so
forth. <strong>This is where the question of our audience comes into play.</strong>
My take is that these issues are blockers for &ldquo;async fn&rdquo; being ready
for &ldquo;general use&rdquo;, but probably not for &ldquo;early adopters&rdquo;.</p>
<p>Another big concern for me personally is the <strong>maintenance story</strong>.
Thanks to the hard work of Zoxc and cramertj, we&rsquo;ve been able to
stand up a functional async-await implementation very fast, which is
awesome. But we don&rsquo;t really have a large pool of active contributors
working on the async-await implementation who can help to fix issues
as we find them, and this seems bad.</p>
<h2 id="the-syntax-question">The syntax question</h2>
<p>Finally, we come to the question of the <code>await</code> syntax. At the All
Hands, we had a number of conversations on this topic, and it became
clear that <strong>we do not presently have consensus for any one syntax</strong>.
We did a <strong>lot</strong> of exploration here, however, and enumerated a number
of subtle arguments in favor of each option. At this moment,
@withoutboats is busily trying to write up that exploration into a
document.</p>
<p>Before saying anything else, it&rsquo;s worth pointing out that we don&rsquo;t
actually <strong>have</strong> to resolve the <code>await</code> syntax in order to stabilize
async-await. We could stabilize the <code>await!(...)</code> macro syntax for the
time being, and return to the issue later. This would unblock &ldquo;early
adopters&rdquo;, but doesn&rsquo;t seem like a satisfying answer if our target is
the &ldquo;general public&rdquo;. If we were to do this, we&rsquo;d be drawing on the
precedent of <code>try!</code>, where we first adopted a macro and later moved
that support to native syntax.</p>
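<p>For readers who never used <code>try!</code>, the sketch below shows the shape of that precedent: a macro that does early-return error propagation, next to the <code>?</code> operator that later replaced it. The macro here is named <code>my_try!</code> (a hypothetical stand-in, since <code>try</code> is now a reserved word), and the two parsing functions are invented for the comparison.</p>

```rust
use std::num::ParseIntError;

// Stand-in for the old `try!` macro: unwrap `Ok`, or return early on `Err`.
macro_rules! my_try {
    ($expr:expr) => {
        match $expr {
            Ok(value) => value,
            Err(err) => return Err(From::from(err)),
        }
    };
}

// Macro-based propagation, in the style used before native syntax existed.
fn parse_sum_macro(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = my_try!(a.parse::<i32>());
    let y = my_try!(b.parse::<i32>());
    Ok(x + y)
}

// The same logic written with the native `?` operator.
fn parse_sum_operator(a: &str, b: &str) -> Result<i32, ParseIntError> {
    Ok(a.parse::<i32>()? + b.parse::<i32>()?)
}
```

<p>Both functions behave identically; the analogy is that <code>await!(...)</code> could play the role of the macro while the final syntax is still being decided.</p>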
<p>That said, we do <strong>eventually</strong> want to pick another syntax, so it&rsquo;s
worth thinking about how we are going to do that. As I wrote, the
first step is to complete an overall summary that tries to describe
the options on the table and some of the criteria that we can use to
choose between them. Once that is available, we will need to settle on
next steps.</p>
<h2 id="resolving-hard-questions">Resolving hard questions</h2>
<p>I am looking at the syntax question as a kind of opportunity &ndash; one of
the things that we as a community frequently have to do is to find a
way to <strong>resolve really hard questions without a clear answer</strong>. The
tools that we have for doing this at the moment are really fairly
crude: we use discussion threads and manual summary
comments. Sometimes, this works well. Sometimes, amazingly well. But
other times, it can be a real drain.</p>
<p>I would like to see us trying to resolve this sort of issue in other
ways. I&rsquo;ll be honest and say that I don&rsquo;t entirely know what those
are, <strong>but I know they are not open discussion threads</strong>. For example,
I&rsquo;ve found that the #rust2019 blog posts have been an incredibly
effective way to have an open conversation about priorities without
the usual rancor and back-and-forth. I&rsquo;ve been very inspired by
systems like <a href="https://www.technologyreview.com/s/611816/the-simple-but-ingenious-system-taiwan-uses-to-crowdsource-its-laws/">vTaiwan</a>, which enable a lot of public input, but in a
structured and collaborative form, rather than an &ldquo;antagonistic&rdquo;
one. Similarly, I would like to see us perhaps consider running more
<em>experiments</em> to test hypotheses about learnability or other factors
(but this is something I would approach with great caution, as I think
designing good experiments is very hard).</p>
<p>Anyway, this is really a topic for a post of its own. In this
particular case, I hope that we find that enumerating in detail the
arguments for each side leads us to a clear conclusion, perhaps some
kind of &ldquo;third way&rdquo; that we haven&rsquo;t seen yet. But, thinking ahead,
it&rsquo;d be nice to find ways to have these conversations that take us to
that &ldquo;third way&rdquo; faster.</p>
<h2 id="closing-notes">Closing notes</h2>
<p>As someone who has not been closely following async-await thus far,
I&rsquo;m super excited by all I see. The feature has come a ridiculously
long way, and the remaining blockers all seem like things we can
overcome. async await is coming: I can&rsquo;t wait to see what people build
with it.</p>
<p><a href="https://internals.rust-lang.org/t/async-foundations-working-group-status/9540/2?u=nikomatsakis">Cross-posted to internals here.</a></p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Rust lang team working groups</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/02/22/rust-lang-team-working-groups/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/02/22/rust-lang-team-working-groups/</id><published>2019-02-22T00:00:00+00:00</published><updated>2019-02-22T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Now that the Rust 2018 edition has shipped, the language design team
has been thinking a lot about what to do in 2019 and over the next
few years. I think we&rsquo;ve got a lot of exciting stuff on the horizon,
and I wanted to write about it.</p>
<h2 id="theme-for-this-edition">Theme for this edition</h2>
<p>In 2015, our overall theme was <strong>stability</strong>. For the 2018 Edition, we adopted
<strong>productivity</strong>. For Rust 2021<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, we are thinking of <strong>maturity</strong> as the theme.
Our goal is to finish up a number of in-flight features &ndash; such as specialization,
generic associated types, and const generics &ndash; that have emerged as key enablers
for future work. In tandem, we aim to start improving our reference material,
both through continuing the great work that&rsquo;s been done on the Rust reference
but also through more specialized efforts like the Grammar and Unsafe Code Guidelines
working groups.</p>
<h2 id="working-groups">Working groups</h2>
<p>Actually, the thing I&rsquo;m most excited about has nothing to do with the
language at all, but rather a change to how we operate. We are planning
to start focusing our operations on a series of <strong>lang team working groups</strong>.
Each working group is focusing on a specific goal. This can be as narrow
as a single RFC, or it might be a family of related RFCs (&ldquo;async code&rdquo;, &ldquo;FFI&rdquo;).</p>
<p>The plan is to repurpose our weekly meeting. Each week we will do some amount
of triage, but also check in with one working group. In the days leading up to the meeting,
<strong>the WG will post a written report describing the agenda</strong>: this report should
review what happened since the last chat, discuss thorny questions, help assess priorities, and
plan the upcoming roadmap. <strong>These meetings will be
recorded and open to anyone who wants to attend.</strong> Our hope in particular is that
active working group participants will join the meeting.</p>
<p>Finally, as part of this move, we are creating a <a href="https://github.com/rust-lang/lang-team/">lang team repository</a> which will serve as the &ldquo;home&rdquo; for the lang team. It&rsquo;ll describe our process,
list the active working groups, and also show the ideas that are on the &ldquo;shortlist&rdquo; &ndash; basically,
things we expect to start doing once we wrap some of our ongoing work. The repository will
also have advice for how to get involved.</p>
<h2 id="initial-set-of-active-working-groups">Initial set of active working groups</h2>
<p>We&rsquo;ve also outlined what we expect to be our initial set of active working groups.
This isn&rsquo;t a final list: we might add a thing or two, or take something away. The list
more or less maps to the &ldquo;high priority&rdquo; endeavors that are already in progress.</p>
<p>For each working group, we also have a rough idea for who the &ldquo;leads&rdquo; will be. The
leads of a working group are those helping to keep it organized and functional. Note that
some leads are not members of the lang team. In fact, helping to co-lead a working group
is a great way to get involved with language design, and also a good stepping stone to full team
membership if desired.</p>
<ul>
<li><strong>Traits working group:</strong>
<ul>
<li>Focused on working out remaining design details of specialization, GATs,
<code>impl Trait</code>, and other trait-focused features.</li>
<li>Working closely with the compiler traits working group on implementation.</li>
<li>Likely leads: aturon, nmatsakis, centril</li>
</ul>
</li>
<li><strong>Grammar working group:</strong>
<ul>
<li>Focused on developing a canonical grammar, following roughly the process
laid out in <a href="https://rust-lang.github.io/rfcs/1331-grammar-is-canonical.html">RFC 1331</a>.</li>
<li>Likely leads: qmx, centril, eddyb</li>
</ul>
</li>
<li><strong>Async: Foundations</strong>
<ul>
<li>Focused on core language features like async-await or the <code>Future</code> trait
that enable async I/O.
<ul>
<li>Distinct from the &ldquo;Async: Ecosystem&rdquo; domain working group, which will focus on
bolstering the ecosystem for async code through new crates and documentation.</li>
</ul>
</li>
<li>Likely leads: cramertj, boats</li>
</ul>
</li>
<li><strong>Unsafe code guidelines</strong>
<ul>
<li>Focused on developing rules for unsafe code: what is allowed, what is not.</li>
<li>Likely leads: avacadavara, nikomatsakis, pnkfelix</li>
</ul>
</li>
<li><strong>Foreign function interface</strong>
<ul>
<li>Focused on ensuring that Rust and C programs can seamlessly and ergonomically
interact. The goal is to permit Rust code to call or be called by any C function
and handle any C data structure, as well as all common systems code scenarios
and supporting inline assembly.</li>
<li>Likely leads: joshtriplett</li>
</ul>
</li>
</ul>
<h2 id="bootstrapping-the-working-groups">Bootstrapping the working groups</h2>
<p>Over the next few weeks, we expect to be &ldquo;bootstrapping&rdquo; these working groups.
(In some cases, like grammar and the unsafe code guidelines, these groups
are already quite active, but in others they are not or have not been
formally organized.) For each group, we&rsquo;ll be putting out a call to get involved,
and trying to draw up an initial roadmap laying out where we are now and what the
next few steps will be. <strong>If something on that list looks like something you&rsquo;d
like to help with, stay tuned!</strong></p>
<h2 id="looking-to-2019-and-beyond">Looking to 2019 and beyond</h2>
<p>The set of roadmaps listed there aren&rsquo;t meant to be an exhaustive list of the
things we plan to do. Rather, they are meant to be a starting point: these are
largely the activities we are currently doing, and we plan to focus on those and
see them to completion (though the FFI working group is something of a new focus).</p>
<p>The idea is that, as those working groups wind down and bandwidth becomes available,
we will turn our focus to new things. To that end, we aim to draw up a shortlist
and post it on the website, so that you have some idea of the range of things we are considering
for the future. Note that the mere presence of an idea on the shortlist is not a guarantee
that it will come to pass: it may be that in working through the proposed idea, we decide
we don&rsquo;t want it, and so forth.</p>
<h2 id="conclusion">Conclusion</h2>
<p>2019 is going to be a big year for the lang team &ndash; not only because of the work
we plan to do, but because of the way we plan to do it. I&rsquo;m really looking forward
to it, and I hope to see you all soon at a WG meeting!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Assuming we do a Rust 2021 edition, which I expect we will.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Salsa: Incremental recompilation</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/01/29/salsa-incremental-recompilation/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/01/29/salsa-incremental-recompilation/</id><published>2019-01-29T00:00:00+00:00</published><updated>2019-01-29T00:00:00+00:00</updated><content type="html"><![CDATA[<p>So for the last couple of months or so, I&rsquo;ve been hacking in my spare
time on this library named
<a href="https://github.com/salsa-rs/salsa"><strong>salsa</strong></a>, along with a <a href="https://github.com/salsa-rs/salsa/graphs/contributors">number
of awesome other
folks</a>. Salsa
basically extracts the incremental recompilation techniques that we
built for rustc into a general-purpose framework that can be used by
other programs. Salsa is developing quickly: with the publishing of
v0.10.0, we saw a big step up in the overall ergonomics, and I think
the current interface is starting to feel very nice.</p>
<p>Salsa is in use by a number of other projects. For example, matklad&rsquo;s
<a href="https://github.com/rust-analyzer/rust-analyzer/">rust-analyzer</a>, a
nascent Rust IDE, is using salsa, as is the
<a href="https://github.com/lark-exploration/lark">Lark</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>
compiler. Notably, <strong>rustc does not</strong> &ndash; it still uses its own
incremental engine, which has some pros and cons compared to
salsa.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>If you&rsquo;d like to learn more about Salsa, you can check out the <a href="https://github.com/salsa-rs/salsa/blob/master/examples/hello_world/main.rs">Hello
World
example</a> &ndash; but, even better, you can check out two videos that I just recorded:</p>
<ul>
<li><a href="https://youtu.be/_muY4HjSqVw">How Salsa Works</a>, which gives
a high-level introduction to the key concepts involved and shows how to use salsa;</li>
<li><a href="https://www.youtube.com/watch?v=i_IhACacPRY">Salsa In More Depth</a>, which really digs
into the incremental algorithm and explains &ndash; at a high-level &ndash; how Salsa is implemented.
<ul>
<li>Thanks to Jonathan Turner for helping me to make this one!</li>
</ul>
</li>
</ul>
<p>If you&rsquo;re interested in salsa, please jump on to our Zulip instance at
<a href="https://salsa.zulipchat.com/">salsa.zulipchat.com</a>. It&rsquo;s a really fun
project to hack on, and we&rsquo;re definitely still looking for people to
help out with the implementation and the design. Over the next few
weeks, I expect to be outlining a &ldquo;path to 1.0&rdquo; with a number of
features that we need to push over the finish line.</p>
<h1 id="footnotes">Footnotes</h1>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>&hellip;worthy of a post of its own, but never mind.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I would like to eventually port rustc to salsa, but it&rsquo;s not a direct goal.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/salsa" term="salsa" label="Salsa"/></entry><entry><title type="html">Polonius and the case of the hereditary harrop predicate</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/01/21/hereditary-harrop-region-constraints/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/01/21/hereditary-harrop-region-constraints/</id><published>2019-01-21T00:00:00+00:00</published><updated>2019-01-21T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In my <a href="https://smallcultfollowing.com/babysteps/
/blog/2019/01/17/polonius-and-region-errors/">previous post</a> about Polonius and subregion obligations, I
mentioned that there needs to be a follow-up to deal with
higher-ranked subregions. This post digs a bit more into what the
<em>problem</em> is in the first place and sketches out the general solution
I have in mind, but doesn&rsquo;t give any concrete algorithms for it.</p>
<h3 id="the-subset-relation-in-polonius-is-not-enough">The subset relation in Polonius is not enough</h3>
<p>In my original post on Polonius, I assumed that when we computed a
subtype relation <code>T1 &lt;: T2</code> between two types, the result was either a
hard error or a set of <code>subset</code> relations between various regions.
So, for example, if we had a subtype relation between two references:</p>
<pre tabindex="0"><code>&amp;&#39;a u32 &lt;: &amp;&#39;b u32
</code></pre><p>the result would be a <code>subset</code> relation <code>'a: 'b</code> (or, &ldquo;<code>'a</code> contains a
subset of the loans in <code>'b</code>&rdquo;).</p>
<p>For a more complex case, consider the relationship of two fn types:</p>
<pre tabindex="0"><code>fn(&amp;&#39;a u32) &lt;: fn(&amp;&#39;b u32)
// ^^^^^^^     ^^^^^^^^^^
// |           A fn expecting a `&amp;&#39;b u32` as argument.
// |
// A fn expecting a `&amp;&#39;a u32` as argument.
</code></pre><p>If we imagine that we have some variable <code>f</code> of type <code>fn(&amp;'a u32)</code> &ndash;
that is, a fn that can be called with a <code>'a</code> reference &ndash; then this
subtype relation is saying that <code>f</code> can be given the type <code>fn(&amp;'b u32)</code> &ndash; that is, a fn that can be called with a <code>'b</code> reference.  That
is fine so long as that <code>'b</code> reference can be used as a <code>'a</code>
reference: that is, <code>&amp;'b u32 &lt;: &amp;'a u32</code>. So, we can say that the two
fn types are subtypes so long as <code>'b: 'a</code> (note that the order is
reversed from the first example; this is because fn types are
<em>contravariant</em> in their argument type).</p>
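<p>The same relationship can be checked against today&rsquo;s compiler. This is a minimal sketch; the function names <code>widen</code> and <code>double</code> are invented for the example, and a <code>u32</code> return type is added only so there is something to observe.</p>

```rust
// Converting `fn(&'a u32)` into `fn(&'b u32)` is accepted only because of the
// declared bound `'b: 'a`; remove the bound and the function is rejected.
fn widen<'a, 'b: 'a>(f: fn(&'a u32) -> u32) -> fn(&'b u32) -> u32 {
    f
}

fn double(x: &u32) -> u32 {
    *x * 2
}
```

<p>Here <code>widen(double)</code> yields a fn pointer that can be called as usual, e.g. <code>widen(double)(&amp;21)</code> returns <code>42</code>.</p>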
<p>Unfortunately, this structure isn&rsquo;t flexible enough to accommodate a
subtyping question involving higher-ranked types. Consider a subtype
relation like this:</p>
<pre tabindex="0"><code>fn(&amp;&#39;a u32) &lt;: for&lt;&#39;b&gt; fn(&amp;&#39;b u32)
//             ^^^^^^^
//             Unlike before, the supertype
//             expects a reference with *any*
//             lifetime as argument.
</code></pre><p>What subtype relation should come from this? We can&rsquo;t say <code>'b: 'a</code> as
before, because the lifetime <code>'b</code> isn&rsquo;t some specific region &ndash;
rather, the supertype says that the function has to accept a reference
with <em>any</em> lifetime <code>'b</code>. In fact, this subtyping relation should
ultimately yield an error.</p>
<h3 id="richer-constraints">Richer constraints</h3>
<p>To express the constraints that arise from higher-ranked subtyping
(and trait matching), we need a richer set of constraints than just
subset. In fact, if you tease it all out, we need something more like
this:</p>
<pre tabindex="0"><code>Constraint = Subset
           | Constraint, Constraint // and
           | forall&lt;R1&gt; { Constraint }
           | exists&lt;R1&gt; { Constraint }

Subset = R1: R2  
</code></pre><p>Now we can say that</p>
<pre tabindex="0"><code>fn(&amp;&#39;a u32) &lt;: for&lt;&#39;b&gt; fn(&amp;&#39;b u32)
</code></pre><p>holds if the constraint <code>forall&lt;'b&gt; { 'b: 'a }</code> holds, which implies
that <code>'a</code> has to contain all possible loans. This isn&rsquo;t possible, and
so we would treat this as an error.</p>
<p>Interestingly, if we reverse the order of the two types:</p>
<pre tabindex="0"><code>for&lt;&#39;b&gt; fn(&amp;&#39;b u32) &lt;: fn(&amp;&#39;a u32)
</code></pre><p>we get the constraint <code>exists&lt;'b&gt; { 'a: 'b }</code> (<code>for</code> binders on the
<em>subtype</em> side are instantiated with &ldquo;there exists&rdquo;, not &ldquo;for
all&rdquo;). That is, the region <code>'a</code> must be a subset of some possible set
of loans <code>'b</code>. This constraint is trivially solvable: <code>'b</code> could
always be exactly <code>'a</code> itself.</p>
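<p>Both directions can be tried out in ordinary Rust. In this sketch, <code>instantiate</code> and <code>double</code> are hypothetical names, and the rejected direction is left as a comment so the snippet still compiles.</p>

```rust
// Subtype side is higher-ranked: `'b` gets instantiated with a suitable
// region (here it can simply be `'a`), so this conversion is accepted.
fn instantiate<'a>(f: for<'b> fn(&'b u32) -> u32) -> fn(&'a u32) -> u32 {
    f
}

// The opposite direction corresponds to `forall<'b> { 'b: 'a }` and is
// rejected by the compiler if uncommented:
//
// fn generalize<'a>(f: fn(&'a u32) -> u32) -> for<'b> fn(&'b u32) -> u32 {
//     f
// }

fn double(x: &u32) -> u32 {
    *x * 2
}
```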
<h3 id="the-role-of-free-vs-bound-regions">The role of free vs bound regions</h3>
<p>As one final example, consider what happens here, where we added a
return type and another region (<code>'c</code>):</p>
<pre tabindex="0"><code>for&lt;&#39;b&gt; fn(&amp;&#39;b u32) -&gt; &amp;&#39;b u32
           &lt;:
        fn(&amp;&#39;a u32) -&gt; &amp;&#39;c u32
</code></pre><p>This gives rise to the following constraint:</p>
<pre tabindex="0"><code>exists&lt;&#39;b&gt; {
  &#39;a: &#39;b, // from relating the parameter types
  &#39;b: &#39;c, // from relating the return types
}
</code></pre><p>Here, the constraint is solvable, but only if <code>'a: 'c</code>. Therefore, if
we think back to Polonius with its simple &ldquo;subset&rdquo; relations, we can
effectively <em>reduce</em> this &ldquo;rich&rdquo; constraint to the subset relation <code>'a: 'c</code>.</p>
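<p>This reduction is visible in surface Rust too. In the sketch below (with made-up names <code>relate</code> and <code>same</code>), the only thing the caller has to promise is the relation between the two free regions; the bound region <code>'b</code> never appears in the where-clauses.</p>

```rust
// `'b` is bound by the `for` binder; `'a` and `'c` are free. The conversion
// type-checks because of the single declared relation `'a: 'c`.
fn relate<'a: 'c, 'c>(
    f: for<'b> fn(&'b u32) -> &'b u32,
) -> fn(&'a u32) -> &'c u32 {
    f
}

fn same(x: &u32) -> &u32 {
    x
}
```

<p>Dropping the <code>'a: 'c</code> bound makes the compiler reject <code>relate</code>, matching the claim that the rich constraint is solvable only under that relation.</p>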
<p>To do this reduction, we draw a distinction between the <em>bound</em> and
the <em>free</em> regions. <em>Bound</em> regions are those that are bound within a
<code>forall</code> or <code>exists</code> quantifier (e.g., <code>'b</code>), and <em>free</em> regions
are those that are not. When we are reducing, we only care about two things:</p>
<ul>
<li><strong>Do we have something <em>unsatisfiable</em> about the constraint?</strong> This
often happens when a bound &ldquo;forall&rdquo; region is on the right-hand
side.
<ul>
<li>We saw this with <code>forall&lt;'b&gt; { 'b: 'a }</code>.</li>
<li>Another example is <code>forall&lt;'x, 'y&gt; { 'x: 'y }</code>.</li>
<li>Reducing something unsatisfiable is obviously an error.</li>
</ul>
</li>
<li><strong>What are the effects on the free regions?</strong> Other times, bound
regions effectively act as a &ldquo;go-between&rdquo;, creating subset relations
between the free regions.
<ul>
<li>We saw this with <code>exists&lt;'b&gt; { 'a: 'b, 'b: 'c }</code>.</li>
<li>Another, stranger example might be <code>forall&lt;'b&gt; { 'a: 'b }</code>: here,
this is satisfiable, but only if <code>'a: 'static</code>. This is true
because <code>'static: 'b</code> is implicitly true (<code>'static: R</code> is true for
any region R).
<ul>
<li>In Polonius terms, <code>'static</code> represents an &ldquo;empty set&rdquo; of loans,
so this effectively means that <code>'a</code> can be a subset of any
region <code>'b</code> by being the empty set.</li>
</ul>
</li>
</ul>
</li>
</ul>
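<p>For the purely existential case, the &ldquo;go-between&rdquo; idea can be sketched very naively: take the transitive closure of the subset edges and keep only the pairs that mention free regions. To be clear, this is my own toy illustration under that assumption, not an algorithm from this post or from Polonius; the function <code>reduce_exists</code> and the <code>Edge</code> alias are hypothetical, and <code>forall</code> is not handled at all.</p>

```rust
use std::collections::BTreeSet;

// `(sub, sup)` stands for the constraint `sub: sup`.
type Edge = (&'static str, &'static str);

// Reduce `exists<bound...> { edges }` to relations between free regions only.
fn reduce_exists(bound: &[&str], edges: &[Edge]) -> BTreeSet<Edge> {
    let mut closure: BTreeSet<Edge> = edges.iter().copied().collect();

    // Naive transitive closure: from `a: b` and `b: d`, derive `a: d`.
    loop {
        let mut added = Vec::new();
        for &(a, b) in &closure {
            for &(c, d) in &closure {
                if b == c && a != d && !closure.contains(&(a, d)) {
                    added.push((a, d));
                }
            }
        }
        if added.is_empty() {
            break;
        }
        closure.extend(added);
    }

    // Bound regions were only intermediaries; drop every edge that names one.
    closure
        .into_iter()
        .filter(|(a, b)| !bound.contains(a) && !bound.contains(b))
        .collect()
}
```

<p>Given the bound region <code>'b</code> and the edges <code>'a: 'b</code> and <code>'b: 'c</code>, the only relation left over is <code>'a: 'c</code>.</p>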
<h3 id="wait-those-richer-constraints-look-familiar">Wait, those &ldquo;richer&rdquo; constraints look familiar&hellip;</h3>
<p>The &ldquo;richer&rdquo; constraints I mentioned in the previous section basically
arise from taking a base predicate (<code>R1: R2</code>) and &ldquo;adding in&rdquo; richer
constraint forms like &ldquo;for all&rdquo; and &ldquo;there exists&rdquo;. This may sound
familiar &ndash; if you recall <a href="https://smallcultfollowing.com/babysteps/
/blog/2017/01/26/lowering-rust-traits-to-logic/#type-checking-generic-functions-beyond-horn-clauses">my very first Chalk post</a>, I talked
about the <a href="https://smallcultfollowing.com/babysteps/
/blog/2017/01/26/lowering-rust-traits-to-logic/#type-checking-generic-functions-beyond-horn-clauses">need to go beyond Prolog&rsquo;s core &ldquo;Horn clauses&rdquo; and to
support &ldquo;Hereditary Harrop&rdquo; (HH) predicates</a>. The basic idea was
to extend simple Horn clauses with &ldquo;for all&rdquo; and &ldquo;there exists&rdquo;, along
with a few other things.</p>
<p>In fact, &ldquo;hereditary harrop predicates&rdquo; are a kind of generic
structure that we can apply to any base set of predicates. So, if we
wanted, we might say that the region constraints we are creating can
be extended to the full hereditary harrop form, which would look like
so:</p>
<pre tabindex="0"><code>Constraint = Subset
           | Constraint, Constraint // and
           | Constraint; Constraint // or
           | forall&lt;R1&gt; { Constraint }
           | exists&lt;R1&gt; { Constraint }
           | if (Assumption) { Constraint }
           
Assumption = Subset
           | forall&lt;R1&gt; { Assumption }
           | if (Constraint) { Assumption }

Subset = R1: R2  
</code></pre><p>Here we support not only &ldquo;for all&rdquo; and &ldquo;there exists&rdquo; but also
&ldquo;implication&rdquo; and even &ldquo;or&rdquo;. rustc doesn&rsquo;t use constraints this rich
today, but for various reasons I think we will want to eventually.</p>
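<p>If it helps to see the grammar as a data structure, here is one way it might be transcribed into Rust enums. This is only a sketch: the type and function names (<code>Region</code>, <code>Subset</code>, <code>bound_regions</code>, and so on) are my own choices for illustration and do not correspond to types in rustc, Chalk, or Polonius.</p>

```rust
// Regions are just names in this sketch.
type Region = &'static str;

#[derive(Debug, Clone, PartialEq)]
struct Subset {
    sub: Region,
    sup: Region,
}

#[derive(Debug, Clone, PartialEq)]
enum Constraint {
    Subset(Subset),
    And(Box<Constraint>, Box<Constraint>),
    Or(Box<Constraint>, Box<Constraint>),
    ForAll(Region, Box<Constraint>),
    Exists(Region, Box<Constraint>),
    If(Box<Assumption>, Box<Constraint>),
}

#[derive(Debug, Clone, PartialEq)]
enum Assumption {
    Subset(Subset),
    ForAll(Region, Box<Assumption>),
    If(Box<Constraint>, Box<Assumption>),
}

fn subset(sub: Region, sup: Region) -> Constraint {
    Constraint::Subset(Subset { sub, sup })
}

// Builds `exists<'b> { 'a: 'b, 'b: 'c }`.
fn example() -> Constraint {
    Constraint::Exists(
        "'b",
        Box::new(Constraint::And(
            Box::new(subset("'a", "'b")),
            Box::new(subset("'b", "'c")),
        )),
    )
}

// Collects regions bound by quantifiers in the constraint part of the tree
// (quantifiers nested inside assumptions are ignored here).
fn bound_regions(c: &Constraint) -> Vec<Region> {
    match c {
        Constraint::Subset(_) => Vec::new(),
        Constraint::And(l, r) | Constraint::Or(l, r) => {
            let mut out = bound_regions(l);
            out.extend(bound_regions(r));
            out
        }
        Constraint::ForAll(r, body) | Constraint::Exists(r, body) => {
            let mut out = vec![*r];
            out.extend(bound_regions(body));
            out
        }
        Constraint::If(_, body) => bound_regions(body),
    }
}
```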
<p>Why is it useful to talk about HH predicates? Well, HH predicates have
the nice property that we can use basic Prolog-style search to find
and enumerate all possible solutions to them. Besides, &ldquo;hereditary
harrop&rdquo; is really fun to say.</p>
<h3 id="conclusion">Conclusion</h3>
<p>So now we have this problem. To encode the &ldquo;solutions&rdquo; to
higher-ranked subtyping and trait matching, we need to use this richer
notion of constraints that include <code>forall</code> and <code>exists</code>
quantifiers. Once we add those, we are basically talking about
&ldquo;hereditary harrop region constraints&rdquo;. We&rsquo;ve also talked about the
idea of mapping these complex constraints down to the simple subset
relation that Polonius uses, but here I only gave examples and didn&rsquo;t
really give any sort of <em>algorithm</em>. I&rsquo;ve done some experiments here,
and I may try to write them up in a future post, but I&rsquo;m also curious
to know if somebody else has already solved this problem. I definitely
have that &ldquo;reinventing the wheel&rdquo; feeling here.</p>
<p>One really <em>nice</em> aspect of this general direction, though, is that it
means that Polonius effectively doesn&rsquo;t care about these &ldquo;richer&rdquo;
constraints. The idea is that our subtyping and trait matching
algorithms can produce hereditary harrop region constraints (or some
subset thereof). These can be reduced to simpler subset constraints,
which are then passed to Polonius to do the final reasoning. (And, of
course, any of these steps may also produce an error.)</p>
<p>Comments, as usual, are requested in the <a href="https://internals.rust-lang.org/t/blog-post-an-alias-based-formulation-of-the-borrow-checker/7411">internals thread for this
blog post series</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Polonius and region errors</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/01/17/polonius-and-region-errors/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/01/17/polonius-and-region-errors/</id><published>2019-01-17T00:00:00+00:00</published><updated>2019-01-17T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Now that NLL has been shipped, I&rsquo;ve been doing some work revisiting
<a href="https://github.com/rust-lang-nursery/polonius/">the Polonius project</a>. Polonius is the project that implements
<a href="https://smallcultfollowing.com/babysteps/
/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/">the &ldquo;alias-based formulation&rdquo; described in my older
blogpost</a>. Polonius has come a long way since that post; it&rsquo;s now
quite fast and also experimentally integrated into rustc, where it
passes the full test suite.</p>
<p>However, polonius as described is not complete. It describes the core
&ldquo;borrow check&rdquo; analysis, but there are a number of other checks that
the current implementation performs but polonius ignores:</p>
<ul>
<li>Polonius does not account for <strong>moves and initialization</strong>.</li>
<li>Polonius does not check for <strong>relations between named lifetimes</strong>.</li>
</ul>
<p>This blog post is focused on the second of those bullet points. It
covers the simple cases; hopefully I will soon post a follow-up that
targets some of the more complex cases that can arise (specifically,
dealing with higher-ranked things).</p>
<h3 id="brief-polonius-review">Brief Polonius review</h3>
<p>If you&rsquo;ve never read <a href="https://smallcultfollowing.com/babysteps/
/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/">the original Polonius post</a>, you should probably
do so now. But if you have, let me briefly review some of the key details
that are relevant to this post:</p>
<ul>
<li>Instead of interpreting the <code>'a</code> notation as the <em>lifetime</em> of a
reference (i.e., a set of points), we interpret <code>'a</code> as a <em>set of
<strong>loans</strong></em>. We refer to <code>'a</code> as a &ldquo;region&rdquo;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> in order to
emphasize this distinction.</li>
<li>We call <code>'a: 'b</code> a <strong>subset</strong> relation; it means that the loans in
<code>'a</code> must be a subset of the loans in <code>'b</code>. We track the required
subset relations at each point in the program.</li>
<li>A <strong>loan</strong> comes from some borrow expression like <code>&amp;foo</code>. A loan L0
is &ldquo;live&rdquo; if some live variable contains a region <code>'a</code> whose value
includes L0. When a loan is live, the &ldquo;terms of the loan&rdquo; must be
respected: for a shared borrow like <code>&amp;foo</code>, that means the path that
was borrowed (<code>foo</code>) cannot be mutated. For a mutable borrow, it
means that the path that was borrowed cannot be accessed at all.
<ul>
<li>If an access occurs that violates the terms of a loan, that is an
error.</li>
</ul>
</li>
</ul>
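<p>As a toy model of the review above, a region can be pictured as a plain set of loan names. The sketch below is an assumption-laden illustration rather than Polonius&rsquo;s actual representation: <code>Loan</code>, <code>Region</code>, <code>subset_holds</code>, and <code>loan_is_live</code> are all hypothetical names, and program points are left out entirely.</p>

```rust
use std::collections::BTreeSet;

type Loan = &'static str;
// A region is modeled as the set of loans it may contain.
type Region = BTreeSet<Loan>;

// `'a: 'b` in the reading used here: every loan in `'a` is also in `'b`.
fn subset_holds(a: &Region, b: &Region) -> bool {
    a.is_subset(b)
}

// A loan is live if the region of some live variable includes it.
fn loan_is_live(loan: Loan, live_regions: &[&Region]) -> bool {
    live_regions.iter().any(|region| region.contains(&loan))
}
```

<p>With <code>a = {L0}</code> and <code>b = {L0, L1}</code>, <code>subset_holds(&amp;a, &amp;b)</code> is true while the reverse is false, and <code>L1</code> is live only if a variable whose region is <code>b</code> is live.</p>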
<h3 id="running-example-1">Running Example 1</h3>
<p>Let&rsquo;s give a quick example of some code that should result in an
error, but which would not if we only considered the errors that
polonius reports today:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="na">&#39;b</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="p">[</span><span class="kt">u32</span><span class="p">],</span><span class="w"> </span><span class="n">y</span>: <span class="kp">&amp;</span><span class="na">&#39;b</span> <span class="p">[</span><span class="kt">u32</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&amp;</span><span class="n">y</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, we declared that we are returning a <code>&amp;u32</code> with lifetime <code>'a</code>
(i.e., borrowed from <code>x</code>) but in fact we are returning data with
lifetime <code>'b</code> (i.e., borrowed from <code>y</code>).</p>
<p>Slightly simplified, the MIR for this function looks something like
this.</p>
<pre tabindex="0"><code>fn foo(_1: &amp;&#39;a [u32], _2: &amp;&#39;b [u32]) -&gt; &amp;&#39;a u32 {
  _0 = &amp;&#39;X (*_2)[const 0usize]; // S0
  return;                       // S1
}  
</code></pre><p>As you can see, there&rsquo;s only really one interesting statement; it
borrows from <code>_2</code> and stores the result into <code>_0</code>, which is the
special &ldquo;return slot&rdquo; in MIR.</p>
<p>In the case of the parameters <code>_1</code> and <code>_2</code>, the regions come directly
from the function signature. For regions appearing in the function body,
we create fresh region variables &ndash; in this case, only one, <code>'X</code>. <code>'X</code>
represents the region assigned to the borrow.</p>
<p>The relevant polonius facts for this function are as follows:</p>
<ul>
<li><code>base_subset('b, 'X, mid(S0))</code> &ndash; as <a href="https://rust-lang.github.io/rfcs/2094-nll.html#reborrow-constraints">described in the NLL
RFC</a>, &ldquo;re-borrowing&rdquo; the referent of a reference (i.e.,
<code>*_2</code>) creates a subset relation between the region of the reference
(here, <code>'b</code>) and the region of the borrow (here, <code>'X</code>). Written in
the notation of the NLL RFC, this would be the relation <code>'b: 'X @ mid(S0)</code>.</li>
<li><code>base_subset('X, 'a, mid(S0))</code> &ndash; the borrow expression in S0
produces a result of type <code>&amp;'X u32</code>. This is then assigned to <code>_0</code>,
which has the type <code>&amp;'a u32</code>.  The <a href="https://rust-lang.github.io/rfcs/2094-nll.html#subtyping">subtyping rules</a>
require that <code>'X: 'a</code>.</li>
</ul>
<p>Combining the two <code>base_subset</code> relations allows us to conclude that
the full subset relation includes <code>subset('b, 'a, mid(S0))</code> &ndash; that
is, for the function to be valid, the region <code>'b</code> must be a subset of
the region <code>'a</code>. This is an error because the regions <code>'a</code> and <code>'b</code>
are actually parameters to <code>foo</code>; in other words, <code>foo</code> must be valid
for <em>any</em> set of regions <code>'a</code> and <code>'b</code>, and hence we cannot know if
there is a subset relationship between them. <strong>This is a different
sort of error than the &ldquo;illegal access&rdquo; errors that Polonius reported
in the past:</strong> there is no access at all, in fact, simply subset
relations.</p>
<h3 id="placeholder-regions">Placeholder regions</h3>
<p>There is an important distinction between named regions like <code>'a</code> and
<code>'b</code> and the region <code>'X</code> we created for a borrow. The definition of
<code>foo</code> has to be true <strong>for all</strong> regions <code>'a</code> and <code>'b</code>, but for a
region like <code>'X</code> there only has to be <em>some</em> valid value. This
difference is often called being <em>universally quantified</em> (true for
all regions) versus <em>existentially quantified</em> (true for <em>some</em>
region).</p>
<p>In this post, I will call universally quantified regions like <code>'a</code> and
<code>'b</code> <strong>&ldquo;placeholder&rdquo; regions</strong>. This is because they don&rsquo;t really
represent a known quantity of loans, but rather a kind of
&ldquo;placeholder&rdquo; for some unknown set of loans.</p>
<p>We will include a base fact that helps us to identify placeholder regions:</p>
<pre tabindex="0"><code>.decl placeholder_region(R1: region)
.input placeholder_region
</code></pre><p>This fact is true for any placeholder region. So in our example we might have</p>
<pre tabindex="0"><code>placeholder_region(&#39;a).
placeholder_region(&#39;b).
</code></pre><p>Note that the actual polonius impl already includes a relation like
this<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, because we need to account for the fact that
placeholder regions are &ldquo;live&rdquo; at all points in the control-flow
graph, as we always assume there may be future uses of them that we
cannot see.</p>
<h3 id="representing-known-relations">Representing known relations</h3>
<p>Even placeholder regions are not <em>totally</em> unknown though. The
function signature will often include where clauses (or implied
bounds) that indicate some known relationships between placeholder
regions. For example, if <code>foo</code> included a where clause like <code>where 'b: 'a</code>, then it would be perfectly legal.</p>
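<p>As a minimal sketch of that last point, here is the running example with the bound added. The parameter name <code>_x</code> and the <code>main</code> function are only there for illustration; the point is that the body is unchanged and the bound alone makes it type-check:</p>

```rust
// Variant of the running example with the bound `'b: 'a` added.
// Because `'b` is now known to outlive `'a`, data borrowed from `y`
// may be returned at lifetime `'a`.
fn foo<'a, 'b>(_x: &'a [u32], y: &'b [u32]) -> &'a u32
where
    'b: 'a,
{
    &y[0]
}

fn main() {
    let xs = [1u32, 2, 3];
    let ys = [10u32, 20, 30];
    // The returned reference points into `ys`, not `xs`.
    println!("{}", foo(&xs, &ys));
}
```

<p>Removing the <code>where</code> clause brings back the error discussed above, since nothing would then relate <code>'b</code> to <code>'a</code>.</p>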
<p>We can represent the known relationships using an input:</p>
<pre tabindex="0"><code>.decl known_base_subset(R1: region, R2: region)
.input known_base_subset
</code></pre><p>Naturally these known relations are transitive, so we can define a
<code>known_subset</code> rule to encode that:</p>
<pre tabindex="0"><code>.decl known_subset(R1: region, R2: region)

known_subset(R1, R2) :- known_base_subset(R1, R2).
known_subset(R1, R3) :- known_base_subset(R1, R2), known_subset(R2, R3).
</code></pre><p>In our example of <code>foo</code>, there are no where clauses nor implied
bounds, so these relations are empty. If there were a where clause
like <code>where 'b: 'a</code>, however, then we would have a
<code>known_base_subset('b, 'a)</code> fact. Similarly, per our implied bounds
rules, such an input fact might be derived from an argument with a
type like <code>&amp;'a &amp;'b u32</code>, where there are &rsquo;nested&rsquo; regions.</p>
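<p>To make the transitive rule concrete, here is a rough Rust sketch of the same computation as a naive fixpoint. This is not how polonius evaluates its rules; the <code>Region</code> alias, the function name, and the extra region <code>'c</code> are assumptions made up for the illustration:</p>

```rust
use std::collections::BTreeSet;

// Regions are modeled as plain string names, purely for illustration.
type Region = &'static str;

/// Naive fixpoint for the two rules:
///   known_subset(R1, R2) :- known_base_subset(R1, R2).
///   known_subset(R1, R3) :- known_base_subset(R1, R2), known_subset(R2, R3).
fn known_subset(base: &BTreeSet<(Region, Region)>) -> BTreeSet<(Region, Region)> {
    // First rule: every base fact is a known subset.
    let mut known = base.clone();
    loop {
        // Second rule: join a base fact with an already-derived fact.
        let mut added = Vec::new();
        for &(r1, r2) in base {
            for &(s2, r3) in &known {
                if r2 == s2 && !known.contains(&(r1, r3)) {
                    added.push((r1, r3));
                }
            }
        }
        // Stop once a full pass derives nothing new.
        if added.is_empty() {
            return known;
        }
        known.extend(added);
    }
}

fn main() {
    // Hypothetical signature with `where 'c: 'b, 'b: 'a`.
    let base: BTreeSet<(Region, Region)> =
        vec![("'c", "'b"), ("'b", "'a")].into_iter().collect();
    let known = known_subset(&base);
    // `'c: 'a` follows by transitivity.
    assert!(known.contains(&("'c", "'a")));
    println!("{:?}", known);
}
```

<p>For <code>foo</code> as written, the base relation is empty and so is the result.</p>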
<h3 id="detecting-illegal-subset-relations">Detecting illegal subset relations</h3>
<p>We can now extend the polonius rules to report errors for cases like
our running example. The basic idea is this: if the function requires
a subset relationship <code>'r1: 'r2</code> between two placeholder regions <code>'r1</code>
and <code>'r2</code>, then it must be a &ldquo;known subset&rdquo;, or else we have an error.
We can encode this like so:</p>
<pre tabindex="0"><code>.decl subset_error(R1: region, R2: region, P:point)

subset_error(R1, R2, P) :-
  subset(R1, R2, P),      // `R1: R2` required at `P`
  placeholder_region(R1), // `R1` is a placeholder
  placeholder_region(R2), // `R2` is also a placeholder
  !known_subset(R1, R2).  // `R1: R2` is not a &#34;known subset&#34; relation.
</code></pre><p>In our example program, we can clearly derive <code>subset_error('b, 'a, mid(S0))</code>,
and hence we have an error:</p>
<ul>
<li>we saw earlier that <code>subset('b, 'a, mid(S0))</code> holds</li>
<li>as <code>'a</code> is a placeholder region, <code>placeholder_region('a)</code> will
appear in the input (same for <code>'b</code>)</li>
<li>finally, the <code>known_base_subset</code> (and hence <code>known_subset</code>) relation
in our example is empty</li>
</ul>
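<p>The derivation above can be sketched in plain Rust as a filter over the <code>subset</code> facts. This is only an illustration of the rule, not the polonius implementation: the <code>Region</code> and <code>Point</code> aliases and the function name are hypothetical, and the <code>subset</code> relation is assumed to be already transitively closed:</p>

```rust
use std::collections::BTreeSet;

// Regions and points are modeled as string names, purely for illustration.
type Region = &'static str;
type Point = &'static str;

/// subset_error(R1, R2, P) :-
///   subset(R1, R2, P),
///   placeholder_region(R1),
///   placeholder_region(R2),
///   !known_subset(R1, R2).
fn subset_errors(
    subset: &BTreeSet<(Region, Region, Point)>,
    placeholders: &BTreeSet<Region>,
    known_subset: &BTreeSet<(Region, Region)>,
) -> Vec<(Region, Region, Point)> {
    subset
        .iter()
        .copied()
        .filter(|&(r1, r2, _)| {
            placeholders.contains(&r1)
                && placeholders.contains(&r2)
                && !known_subset.contains(&(r1, r2))
        })
        .collect()
}

fn main() {
    // Facts for the running example `foo`, with `subset` already closed.
    let subset: BTreeSet<(Region, Region, Point)> = vec![
        ("'b", "'X", "mid(S0)"),
        ("'X", "'a", "mid(S0)"),
        ("'b", "'a", "mid(S0)"),
    ]
    .into_iter()
    .collect();
    let placeholders: BTreeSet<Region> = vec!["'a", "'b"].into_iter().collect();
    // No where clauses or implied bounds, so nothing is known.
    let known: BTreeSet<(Region, Region)> = BTreeSet::new();

    let errors = subset_errors(&subset, &placeholders, &known);
    assert_eq!(errors, vec![("'b", "'a", "mid(S0)")]);
}
```

<p>If the <code>known</code> set contained <code>('b, 'a)</code>, as it would with a <code>where 'b: 'a</code> clause, the same call would return no errors.</p>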
<p><strong>Sidenote on negative reasoning and stratification.</strong> This rule makes
use of negative reasoning in the form of the <code>!known_subset(R1, R2)</code>
predicate. Negative reasoning is fine in datalog so long as the
program is &ldquo;stratified&rdquo; &ndash; in particular, we must be able to compute
the entire <code>known_subset</code> relation without having to compute
<code>subset_error</code>. In this case, the program is trivially stratified &ndash;
<code>known_subset</code> depends only on the input relation
<code>known_base_subset</code>.</p>
<h3 id="observation-about-borrowing-local-data">Observation about borrowing local data</h3>
<p>It is interesting to walk through a different example. This is another
case where we expect an error, but in this case the error arises
because we are returning a reference to the stack:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="p">[</span><span class="kt">u32</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">stack_slot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&amp;</span><span class="n">stack_slot</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Polonius will report an error for this case, but not because of the
mechanisms in this blog post. What happens instead is that we create a
loan for the borrow expression <code>&amp;stack_slot</code>, we&rsquo;ll call it <code>L0</code>. When
the borrow is returned, this loan <code>L0</code> winds up being a member of the
<code>'a</code> region.  It is therefore &ldquo;live&rdquo; when the storage for <code>stack_slot</code>
is popped from the stack, which is an error: you can&rsquo;t pop the storage
for a stack slot while there are live loans that reference it.</p>
<h3 id="conclusion">Conclusion</h3>
<p>This post describes a simple extension to the polonius rules that
covers errors arising from subset relations. Unlike the prior rules,
these errors are not triggered by any &ldquo;access&rdquo;, but rather simply the
creation of a (transitive) subset relation between two placeholder
regions.</p>
<p>Unfortunately, this is not the complete story around region checking
errors. In particular, this post ignored subset relations that can
arise from &ldquo;higher-ranked&rdquo; types like <code>for&lt;'a&gt; fn(&amp;'a u32)</code>. Handling
these properly requires us to introduce a bit more logic and will be
covered in a follow-up.</p>
<p>Comments, if any, should be posted in <a href="https://internals.rust-lang.org/t/blog-post-an-alias-based-formulation-of-the-borrow-checker/7411">the internals thread dedicated to my previous
polonius post</a>.</p>
<h3 id="appendix-a-potentially-more-efficient-formulation">Appendix: A (potentially) more efficient formulation</h3>
<p>The <code>subset_error</code> formulation above relied on the transitive <code>subset</code>
relation to work, because we wanted to report errors any time that one
placeholder wound up being forced to be a subset of another. In the
more optimized polonius implementations, we don&rsquo;t compute the full
transitive relation, so it might be useful to create a new relation
<code>subset_placeholder</code> that is specific to placeholder regions:</p>
<pre tabindex="0"><code>.decl subset_placeholder(R1: region, R2: region, P:point)
</code></pre><p>The idea is that <code>subset_placeholder(R1, R2, P)</code> means that, at the
point P, we know that <code>R1: R2</code> must hold, where <code>R1</code> is a placeholder.
You can express this via a &ldquo;base&rdquo; rule:</p>
<pre tabindex="0"><code>subset_placeholder(R1, R2, P) :-
  subset(R1, R2, P),      // `R1: R2` required at `P`
  placeholder_region(R1). // `R1` is a placeholder
</code></pre><p>and a transitive rule:</p>
<pre tabindex="0"><code>subset_placeholder(R1, R3, P) :-
  subset_placeholder(R1, R2, P), // `R1: R2` at P where `R1` is a placeholder
  subset(R2, R3, P).      // `R2: R3` required at `P`
</code></pre><p>Then we reformulate the <code>subset_error</code> rule to be based on <code>subset_placeholder</code>:</p>
<pre tabindex="0"><code>.decl subset_error(R1: region, R2: region, P:point)

subset_error(R1, R2, P) :-
  subset_placeholder(R1, R2, P), // `R1: R2` required at `P`
  placeholder_region(R2), // `R2` is also a placeholder
  !known_subset(R1, R2).  // `R1: R2` is not a &#34;known subset&#34; relation.
</code></pre><h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>The term &ldquo;region&rdquo; is not an especially good fit, but it&rsquo;s common in academia.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Currently called <code>universal_region</code>, though I plan to rename it.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Rust in 2019: Focus on sustainability</title><link href="https://smallcultfollowing.com/babysteps/blog/2019/01/07/rust-in-2019-focus-on-sustainability/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2019/01/07/rust-in-2019-focus-on-sustainability/</id><published>2019-01-07T00:00:00+00:00</published><updated>2019-01-07T00:00:00+00:00</updated><content type="html"><![CDATA[<p>To me, 2018 felt like a big turning point for Rust, and it wasn&rsquo;t just
the edition. Suddenly, it has become &ldquo;normal&rdquo; for me to meet people
using Rust at their jobs. Rust conferences are growing and starting to
have a large number of sponsors. Heck, I even met some professional Rust
developers amongst the parents at a kid&rsquo;s birthday party
recently. Something has shifted, and I like it.</p>
<p>At the same time, I&rsquo;ve also noticed a lot of exhaustion. I know I feel
it &ndash; and a lot of people I talk to seem to feel the same way. It&rsquo;s
great that so much is going on in the Rust world, but we need to get
better at scaling our processes up and processing it effectively.</p>
<p>When I think about a &ldquo;theme&rdquo; for 2019, the word that keeps coming to
mind for me is <strong>sustainability</strong>. I think Rust has been moving at a
breakneck pace since 1.0, and that&rsquo;s been great: it&rsquo;s what Rust
needed. But as Rust gains more solid footing out there, it&rsquo;s a good
idea for us to start looking for how we can go back and tend to the
structures we&rsquo;ve built.</p>
<h3 id="sustainable-processes">Sustainable processes</h3>
<p>There has been a lot of great constructive criticism of our current
processes: most recently, boats&rsquo;s post on <a href="https://boats.gitlab.io/blog/post/rust-2019/">organizational debt</a>, along
with <a href="https://yakshav.es/rust-2019/">Florian&rsquo;s series of posts</a>, did a great job of crystallizing
a lot of the challenges we face. I am pretty confident that we can
adjust our processes here and make things a lot better, though
obviously some of these problems have no easy solution.</p>
<p>Obviously, I don&rsquo;t know exactly what we should do here. But I think I
see some of the pieces of the puzzle. Here is a variety of bullet
points that have been kicking around in my head.</p>
<p><strong>Working groups.</strong> In general, I would like to see us adopting the
idea of <strong>working groups</strong> as a core &ldquo;organizational unit&rdquo; for Rust,
and in particular as the core place where work gets done. A working
group is an ad-hoc set of people that includes both members of the
relevant Rust team and interested volunteers. Among other
benefits, they can be a great vehicle for mentoring, since it gives
people a particular area to focus on, versus trying to participate in
the Rust project as a whole, which can be very overwhelming.</p>
<p><strong>Explicit stages.</strong> Right now, Rust features go through a number of
official and semi-official stages before they become &ldquo;stable&rdquo;. As I
have <a href="http://smallcultfollowing.com/babysteps/blog/2018/06/20/proposal-for-a-staged-rfc-process/">argued before</a>, I think we would benefit from making
these stages a more explicit part of the process (much as e.g. the
<a href="https://github.com/tc39/proposals">TC39</a> and <a href="https://github.com/WebAssembly/proposals">WebAssembly</a> groups already do).</p>
<p><strong>Finishing what we start.</strong> Right now, we have no mechanism to expose
the &ldquo;capacity&rdquo; of our teams &ndash; we tend to, for example, accept RFCs
without any idea who will implement it, or even mentor an
implementation. In fact, there isn&rsquo;t really a defined set of people to
try and ensure that it happens. The result is that a lot of things
linger in limbo, either unimplemented, undocumented, or unstabilized.
<strong>I think working groups can help to solve this, by having a core
leadership team that is committed to seeing the feature through</strong>.</p>
<p><strong>Expose capacity.</strong> Continuing the previous point, I think we should
integrate a notion of capacity into the staging process: so that we
avoid moving too far in the design until we have some idea who is
going to be implementing (or mentoring an implementation). If that is
hard to do, then it indicates we may not have the capacity to do this
idea right now &ndash; <strong>if that seems unacceptable, then we need to find
something else to stop doing</strong>.</p>
<p><strong>Don&rsquo;t fly solo.</strong> One of the things that we discussed in <a href="https://internals.rust-lang.org/t/compiler-steering-meeting/8588/16?u=nikomatsakis">a recent
compiler team steering
meeting</a>
is that being the leader of a working group is <strong>super stressful</strong> &ndash;
it&rsquo;s a lot to manage!  However, being a <strong>co-leader</strong> of a working
group is very different. Having someone else (or multiple someones)
that you can share work with, bounce ideas off of, and so forth makes
all the difference. It&rsquo;s also a great mentoring opportunity, as the
leaders of working groups don&rsquo;t necessarily have to be full members of
the team (yet). Part of exposing capacity, then, is trying to ensure
that we don&rsquo;t just have one person doing any one thing &ndash; we have
multiple. <strong>This is scary: we will get less done. But we will all be
happier doing it.</strong></p>
<p><strong>Evaluate priorities regularly.</strong> In my ideal world, we would make it
very easy to find out what each person on a team is working on, but we
would also have regular points where we evaluate whether those are the
right things. Are they advancing our roadmap goals? Did something else
more promising arise in the meantime? Part of the goal here is to
<strong>leave room for serendipity</strong>: maybe some random person came in from
the blue with an interesting language idea that seems really cool. We
want to ensure we aren&rsquo;t too &ldquo;locked in&rdquo; to pursue that
idea. Incidentally, this is another benefit to not &ldquo;flying solo&rdquo; &ndash; if
there are multiple leaders, then we can shift some of them around
without necessarily losing context.</p>
<p><strong>Keeping everyone in sync.</strong> Finally, I think we need to think hard
about how to help keep people in sync. The narrow focus of working
groups is great, but it can be a liability. We need to develop regular
points where we issue &ldquo;public-facing&rdquo; updates, to help keep people
outside the working group abreast of the latest developments.  I
envision, for example, meetings where people give an update on what&rsquo;s
been happening, the key decisions and/or controversies, and seek
feedback on interesting points. We should probably tie these to the
stages, so that ideas cannot progress forward unless they are also
being communicated.</p>
<p><strong>TL;DR.</strong> The points above aren&rsquo;t really a coherent proposal yet,
though there are pieces of proposals in them. Essentially I am calling
for a bit more structure and process, so that it is clearer what we
are doing <em>now</em> and it&rsquo;s more obvious when we are making decisions
about what we should do <em>next</em>. I am also calling for more redundancy.
I think that both of these things will initially mean that we do fewer
things, but we will do them more carefully, and with less stress.  And
ultimately I think they&rsquo;ll pay off in the form of a larger Rust team,
which means we&rsquo;ll have more capacity.</p>
<h3 id="sustainable-technology">Sustainable technology</h3>
<p>So what about the technical side of things? I think the &ldquo;sustainable&rdquo;
theme fits here, too. I&rsquo;ve been working on rustc for 7 years now
(wow), and in all of that time we&rsquo;ve mostly been focused on &ldquo;getting
the next goal done&rdquo;. This is not to say that nobody ever cleans things
up; there have been some pretty epic refactoring PRs. But we&rsquo;ve also
accumulated a fair amount of technical debt. We&rsquo;ve got plenty of
examples where a new system was added to replace the old &ndash; but only
90%, meaning that now we have two systems in use. This makes it harder
to learn how rustc works, and it makes us spend more time fixing bugs
and ICEs.</p>
<p>I would like to see us put a lot of effort into making rustc more
approachable and maintainable. This means writing documentation, both
of the <a href="https://doc.rust-lang.org/nightly/nightly-rustc/rustc/?search=&amp;search=">rustdoc</a> and <a href="https://rust-lang.github.io/rustc-guide/">rustc-guide</a> variety. It also means finishing up
things we started but never quite finished, like replacing the
remaining uses of <a href="https://doc.rust-lang.org/nightly/nightly-rustc/syntax/ast/struct.NodeId.html"><code>NodeId</code></a> with the newer <a href="https://doc.rust-lang.org/nightly/nightly-rustc/rustc/hir/struct.HirId.html"><code>HirId</code></a>. In some cases,
it might mean rewriting whole subsystems, such as with the trait
system and chalk.</p>
<p>None of this means we can&rsquo;t get new toys. Cleaning up the trait system
implementation, for example, makes things like Generic Associated
Types (GATs) and specialization much easier. Finishing the transition
into the on-demand query system should enable better incremental
compilation as well as more complete parallel compilation (and better
IDE support). And so forth.</p>
<p>Finally, it seems clear that we need to continue our focus on reducing
compilation time. I think we have a lot of <a href="https://internals.rust-lang.org/t/next-steps-for-reducing-overall-compilation-time/8429/2?u=nikomatsakis">good avenues to
pursue</a> here, and frankly a lot of them are blocked on needing
to improve the compiler&rsquo;s internal structure.</p>
<h3 id="sustainable-finances">Sustainable finances</h3>
<p>When one talks about sustainability, that naturally brings to mind the
question of financial sustainability as well. Mozilla has been the
primary corporate sponsor of Rust for some time, but we&rsquo;re starting to
see more and more sponsorship from other companies, which is
great. This comes in many forms: both Google and Buoyant have been
sponsoring people to work on the async-await and Futures proposals,
for example (and perhaps others I am unaware of); other companies have
used contracting to help get work done that they need; and of course
many companies have been sponsoring Rust conferences for years.</p>
<p>Going into 2019, I think we need to open up new avenues for supporting
the Rust project financially. As a simple example, having more money
to help with running CI could enable us to parallelize the bors queue
more, which would help with reducing the time to land PRs, which in
turn would help everything move faster (not to mention improving the
experience of contributing to Rust).</p>
<p>I do think this is an area where we have to tread carefully. I&rsquo;ve
definitely heard horror stories of &ldquo;foundations gone wrong&rdquo;, for
example, where decisions came to be dominated more by politics and
money than technical criteria. There&rsquo;s no reason to rush into things.
We should take it a step at a time.</p>
<p>From a personal perspective, I would love to see more people paid to
work part- or full-time on rustc. I&rsquo;m not sure how best to make that
happen, but I think it is definitely important. It has happened more
than once that great rustc contributors wind up taking a job elsewhere
that leaves them no time or energy to continue contributing. These
losses can be pretty taxing on the project.</p>
<h3 id="reference-material">Reference material</h3>
<p>I already mentioned that I think the compiler needs to put more
emphasis on documentation as a means for better sustainability. I
think the same also applies to the language: I&rsquo;d like to see the lang
team getting more involved with the Rust Reference and really trying
to fill in the gaps. I&rsquo;d also like to see the Unsafe Code Guidelines
work continue. I think it&rsquo;s quite likely that these should be roadmap
items in their own right.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">After NLL: Moving from borrowed data and the sentinel pattern</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/11/10/after-nll-moving-from-borrowed-data-and-the-sentinel-pattern/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/11/10/after-nll-moving-from-borrowed-data-and-the-sentinel-pattern/</id><published>2018-11-10T00:00:00+00:00</published><updated>2018-11-10T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Continuing on with my “After NLL” series, I want to look at another
common error that I see and its solution: today’s choice is about moves
from borrowed data and the <em>Sentinel Pattern</em> that can be used to enable
them.</p>
<h1 id="the-problem">The problem</h1>
<p>Sometimes when we have <code>&amp;mut</code> access to a struct, we have a need to
<em>temporarily</em> take ownership of some of its fields. Usually what happens
is that we want to move out from a field, construct something new using
the old value, and then replace it. So for example imagine we have a
type <code>Chain</code>, which implements a simple linked list:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">Chain</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">Empty</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">Link</span><span class="p">(</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">Chain</span><span class="o">&gt;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Chain</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">with</span><span class="p">(</span><span class="n">next</span>: <span class="nc">Chain</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Chain</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Chain</span>::<span class="n">Link</span><span class="p">(</span><span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">next</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now suppose we have a struct <code>MyStruct</code> and we are trying to add a link
to our chain; we might have something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">counter</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">chain</span>: <span class="nc">Chain</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">add_link</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Chain</span>::<span class="n">with</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, if we try to <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2015&amp;gist=896551436908a5a7b8b76d5f5ace54af">run this code</a>,
we will receive the following error:</p>
<pre tabindex="0"><code>error[E0507]: cannot move out of borrowed content
 --&gt; ex1.rs:7:30
  |
7 |     self.chain = Chain::with(self.chain);
  |                              ^^^^ cannot move out of borrowed content
</code></pre><p>The problem here is that we need to <em>take ownership</em> of <code>self.chain</code>,
but you can only take ownership of things that you own. In this case, we
only have <em>borrowed</em> access to <code>self</code>, because <code>add_link</code> is declared as
<code>&amp;mut self</code>.</p>
<p>To put this as an analogy, it is as if you had borrowed a really nifty
Lego building that your friend made so you could admire it. Then, later,
you are building your own Lego thing and you realize you would like to
take some of the pieces from their building and put them into yours. But
you can’t do that – those pieces belong to your friend, not you, and
that would leave a hole in their building.</p>
<p>Still, this is kind of annoying – after all, if we look at the larger
context, although we are moving <code>self.chain</code>, we are going to replace it
shortly thereafter. So maybe it’s more like – we want to take some
blocks from our friend’s Lego building, but not to put them into our
<em>own</em> building. Rather, we were going to take it apart, build up
something new with a few extra blocks, and then put that new thing back
in the same spot – so, by the time they see their building again, the
“hole” will be all patched up.</p>
<h3 id="root-of-the-problem-panics">Root of the problem: panics</h3>
<p>You can imagine us doing a static analysis that permits you to take
ownership of <code>&amp;mut</code> borrowed data, as long as we can see that it will be
replaced before the function returns. There is one little niggly problem
though: can we be <em>really sure</em> that we are going to replace
<code>self.chain</code>? It turns out that we can’t, because of the possibility of
panics.</p>
<p>To see what I mean, let’s take that troublesome line and expand it out
so we can see all the hidden steps. The original line was this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Chain</span>::<span class="n">with</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>which we can expand to something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">tmp0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="p">;</span><span class="w">        </span><span class="c1">// 1. move `self.chain` out
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">tmp1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Chain</span>::<span class="n">with</span><span class="p">(</span><span class="n">tmp0</span><span class="p">);</span><span class="w"> </span><span class="c1">// 2. build new link
</span></span></span><span class="line"><span class="cl"><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tmp1</span><span class="p">;</span><span class="w">            </span><span class="c1">// 3. replace with `tmp1`
</span></span></span></code></pre></div><p>Written this way, we can see that in between moving <code>self.chain</code> out and
replacing it, there is a function call: <code>Chain::with</code>. And of course it
is possible for this function call to <em>panic</em>, at least in principle. If
it were to panic, then the stack would start unwinding, and we would
never get to step 3, where we assign <code>self.chain</code> again. This means that
there might be a destructor somewhere along the way that goes to inspect
<code>self</code> – if it were to try to access <code>self.chain</code>, it would just find
uninitialized memory. Or, even worse, <code>self</code> might be located inside of
some sort of <code>Mutex</code> or something else, so even if <em>our thread</em> panics,
other threads might observe the hole.</p>
<p>To return to our Lego analogy<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, it is as if – after we removed some
pieces from our friend’s Lego set – our parents came and made us go to
bed before we were able to finish the replacement piece.  Worse, our
friend’s parents came over during the night to pick up the set, and so
now when our friend gets it back, it has this big hole in it.</p>
<h3 id="one-solution-sentinel">One solution: sentinel</h3>
<p>In fact, there <em>is</em> a way to move out from an <code>&amp;mut</code> pointer – you can
use <a href="https://doc.rust-lang.org/std/mem/fn.replace.html">the function <code>std::mem::replace</code></a><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. <code>replace</code>
sidesteps the panic problem we just described because it requires you
to already have a new value at hand, so that we can move out from
<code>self.chain</code> and <em>immediately</em> put a replacement there.</p>
<p>Our problem here is that we need to do the move before we can construct
the replacement we want. So, one solution then is that we can put some
temporary, dummy value in that spot. I call this a <em>sentinel</em> value –
because it’s some kind of special value. In this particular case, one
easy way to get the code to compile would be to stuff in an empty chain
temporarily:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">replace</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="p">,</span><span class="w"> </span><span class="n">Chain</span>::<span class="n">Empty</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Chain</span>::<span class="n">with</span><span class="p">(</span><span class="n">chain</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>Now the compiler is happy – after all, even if <code>Chain::with</code> panics,
it’s not a memory safety problem. If anybody happens to inspect
<code>self.chain</code> later, they won’t see uninitialized memory, they will see
an empty chain.</p>
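<p>For reference, here is a minimal self-contained sketch of the sentinel version. The shape of <code>Chain</code> (an enum with <code>Empty</code> and a boxed <code>Link</code>) and the <code>count_links</code> helper are assumptions made for illustration; the real definitions may differ.</p>

```rust
use std::mem;

// Assumed shape of the chain type; the post's own definition may differ.
#[derive(Debug, PartialEq)]
enum Chain {
    Empty,
    Link(Box<Chain>),
}

impl Chain {
    // Wraps an existing chain in one more link.
    fn with(next: Chain) -> Chain {
        Chain::Link(Box::new(next))
    }
}

struct MyStruct {
    chain: Chain,
}

impl MyStruct {
    fn add_link(&mut self) {
        // Swap in the sentinel first, so `self.chain` is never left uninitialized.
        let chain = mem::replace(&mut self.chain, Chain::Empty);
        // If `Chain::with` panicked here, observers would see `Chain::Empty`.
        self.chain = Chain::with(chain);
    }
}

// Hypothetical helper for illustration: counts the links in a chain.
fn count_links(chain: &Chain) -> usize {
    let mut links = 0;
    let mut cursor = chain;
    while let Chain::Link(next) = cursor {
        links += 1;
        cursor = &**next;
    }
    links
}
```

<p>Calling <code>add_link</code> twice on a struct that starts with <code>Chain::Empty</code> should leave a chain with two links, and at no point is <code>self.chain</code> uninitialized.</p>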
<p>To return to our Lego analogy<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, it’s as if, when we
remove the pieces from our friend’s Lego set, we immediately stuff in a
replacement piece. It’s an ugly piece, with the wrong color and
everything, but it’s ok – because our friend will never see it.</p>
<h3 id="a-more-robust-sentinel">A more robust sentinel</h3>
<p>The compiler is happy, but are we happy? Perhaps we are, but there is
one niggling detail. We wanted this empty chain to be a kind of
“temporary value” that nobody ever observes – but can we be sure of
that? Actually, in this <em>particular</em> example, we can be fairly sure…
other than the possibility of panic (which certainly remains, but is
perhaps acceptable, since we are in the process of tearing things down),
there isn’t really much else that can happen before <code>self.chain</code> is
replaced.</p>
<p>But often we are in a situation where we need to take temporary
ownership and then invoke other <code>self</code> methods. Now, perhaps we expect
that these methods will never read from <code>self.chain</code> – in other words,
we have a kind of interprocedural conflict. For example, maybe to
construct the new chain we invoke <code>self.extend_chain</code> instead, which
reads <code>self.counter</code> and creates that many new links<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> in the
chain:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">add_link</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span>::<span class="n">mem</span>::<span class="n">replace</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="p">,</span><span class="w"> </span><span class="n">Chain</span>::<span class="n">Empty</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">new_chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">extend_chain</span><span class="p">(</span><span class="n">chain</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">new_chain</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">extend_chain</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">chain</span>: <span class="nc">Chain</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Chain</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Chain</span>::<span class="n">with</span><span class="p">(</span><span class="n">chain</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">chain</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now I would get a bit nervous. I <em>think</em> nobody ever observes this empty
chain, but how can I be <em>sure</em>? At some point, you would like to test
this hypothesis.</p>
<p>One solution here is to use a sentinel value that is otherwise invalid.
For example, I could change my <code>chain</code> field to store an
<code>Option&lt;Chain&gt;</code>, with the invariant that <code>self.chain</code> should <em>always</em> be
<code>Some</code>, because if I ever observe a <code>None</code>, it means that <code>add_link</code> is
in progress. In fact, there is a handy method on <code>Option</code> called <code>take</code>
that makes this quite easy to do:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">counter</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">chain</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Chain</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="c1">// &lt;-- new
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">add_link</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Equivalent to:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// let link = std::mem::replace(&amp;mut self.chain, None).unwrap();
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">link</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="p">.</span><span class="n">take</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">Chain</span>::<span class="n">with</span><span class="p">(</span><span class="n">link</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, if I were to (for example) invoke <code>add_link</code> recursively, I would
get a panic, so I would at least be alerted to the problem.</p>
<p>The annoying part about this pattern is that I have to “acknowledge” it
every time I reference <code>self.chain</code>. In fact, we already saw that in the
code above, since we had to wrap the new value with <code>Some</code> when
assigning to <code>self.chain</code>. Similarly, to borrow the chain, we can’t just
do <code>&amp;self.chain</code>, but instead we have to do something like
<code>self.chain.as_ref().unwrap()</code>, as in the example below, which counts
the links in the chain:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">count_chain</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">links</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">cursor</span>: <span class="kp">&amp;</span><span class="nc">Chain</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="p">.</span><span class="n">as_ref</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">match</span><span class="w"> </span><span class="n">cursor</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Chain</span>::<span class="n">Empty</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">links</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Chain</span>::<span class="n">Link</span><span class="p">(</span><span class="n">c</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="n">links</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="n">cursor</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">c</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So, the pro of using <code>Option</code> is that we get stronger error detection.
The con is that we have an ergonomic penalty.</p>
<h3 id="observation-most-collections-do-not-allocate-when-empty">Observation: most collections do not allocate when empty</h3>
<p>One important detail when mucking about with sentinels: creating an
empty collection is generally “free” in Rust, at least for the standard
library. This is important because I find that the fields I wish to move
from are often collections of some kind or another. Indeed, even in our
motivating example here, the <code>Chain::Empty</code> sentinel is an “empty”
collection of sorts – but if the field you wish to move were e.g. a
<code>Vec&lt;T&gt;</code> value, then you could as well use <code>Vec::new()</code> as a sentinel
without having to worry about wasteful memory allocations.</p>
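<p>As a small sketch of this point, the following uses an empty <code>Vec</code> as the sentinel. The <code>Queue</code> struct and its method names are hypothetical, invented for this example. It also shows <code>std::mem::take</code>, which swaps in <code>Default::default()</code> and so has the same effect for a <code>Vec</code>.</p>

```rust
use std::mem;

// Hypothetical struct for illustration; not from the post.
struct Queue {
    pending: Vec<String>,
}

impl Queue {
    // Moves all pending items out through `&mut self`, leaving an empty
    // vector behind as the sentinel.
    fn take_pending(&mut self) -> Vec<String> {
        // `Vec::new()` does not allocate, so the sentinel costs nothing.
        mem::replace(&mut self.pending, Vec::new())
    }

    // Same effect via `std::mem::take`, which swaps in `Default::default()`.
    fn take_pending_default(&mut self) -> Vec<String> {
        mem::take(&mut self.pending)
    }
}
```

<p>After either call, <code>self.pending</code> is an empty vector with zero capacity, so no allocation was made just to fill the hole.</p>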
<h3 id="an-alternative-to-sentinels-prevent-unwinding-through-abort">An alternative to sentinels: prevent unwinding through abort</h3>
<p>There is a crate called <a href="https://crates.io/crates/take_mut"><code>take_mut</code></a> on crates.io that offers a
convenient alternative to installing a sentinel, although it does not
apply in all scenarios. It also raises some interesting questions
about <a href="https://smallcultfollowing.com/babysteps/blog/2016/10/02/observational-equivalence-and-unsafe-code/">“unsafe composability”</a>
that worry me a bit, which I’ll
discuss at the end.</p>
<p>To use <code>take_mut</code> to solve this problem, we would rewrite our <code>add_link</code>
function as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">add_link</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">take_mut</span>::<span class="n">take</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">chain</span><span class="p">,</span><span class="w"> </span><span class="o">|</span><span class="n">chain</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">Chain</span>::<span class="n">with</span><span class="p">(</span><span class="n">chain</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <a href="https://docs.rs/take_mut/0.2.2/take_mut/fn.take.html"><code>take</code></a> function works like so: first, it uses unsafe code to
move the value from <code>self.chain</code>, leaving uninitialized memory in its
place. Then, it gives this value to the closure, which in this case
will execute <code>Chain::with</code> and return a new chain. This new chain is
then installed to fill the hole that was left behind.</p>
<p>Of course, this raises the question: what happens if the <code>Chain::with</code>
function panics? Since <code>take</code> has left a hole in the place of
<code>self.chain</code>, it is in a tough spot: the answer from the <code>take_mut</code>
library is that it will <em>abort the entire process</em>. That is, unlike with
a <code>panic</code>, there is no controlled shutdown. There is some precedent for
this: we do the same thing in the event of stack overflow, memory
exhaustion, and a “double panic” (that is, a panic that occurs when
unwinding another panic).</p>
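<p>To make the mechanism concrete, here is a rough sketch of how a function in this style could be written. This is not the actual source of the <code>take_mut</code> crate; the name <code>take_or_abort</code> is made up, and the sketch assumes that catching the unwind and then aborting is an acceptable way to guarantee that nobody observes the hole.</p>

```rust
use std::panic::{self, AssertUnwindSafe};
use std::process;
use std::ptr;

// Illustrative sketch only; not the actual source of the `take_mut` crate.
fn take_or_abort<T, F>(slot: &mut T, f: F)
where
    F: FnOnce(T) -> T,
{
    let p: *mut T = slot;
    unsafe {
        // Move the old value out, leaving a logically uninitialized hole in `*slot`.
        let old = ptr::read(p);
        // Run the closure; if it panics, abort so nobody can observe the hole.
        let new = panic::catch_unwind(AssertUnwindSafe(|| f(old)))
            .unwrap_or_else(|_| process::abort());
        // Fill the hole without dropping the stale bits left in it.
        ptr::write(p, new);
    }
}
```

<p>With this sketch, a call like <code>take_or_abort(&amp;mut self.chain, |chain| Chain::with(chain))</code> would mirror the <code>add_link</code> shown above. The abort is what keeps the unsafe code sound: unwinding past the hole would let destructors or other threads see memory whose value has already been moved away.</p>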
<p>The idea of aborting the process is that, unlike unwinding, we are
guaranteeing that there are no more possible observers for that hole
in memory. Interestingly, in writing this article, I realized that
<em>aborting the process does not compose with some other unsafe
abstractions you might want</em>. Imagine, for example, that you had
memory mapped a file on disk and were supplying an <code>&amp;mut</code> reference
into that file to safe code. Or, perhaps you were using shared memory
between two processes, and had some kind of locked object in there –
after locking, you might obtain an <code>&amp;mut</code> into the memory of that
object. Put another way, if the <code>take_mut</code> crate is safe, that means
that an <code>&amp;mut</code> can never point to memory not ultimately “owned” by the
current process. I am not sure if that’s a good decision for us to
make – though perhaps the real answer is that we need to permit unsafe
crates to be a bit more declarative about the conditions they require
from other crates, as I talk a bit about in this older blog post on
<a href="https://smallcultfollowing.com/babysteps/blog/2016/10/02/observational-equivalence-and-unsafe-code/">observational equivalence</a>.</p>
<h3 id="my-recommenation">My recommendation</h3>
<p>I would advise you to use some variant of the sentinel pattern. I
personally prefer to use a “signaling sentinel”<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> like <code>Option</code>
if it would be a bug for other code to read the field, unless the range
of code where the value is taken is very simple. So, in our <em>original</em>
example, where we just invoked <code>Chain::with</code>, I would not bother with an
<code>Option</code> – we can locally see that <code>self</code> does not escape. But in the
variant where we recursively invoke methods on <code>self</code>, I would, because
there it would be possible to recursively invoke <code>self.add_link</code> or
otherwise observe <code>self.chain</code> in this intermediate state.</p>
<p>It’s a <em>bit</em> annoying to use <code>Option</code> for this because it’s so explicit.
I’ve sometimes created a <code>Take&lt;T&gt;</code> type that wraps an <code>Option&lt;T&gt;</code> and
implements <code>DerefMut&lt;Target = T&gt;</code>, so it can transparently be used as a
<code>T</code> in most scenarios – but which will <code>panic</code> if you attempt to deref
the value while it is “taken”. This might be a nice library, if it
doesn’t exist already.</p>
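<p>A minimal sketch of such a wrapper might look like the following. The method names <code>new</code>, <code>take</code>, and <code>restore</code> are my own choices for illustration, not an existing library API.</p>

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical `Take<T>` wrapper; the names here are illustrative.
struct Take<T> {
    value: Option<T>,
}

impl<T> Take<T> {
    fn new(value: T) -> Self {
        Take { value: Some(value) }
    }

    // Moves the value out, leaving the wrapper in the "taken" state.
    fn take(&mut self) -> T {
        self.value.take().expect("value is already taken")
    }

    // Puts a value back, ending the "taken" state.
    fn restore(&mut self, value: T) {
        self.value = Some(value);
    }
}

impl<T> Deref for Take<T> {
    type Target = T;

    // Panics if the value is currently taken.
    fn deref(&self) -> &T {
        self.value.as_ref().expect("value is currently taken")
    }
}

impl<T> DerefMut for Take<T> {
    // Panics if the value is currently taken.
    fn deref_mut(&mut self) -> &mut T {
        self.value.as_mut().expect("value is currently taken")
    }
}
```

<p>With a field declared as <code>Take&lt;Chain&gt;</code>, ordinary reads and writes go through <code>Deref</code> with no <code>unwrap</code> at the use site, while any access between <code>take</code> and <code>restore</code> panics instead of silently seeing a dummy value. One caveat of this design is that the wrapper’s own methods shadow any same-named methods on <code>T</code>.</p>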
<p>One other thing to remember: instead of using a sentinel, you may be
able to avoid moving altogether, and sometimes that’s better. For
example, if you have an <code>&amp;mut Vec&lt;T&gt;</code> and you need ownership of the
<code>T</code> values within, you can use the <a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.drain"><code>drain</code></a> iterator method. The
only real difference between <a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.drain"><code>drain</code></a> and <a href="https://doc.rust-lang.org/std/iter/trait.IntoIterator.html#tymethod.into_iter"><code>into_iter</code></a> is that <a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.drain"><code>drain</code></a>
leaves an empty vector behind once iteration is complete.</p>
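<p>As a short sketch, here is a function that takes ownership of every element while holding only an <code>&amp;mut Vec&lt;String&gt;</code>. The helper name <code>shout_all</code> is hypothetical.</p>

```rust
// Hypothetical helper for illustration: takes ownership of every element
// while only holding `&mut Vec<String>`.
fn shout_all(items: &mut Vec<String>) -> Vec<String> {
    // `drain(..)` yields owned `String`s; `items` itself is never moved.
    items.drain(..).map(|s| s.to_uppercase()).collect()
}
```

<p>Once the call returns, the caller’s vector is empty but still valid, so no sentinel is needed.</p>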
<p>(Similarly, if you are writing an API and have the option of choosing
between writing a <code>fn(self) -&gt; Self</code> sort of signature vs <code>fn(&amp;mut self)</code>, you might adopt the latter, as it gives your callers more
flexibility. But this is a bit subtle; it would make a good topic for
the <a href="https://rust-lang-nursery.github.io/api-guidelines/">Rust API guidelines</a>, but I didn’t find it there.)</p>
<h3 id="discussion">Discussion</h3>
<p>If you’d like to discuss something in this post, there is a <a href="https://users.rust-lang.org/t/blog-post-series-after-nll-whats-next-for-borrowing-and-lifetimes/21864">dedicated
thread on the users.rust-lang.org site</a>.</p>
<h3 id="appendix-a-possible-future-directions">Appendix A. Possible future directions</h3>
<p>Besides creating a more ergonomic library to replace the use of <code>Option</code>
as a sentinel, I can think of a few plausible extensions to the language
that would alleviate this problem somewhat.</p>
<h4 id="tracking-holes">Tracking holes</h4>
<p>The most obvious change is that we could plausibly extend the borrow
checker to permit moves out of an <code>&amp;mut</code>, so long as the value is
guaranteed to be replaced <em>before the function returns or panics</em>. The
“or panics” bit is the tricky part, of course.</p>
<p>Without any other extensions to the language, we would have to consider
virtually every operation to “potentially panic”, which would be pretty
limiting. Our “motivating example” from this post, for example, would
fail the test, because the <code>Chain::with</code> function – like any function –
might potentially panic. The main thing this would do is allow functions
like <code>std::mem::replace</code> and <code>std::mem::swap</code> to be written in safe
code, as well as other more complex rotations. Handy, but not earth
shattering.</p>
<p>If we wanted to go beyond that, we would have to start looking into
effect type systems, which allow us to annotate functions with things
like “does not panic” and so forth. I am pretty nervous about taking
that particular “step up” in complexity – though there may be other use
cases (for example, to enable FFI interoperability with things that
longjmp, we might want ways for functions to declare whether they
panic and how anyway). But it feels like at best this will be a narrow
tool that we wouldn’t expect people to use broadly.</p>
<p>In order to avoid annotation, @eddyb has tossed around the idea of an
“auto trait”-style effect system. Basically, you would be able to state
that you want to take as argument a “closure that can never call the
function <code>X</code>” – in this case, that might mean “a closure that can never
invoke <code>panic!</code>”. The compiler would then do a conservative analysis of
the closure’s call graph to figure out if it works. This would then
permit a variant of the <code>take_mut</code> crate where we don’t have to worry
about aborting the process, because we know the closure never panics. Of
course, just like auto traits, this raises semver concerns – sure, your
function doesn’t panic <em>now</em>, but does that mean you promise never to
make it panic in the future?<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></p>
<h4 id="permissions-in-permissions-out">Permissions in, permissions out</h4>
<p>There is another possible answer as well. We might generalize Rust’s
borrowing system to express the idea of a “borrow that never ends” –
presently that’s not something we can express. The idea would be that a
function like <code>add_link</code> would take in an <code>&amp;mut</code> but somehow express
that, if a panic were to occur, the <code>&amp;mut</code> is fully invalidated.</p>
<p>I’m not particularly hopeful on this as a solution to this particular
problem. There is a lot of complexity to address and it just doesn’t
seem even close to worth it.</p>
<p>There are however some other cases where <em>similar</em> sorts of “permission
juggling” might be nice to express. For example, people sometimes want
the ability to have a variant on <code>insert</code> – basically a function that
inserts a <code>T</code> into a collection and then returns a shared reference <code>&amp;T</code>
to inserted data. The idea is that the caller can then go on to do other
“shared” operations on the map (e.g., other map lookups). So the
signature would look a little like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">SomeCollection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">insert_then_get</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">data</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This signature is of course valid in Rust today, but it has an
existing meaning that we can’t change. The meaning today is that the
function requires unique access to <code>self</code> – and that unique access has
to persist until we’ve finished using the return value. It’s precisely
this interpretation that makes methods like <a href="https://doc.rust-lang.org/std/sync/struct.Mutex.html#method.get_mut"><code>Mutex::get_mut</code></a> sound.</p>
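<p>To make the existing meaning concrete, here is a minimal sketch. The <code>SomeCollection</code> type below is a hypothetical stand-in backed by a <code>Vec</code> (the post does not specify one), and <code>len</code> is an assumed example of a “shared” operation. It shows that the returned reference keeps the unique borrow alive, so shared operations have to wait until that reference is no longer used:</p>

```rust
// Hypothetical collection for illustration; the post's `SomeCollection` is not specified.
struct SomeCollection<T> {
    items: Vec<T>,
}

impl<T> SomeCollection<T> {
    fn insert_then_get(&mut self, data: T) -> &T {
        self.items.push(data);
        // The returned `&T` carries the lifetime of the `&mut self` borrow.
        self.items.last().expect("just pushed an element")
    }

    // Assumed example of a "shared" operation on the collection.
    fn len(&self) -> usize {
        self.items.len()
    }
}

fn main() {
    let mut c = SomeCollection { items: vec![1, 2] };

    let inserted = c.insert_then_get(3);
    // Uncommenting the next line is rejected (E0502) because `inserted` is
    // still used afterwards, so the unique borrow of `c` is still live:
    // let n = c.len();
    assert_eq!(*inserted, 3);

    // Once `inserted` is no longer used, shared operations are allowed again.
    assert_eq!(c.len(), 3);
}
```

<p>Under today’s rules, then, the “insert and keep sharing” pattern cannot be written with this signature; it would need some other way to say which permissions come back out.</p>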
<p>If we were to move in this direction, we might look to languages like
<a href="http://gallium.inria.fr/~fpottier/publis/bpp-mezzo-journal.pdf">Mezzo</a> for inspiration, which encode this notion of “permissions in, permissions
out” more directly<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>. I’m definitely interested in
investigating this direction, particularly if we can use it to address
other proposed “reference types” like <code>&amp;out</code> (for taking references to
uninitialized memory which you must initialize), <code>&amp;move</code>, and so forth.
But this seems like a massive research effort, so it’s hard to predict
just what it would look like for Rust, and I don’t see us adopting this
sort of thing in the near to mid term.</p>
<h3 id="panic--abort-having-semantic-impact">Panic = Abort having semantic impact</h3>
<p>Shortly after I posted this, Gankro tweeted the following:</p>
<blockquote class="twitter-tweet" data-conversation="none" data-lang="en"><p lang="en" dir="ltr">[chanting in distance] <br/>Appendix A With Panic=Abort Having Semantic Impact</p>&mdash; Alexis Beingessner (@Gankro) <a href="https://twitter.com/Gankro/status/1061364298449108992?ref_src=twsrc%5Etfw">November 10, 2018</a></blockquote>
<p>I actually meant to talk about that, so I’m adding this quick section.
You may have noticed that panics and unwinding are a big thing in this
post. Unwinding, however, is only optional in Rust – many users choose
instead to convert panics into a hard abort of the entire process.
Presently, the type and borrow checkers do not consider this option in
any way, but you could imagine them taking it into account when deciding
whether a particular bit of code is safe, particularly in lieu of a more
fancy effect system.</p>
<p>I am not a big fan of this. For one thing, it seems like it would
encourage people to opt into “panic = abort” just to avoid a sentinel
value here and there, which would lead to more of a split in the
ecosystem. But also, as I noted when discussing the <code>take_mut</code> crate,
this whole approach presumes that an <code>&amp;mut</code> reference can only ever
refer to memory that is owned by the current process, and I’m not sure
that’s something we wish to state.</p>
<p>Still, food for thought.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I really like this Lego analogy. You’ll just have to bear with me.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p><code>std::mem::replace</code> is a super useful function in all kinds of scenarios; worth having in your toolbox.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>OK, maybe I’m taking this analogy too far. Sorry. I need help.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>I bet you were wondering what that <code>counter</code> field was for – gotta admire that Chekhov’s Gun action.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>i.e., some sort of sentinel where a panic occurs if the memory is observed&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>It occurs to me that we now have a corpus of crates at various versions. It would be interesting to see how common it is to make something panic that previously did not, as well as to make other sorts of changes.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Also related: <a href="https://pdfs.semanticscholar.org/f744/e6fe7b8d9f92205d3a407e0446369c5f02bd.pdf">fractional permissions</a> and a whole host of other things.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content></entry><entry><title type="html">Splash 2018 Mid-Week Report</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/11/08/splash-2018-mid-week-report/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/11/08/splash-2018-mid-week-report/</id><published>2018-11-08T00:00:00+00:00</published><updated>2018-11-08T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This week I&rsquo;ve been attending SPLASH 2018. It&rsquo;s already been quite an
interesting week, and it&rsquo;s only just begun. I thought I&rsquo;d write up a
quick report on some of the things that have been particularly
interesting to me, and some of the ideas that they&rsquo;ve sparked off.</p>
<h3 id="teaching-programming-and-rust">Teaching programming (and Rust!)</h3>
<p>I really enjoyed this talk by Felienne Hermans entitled <a href="https://2018.splashcon.org/event/splash-2018-keynotes-explicit-direct-instruction-in-programming-education">&ldquo;Explicit
Direct Instruction in Programming Education&rdquo;</a>. The basic gist of
the talk was that, when we teach programming, we often phrase it in
terms of &ldquo;exploration&rdquo; and &ldquo;self-expression&rdquo;, but that this winds up
leaving a lot of folks in the cold and may be at least <em>partly</em>
responsible for the lack of diversity in computer science today. She
argued that this is like telling kids that they should just be able to
play a guitar and create awesome songs without first practicing their
chords<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> &ndash; it kind of sets them up to fail.</p>
<p>The thing that really got me excited about this was that it seemed
very connected to mentoring and open source. If you watched the Rust
Conf keynote this year, you&rsquo;ll remember Aaron talking about &ldquo;OSS by
Serendipity&rdquo; &ndash; this idea that we should just expect people to come
and produce PRs. This is in contrast to the &ldquo;OSS by Design&rdquo; that we&rsquo;ve
been trying to practice and preach, where there are explicit in-roads
for people to get involved in the project through mentoring, as well
as explicit priorities and goals (created, of course, through open
processes like the roadmap and so forth). It seems to me that
things like working groups, intro bugs, quest issues, etc, are all
ways for people to &ldquo;practice the basics&rdquo; of a project before they dive
into creating major new features.</p>
<p>One other thing that Felienne talked about which I found exciting was
the idea that &ndash; in fields like reading &ndash; there are taxonomies of
common errors as well as diagnostic tools that one can use to figure
out where your student falls in this taxonomy. The idea is that you
can give a relatively simple quiz that will help you identify what
sorts of mental errors they are making, which you can then directly
target. (Later, I talked to someone &ndash; whose name unfortunately I do
not remember &ndash; doing similar research around how to categorize
mathematical errors which sounded quite cool.)</p>
<p>I feel like both the idea of &ldquo;practice&rdquo; but also of &ldquo;taxonomy of
errors&rdquo; applies to Rust quite well. Learning to use Rust definitely
involves a certain amount of &ldquo;drill&rdquo;, where one works with the
mechanics of the ownership/borrowing system until they start to feel
more natural. Moreover, I <em>suspect</em> that there are common &ldquo;stages of
understanding&rdquo; that we could try to quantify, and then directly target
with instructional material. To some extent we&rsquo;ve been doing this all
along, but it seems like something we could do more formally.</p>
<h3 id="borrow-checker-field-guide">Borrow checker field guide</h3>
<p>Yesterday I had a very nice conversation with Will Crichton and Anna
Zeng. They were presenting the results of work they have been doing to
identify barriers to adoption for Rust (<a href="https://2018.splashcon.org/event/plateau-2018-papers-identifying-barriers-to-adoption-for-rust-through-online-discourse">they have a paper you can
download here to learn more</a>). Specifically, they&rsquo;ve been
surveying comments, blog posts, and other things and looking for
patterns. I&rsquo;m pretty excited to dig deeper into their findings, and I think
that we should think about forming a working group or something else to continue
this line of work to help inform future directions.</p>
<p>Talking with them also helped to crystallize some of the thoughts I&rsquo;ve
been having with respect to this &ldquo;After NLL&rdquo; blog post series. What
I&rsquo;ve realized is that it is a bit tricky to figure out how to organize
the &ldquo;taxonomy of tricky situations&rdquo; that commonly arise with
ownership as well as their solutions. For example, in reading the
responses to my previous post about <em>interprocedural conflicts</em>, I
realized that this one <em>fundamental conflict</em> can manifest in a number
of ways &ndash; and also that there are a number of possible solutions,
depending on the specifics of your scenario.</p>
<p>I&rsquo;ve decided for the time being to just press on, writing out various
blog posts that highlight &ndash; in a somewhat sloppy way &ndash; the kinds of
errors I see cropping up, some of the solutions I see available for
them, and also the possible language extensions we might pursue in the
future.</p>
<p>However, I think that once this series is done, it would be nice to
pull this material together (along with other things) into a kind of
<em>Borrow Checker Field Guide</em>. The idea would be to distinguish:</p>
<ul>
<li><strong>Root causes</strong> &ndash; there are relatively few of these, but these are the
root aspects of the borrow checker that give rise to errors.</li>
<li><strong>Troublesome patterns</strong> &ndash; these are the designs that people are often
shooting for from another language which can cause trouble in Rust.
Examples might be &ldquo;tree with parent pointer&rdquo;, &ldquo;graph&rdquo;, etc.</li>
<li><strong>Solutions</strong> &ndash; these would be solutions and design patterns that work
today to resolve problems.</li>
<li><strong>Proposals</strong> &ndash; in some cases, there might be links to proposed designs.</li>
</ul>
<p>The idea is that for the <strong>root causes</strong> and <strong>troublesome patterns</strong>,
there would be links over to the solutions and proposals that can help
resolve them. I don&rsquo;t intend to include a lot of details about
proposals in particular in this document, but I&rsquo;d like to see a way to
drive people towards the &ldquo;work in progress&rdquo; as well, so they can give
their feedback or maybe even get involved.</p>
<h3 id="an-eve-retrospective">An Eve retrospective</h3>
<p>Chris Granger gave an <a href="https://2018.splashcon.org/event/live-2018-papers-keynote">amazing and heartfelt talk about the Eve effort</a>
to construct a more accessible model of programming. I had not
realized what a monumental effort it was. I had two main takeaways:
first, that the <em>crux</em> of programming is <em>modeling and
feedback</em>. Excel works for so many people because it gives a simple
model you can fit your data into and immediate feedback &ndash; but it&rsquo;s
inflexible and limited. If we can change that up or scale it, it could
be a massive enabler. Second, that the VC approach of trying to change
the world over the course of a few ridiculously high-pressure years
sounds very punishing and doesn&rsquo;t feel just or right. It&rsquo;d be
wonderful if Chris and co. could continue their efforts without such
personal sacrifice. (As an aside, Chris had a few nice things to say
about Rust, which were much appreciated.)</p>
<h3 id="incremental-datalog-and-rust-type-checking">Incremental datalog and Rust type checking</h3>
<p>Finally, I spent some time talking to Sebastian Erdweg and Tamás Szabó
about <a href="https://2018.splashcon.org/event/splash-2018-splash-i-better-living-through-incrementality-immediate-static-analysis-feedback-without-loss-of-precision">their work on incremental datalog</a>. They had a very cool
demo of their system at work where they implemented various analyses
&ndash; unreachable code and out-of-bounds index detection &ndash; and showed
how quickly they could update as the input source changed. Sebastian
also has a master&rsquo;s student (whose name I don&rsquo;t know &ndash; yet! but I
will find out) who implemented a prototype Rust type checker in their
system; I look forward to reading more about it.</p>
<p>Their system is much finer grained than anything we&rsquo;ve attempted to do
in rustc. It seems like we could easily port Polonius to that system
and see how well it works, though it seems like it would also make
sense to compare against Frank McSherry&rsquo;s amazing differential-datalog
system.</p>
<p>I&rsquo;ve been thinking for some time that the next frontier in terms of
improving rustc is to start formalizing and simplifying name
resolution and type-checking. Talking to them did get me more inspired
to see that work proceed, since it could well be the foundation for
super snappy IDE integration. (But first: got to see Polonius and
Chalk over the finish line!)</p>
<h3 id="logic">Logic</h3>
<p>Finally, I had a very long conversation with Will Byrd and Michael
Ballantyne about how <a href="https://github.com/rust-lang-nursery/chalk">Chalk</a> works. We discussed some details of how
MiniKanren&rsquo;s search algorithm works and also some details of how Chalk
lowering works. I won&rsquo;t try to summarize here, except to say that Will
gave me an exciting pointer to something called <a href="https://en.wikipedia.org/wiki/Default_logic">&ldquo;default logic&rdquo;</a>,
which I had never heard of, but which seems like a very good match for
specialization. I look forward to reading more about it.</p>
<h3 id="me-talking-about-rust">Me talking about Rust</h3>
<p>I am posting this in the morning on Thursday. Today I am going to give
a talk about Rust &ndash; I plan to focus on both some technical aspects
but also how Rust governance works and some of the &ldquo;ins and outs&rdquo; of
running an open source project. I&rsquo;m excited but a bit nervous, since
this is material that I&rsquo;ve never tried to present to a general
audience before. Let&rsquo;s see how it goes! (Tonight there is also a joint
Rust-Scala meetup, so that should be fun.)</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>My daughter insists she can do this&hellip; let&rsquo;s just say she&rsquo;s lucky she&rsquo;s so cute. =)&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/conference" term="conference" label="Conference"/><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">After NLL: Interprocedural conflicts</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/11/01/after-nll-interprocedural-conflicts/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/11/01/after-nll-interprocedural-conflicts/</id><published>2018-11-01T00:00:00+00:00</published><updated>2018-11-01T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In my previous post on the status of NLL, I promised to talk about
&ldquo;What is next?&rdquo; for ownership and borrowing in Rust. I want to lay out
the various limitations of Rust&rsquo;s ownership and borrowing system that
I see, as well as &ndash; where applicable &ndash; current workarounds. I&rsquo;m
curious to get feedback on which problems affect folks the most.</p>
<p>The first limitation I wanted to focus on is <strong>interprocedural
conflicts</strong>.  In fact, I&rsquo;ve covered a special case of this before &ndash;
where a closure conflicts with its creator function &ndash; in my post on
<a href="https://smallcultfollowing.com/babysteps/blog/2018/04/24/rust-pattern-precise-closure-capture-clauses/">Precise Closure Capture Clauses</a>. But the problem is more
general.</p>
<h2 id="the-problem">The problem</h2>
<p>Oftentimes, it happens that we have a big struct that contains a
number of fields, not all of which are used by all the
methods. Consider a struct like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">sync</span>::<span class="n">mpsc</span>::<span class="n">Sender</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">widgets</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">MyWidget</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">counter</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">listener</span>: <span class="nc">Sender</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyWidget</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Perhaps we have a method <code>signal_event</code> which increments the counter each
time some sort of event occurs. It also fires off a message to some
listener to let them know.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">signal_event</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">listener</span><span class="p">.</span><span class="n">send</span><span class="p">(()).</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem arises when we try to invoke this method while we are
simultaneously using some of the other fields of <code>MyStruct</code>.  Suppose
we are &ldquo;checking&rdquo; our widgets, and this process might generate the
events we are counting; that might look like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">check_widgets</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">widget</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">widgets</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">if</span><span class="w"> </span><span class="n">widget</span><span class="p">.</span><span class="n">check</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">signal_event</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">  
</span></span></span></code></pre></div><p>Unfortunately, <a href="https://play.rust-lang.org/?version=beta&amp;mode=debug&amp;edition=2018&amp;gist=ef84f42cd30a8c110e6d5ce4eceac5df">this code is going to yield a compilation error</a>.
The error I get presently is:</p>
<pre tabindex="0"><code>error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable
  --&gt; src/main.rs:26:17
     |
  24 |         for widget in &amp;self.widgets {
     |                       -------------
     |                       |
     |                       immutable borrow occurs here
     |                       immutable borrow used here, in later iteration of loop
  25 |             if widget.check() {
  26 |                 self.signal_event();
     |                 ^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here
</code></pre><p>What this message is trying to tell you<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> is that:</p>
<ul>
<li>During the loop, you are holding a borrow of <code>self.widgets</code>.</li>
<li>You are then giving away access to <code>self</code> in order to call <code>signal_event</code>.
<ul>
<li>The danger here is that <code>signal_event</code> may mutate <code>self.widgets</code>,
which you are currently iterating over.</li>
</ul>
</li>
</ul>
<p>Now, you and I know that <code>signal_event</code> is not going to touch the
<code>self.widgets</code> field, so there should be no problem here. But the
compiler doesn&rsquo;t know that, because it only examines one function at a
time.</p>
<h3 id="inlining-as-a-possible-fix">Inlining as a possible fix</h3>
<p>The <em>simplest</em> way to fix this problem is to modify <code>check_widgets</code>
to inline the body of <code>signal_event</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">check_widgets</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">widget</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">widgets</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">if</span><span class="w"> </span><span class="n">widget</span><span class="p">.</span><span class="n">check</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Inline `self.signal_event()`:
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">listener</span><span class="p">.</span><span class="n">send</span><span class="p">(()).</span><span class="n">unwrap</span><span class="p">();</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">  
</span></span></span></code></pre></div><p>Now the compiler can clearly see that distinct fields of <code>self</code> are
being used, so everything is hunky dory. Of course, now we&rsquo;ve created
a &ldquo;DRY&rdquo;-failure &ndash; we have two bits of code that know how to signal an
event, and they could easily fall out of sync.</p>
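<p>For anyone who wants to try the inlined version locally, here is a self-contained sketch. The <code>ready</code> field and the body of <code>check</code> are assumptions made for illustration, since the post leaves <code>MyWidget</code> unspecified; the rest mirrors the code above.</p>

```rust
use std::sync::mpsc::{channel, Sender};

// Hypothetical widget: the post elides the contents of `MyWidget`.
struct MyWidget {
    ready: bool,
}

impl MyWidget {
    // Assumed signature: a read-only check that says whether an event should fire.
    fn check(&self) -> bool {
        self.ready
    }
}

struct MyStruct {
    widgets: Vec<MyWidget>,
    counter: usize,
    listener: Sender<()>,
}

impl MyStruct {
    fn check_widgets(&mut self) {
        for widget in &self.widgets {
            if widget.check() {
                // Inlined body of `signal_event`: only `counter` and `listener`
                // are touched, so the shared borrow of `self.widgets` stays valid.
                self.counter += 1;
                self.listener.send(()).unwrap();
            }
        }
    }
}

fn main() {
    let (tx, rx) = channel();
    let mut s = MyStruct {
        widgets: vec![
            MyWidget { ready: true },
            MyWidget { ready: false },
            MyWidget { ready: true },
        ],
        counter: 0,
        listener: tx,
    };

    s.check_widgets();

    // Two widgets reported an event, so two signals were counted and sent.
    assert_eq!(s.counter, 2);
    assert_eq!(rx.try_iter().count(), 2);
}
```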
<h3 id="factoring-as-a-possible-fix">Factoring as a possible fix</h3>
<p>One way to address the DRY failure is to factor our types better.
For example, perhaps we can extract an <code>EventSignal</code> type and
move the <code>signal_event</code> method there:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">EventSignal</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">counter</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">listener</span>: <span class="nc">Sender</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">EventSignal</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">signal_event</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">listener</span><span class="p">.</span><span class="n">send</span><span class="p">(()).</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we can modify the <code>MyStruct</code> type to embed an <code>EventSignal</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">widgets</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">MyWidget</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">signal</span>: <span class="nc">EventSignal</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Finally, instead of writing <code>self.signal_event()</code>, we will write <code>self.signal.signal_event()</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyStruct</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">check_widgets</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">widget</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">widgets</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">if</span><span class="w"> </span><span class="n">widget</span><span class="p">.</span><span class="n">update</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">signal</span><span class="p">.</span><span class="n">signal_event</span><span class="p">();</span><span class="w"> </span><span class="c1">// &lt;-- Changed
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">  
</span></span></span></code></pre></div><p><a href="https://play.rust-lang.org/?version=beta&amp;mode=debug&amp;edition=2018&amp;gist=6512551bf58cd66917895a588f3643dc">This code compiles fine</a>, since the compiler now sees access to
two distinct fields: <code>widgets</code> and <code>signal</code>. Moreover, we can invoke
<code>self.signal.signal_event()</code> from as many places as we want without
duplication.</p>
<p>Truth be told, factoring sometimes makes for cleaner code: e.g., in
this case, there was a kind of &ldquo;mini type&rdquo; hiding within <code>MyStruct</code>,
and it&rsquo;s nice that we can extract it. But definitely not always. It
can be more verbose, and I sometimes find that it makes things more
opaque, simply because there are now just more structs running around
that I have to look at.  Some things are so simple that the complexity
of having a struct outweighs the win of isolating a distinct bit of
functionality.</p>
<p>The other problem with factoring is that it doesn&rsquo;t always work:
sometimes we have methods that each use a specific set of fields, but
those fields don&rsquo;t factor <em>nicely</em>. For example, if we return to our
original <code>MyStruct</code> (where everything was inlined), perhaps we might
have a method that used both <code>self.counter</code> and <code>self.widgets</code> but not
<code>self.listener</code> &ndash; the factoring we did can&rsquo;t help us identify a
function that uses <code>counter</code> but not <code>listener</code>.</p>
<h3 id="free-variables-as-a-general-but-extreme-solution">Free functions as a general, but extreme solution</h3>
<p>One very general way to sidestep our problem is to move things out of
method form and into a &ldquo;free function&rdquo;. The idea is that instead of
<code>&amp;mut self</code>, you will take a separate <code>&amp;mut</code> parameter for each field
that you use. So <code>signal_event</code> might look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">signal_event</span><span class="p">(</span><span class="n">counter</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="n">listener</span>: <span class="kp">&amp;</span><span class="nc">Sender</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">*</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">listener</span><span class="p">.</span><span class="n">send</span><span class="p">(()).</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Then we would replace <code>self.signal_event()</code> with:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">signal_event</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">listener</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>Obviously, this is a significant ergonomic regression. However, it is
very effective at exposing the set of fields that will be accessed to
our caller.</p>
<p>Moving to a free function also gives us some extra flexibility. You
may have noted, for example, that the <code>signal_event</code> function takes a
<code>&amp;Sender&lt;()&gt;</code> and not a <code>&amp;mut Sender&lt;()&gt;</code>. This is because <a href="https://doc.rust-lang.org/std/sync/mpsc/struct.Sender.html#method.send">the <code>send</code>
method on <code>Sender</code> only requires <code>&amp;self</code></a>, so a shared borrow is
all we need. This means that we could invoke <code>signal_event</code> in some
location where we needed another shared borrow of <code>self.listener</code>
(perhaps another method or function).</p>
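<p>To make the free-function workaround concrete, here is a minimal, self-contained sketch. The <code>MyWidget</code> body and the <code>run_demo</code> helper are assumptions added purely for illustration; only <code>signal_event</code> and the field names come from the text above.</p>

```rust
use std::sync::mpsc::{channel, Sender};

// Hypothetical widget: `update` just reports a stored flag.
struct MyWidget {
    ready: bool,
}

impl MyWidget {
    fn update(&mut self) -> bool {
        self.ready
    }
}

struct MyStruct {
    widgets: Vec<MyWidget>,
    counter: usize,
    listener: Sender<()>,
}

// Free function: the signature names exactly the fields it touches,
// and how (one mutable borrow, one shared borrow).
fn signal_event(counter: &mut usize, listener: &Sender<()>) {
    *counter += 1;
    listener.send(()).unwrap();
}

impl MyStruct {
    fn check_widgets(&mut self) {
        // `self.widgets` is mutably borrowed for the loop, but the call
        // below only borrows `self.counter` and `self.listener`, which
        // are disjoint fields, so the borrow checker accepts it.
        for widget in &mut self.widgets {
            if widget.update() {
                signal_event(&mut self.counter, &self.listener);
            }
        }
    }
}

// Hypothetical driver: returns (final counter, number of messages sent).
fn run_demo(flags: &[bool]) -> (usize, usize) {
    let (tx, rx) = channel();
    let mut s = MyStruct {
        widgets: flags.iter().map(|&ready| MyWidget { ready }).collect(),
        counter: 0,
        listener: tx,
    };
    s.check_widgets();
    let received = rx.try_iter().count();
    (s.counter, received)
}
```

<p>Calling <code>run_demo(&amp;[true, false, true])</code> should signal twice, once per widget that reports a change.</p>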
<h3 id="view-structs-as-a-general-but-extreme-solution">View structs as a general, but extreme solution</h3>
<p>I find moving to a free function to be ok in a pinch, but it&rsquo;s pretty
annoying if you have a lot of fields, or if the method you are
converting calls other methods (in which case you need to identify the
transitive set of fields). There is another technique I have used from
time to time, though it&rsquo;s fairly heavyweight. The idea is to define a
&ldquo;view struct&rdquo; which has all the same fields as the original, but it
uses references to identify if those fields are used in a &ldquo;shared&rdquo;
(immutable) or &ldquo;mutable&rdquo; way.</p>
<p>For example, we might define <code>CheckWidgetsView</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">CheckWidgetsView</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">widgets</span>: <span class="kp">&amp;</span><span class="na">&#39;me</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">MyWidget</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">counter</span>: <span class="kp">&amp;</span><span class="na">&#39;me</span> <span class="nc">mut</span><span class="w"> </span><span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">listener</span>: <span class="kp">&amp;</span><span class="na">&#39;me</span> <span class="nc">mut</span><span class="w"> </span><span class="n">Sender</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we can define methods on the view without a problem:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w"> </span><span class="n">CheckWidgetsView</span><span class="o">&lt;</span><span class="na">&#39;me</span><span class="o">&gt;</span><span class="w">  </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">signal_event</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">listener</span><span class="p">.</span><span class="n">send</span><span class="p">(()).</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">check_widgets</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">widget</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">widgets</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">if</span><span class="w"> </span><span class="n">widget</span><span class="p">.</span><span class="n">check</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">signal_event</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">  
</span></span></span></code></pre></div><p>You might wonder why this solved the problem. After all, the <code>check_widgets</code>
method here basically looks the same &ndash; the compiler still sees two overlapping
borrows:</p>
<ul>
<li>a shared borrow of <code>self.widgets</code>, in the for loop</li>
<li>a mutable borrow of <code>self</code>, when invoking <code>signal_event</code></li>
</ul>
<p>The difference here lies in the type of <code>self.widgets</code>: because it is
a <code>&amp;Vec&lt;MyWidget&gt;</code>, we already know that the vector we are iterating
over cannot change &ndash; that is, we are not giving away mutable access
to the vector itself, just to a <em>reference to the vector</em>. So
there is nothing that <code>signal_event</code> could do to mess up our
iteration.</p>
<p>(Note that if we needed to mutate the widgets as we iterated, this
&ldquo;view struct&rdquo; trick would not work here, and we&rsquo;d be back where we
started &ndash; or rather, we&rsquo;d need a new view struct just for
<code>signal_event</code>.)</p>
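<p>As a rough sketch of that last variation, here is what a view struct just for <code>signal_event</code> might look like when the loop does need to mutate the widgets. The names <code>SignalView</code> and <code>run_view_demo</code>, and the body of <code>MyWidget::update</code>, are hypothetical and chosen only for this example.</p>

```rust
use std::sync::mpsc::{channel, Sender};

// Hypothetical widget whose `update` mutates its own state.
struct MyWidget {
    pending: u32,
}

impl MyWidget {
    fn update(&mut self) -> bool {
        if self.pending > 0 {
            self.pending -= 1;
            true
        } else {
            false
        }
    }
}

struct MyStruct {
    widgets: Vec<MyWidget>,
    counter: usize,
    listener: Sender<()>,
}

// A view over just the fields that `signal_event` needs.
struct SignalView<'me> {
    counter: &'me mut usize,
    listener: &'me Sender<()>,
}

impl<'me> SignalView<'me> {
    fn signal_event(&mut self) {
        *self.counter += 1;
        self.listener.send(()).unwrap();
    }
}

impl MyStruct {
    fn check_widgets(&mut self) {
        // Build the view from two fields; `self.widgets` stays free
        // to be mutably borrowed by the loop.
        let mut signal = SignalView {
            counter: &mut self.counter,
            listener: &self.listener,
        };
        for widget in &mut self.widgets {
            if widget.update() {
                signal.signal_event();
            }
        }
    }
}

// Hypothetical driver: returns the counter and each widget's remaining work.
fn run_view_demo(pending: &[u32]) -> (usize, Vec<u32>) {
    // `_rx` keeps the receiving end alive so `send` does not fail.
    let (tx, _rx) = channel::<()>();
    let mut s = MyStruct {
        widgets: pending.iter().map(|&pending| MyWidget { pending }).collect(),
        counter: 0,
        listener: tx,
    };
    s.check_widgets();
    let remaining = s.widgets.iter().map(|w| w.pending).collect();
    (s.counter, remaining)
}
```

<p>The cost is one more struct to maintain, which is the &ldquo;heavyweight&rdquo; part of this technique.</p>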
<p>One nice thing about view structs is that we can have more than one,
and we can change the set of fields that each part refers to. So, for
example, one sometimes has &ldquo;double buffering&rdquo;-like algorithms that use
one field for input and one field for output, but which field is used
alternates depending on the phase (and which perhaps use other fields
in a shared capacity). Using view struct(s) can handle this quite
elegantly.</p>
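<p>Here is one way the double-buffering idea could be sketched. Everything in it (<code>Grid</code>, <code>PhaseView</code>, the <code>scale</code> field, and <code>run_phases</code>) is a made-up illustration rather than code from a real project: the view decides per phase which buffer is the input and which is the output, while a third field is only read.</p>

```rust
// Hypothetical struct with two buffers and one read-only setting.
struct Grid {
    front: Vec<i32>,
    back: Vec<i32>,
    scale: i32,
}

// The view names roles (input/output) rather than concrete fields.
struct PhaseView<'me> {
    input: &'me [i32],
    output: &'me mut Vec<i32>,
    scale: &'me i32,
}

impl<'me> PhaseView<'me> {
    // Recompute `output` from `input`; `scale` is used in a shared capacity.
    fn step(&mut self) {
        let scale = *self.scale;
        self.output.clear();
        self.output.extend(self.input.iter().map(|x| x * scale));
    }
}

impl Grid {
    // Even phases read `front` and write `back`; odd phases swap the roles.
    fn view(&mut self, phase: usize) -> PhaseView<'_> {
        if phase % 2 == 0 {
            PhaseView { input: &self.front, output: &mut self.back, scale: &self.scale }
        } else {
            PhaseView { input: &self.back, output: &mut self.front, scale: &self.scale }
        }
    }
}

// Hypothetical driver: run `phases` steps and return the last-written buffer.
fn run_phases(start: Vec<i32>, scale: i32, phases: usize) -> Vec<i32> {
    let mut grid = Grid {
        back: Vec::with_capacity(start.len()),
        front: start,
        scale,
    };
    for phase in 0..phases {
        grid.view(phase).step();
    }
    if phases % 2 == 0 { grid.front } else { grid.back }
}
```

<p>Because the roles live in the view, <code>step</code> never needs to know which concrete field it is writing to.</p>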
<h3 id="relation-to-closures">Relation to closures</h3>
<p>As I mentioned, one common place where this problem arises is actually
with closures. This occurs because closures always capture entire
local variables; so if a closure only uses some particular field of a
local, it can create an unnecessary conflict. For example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">check_widgets</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Make a closure that uses `self.counter`
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// and `self.listener`; but it will actually
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// capture all of `self`.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">signal_event</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">counter</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">listener</span><span class="p">.</span><span class="n">send</span><span class="p">(()).</span><span class="n">unwrap</span><span class="p">();</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">for</span><span class="w"> </span><span class="n">widget</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">widgets</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">widget</span><span class="p">.</span><span class="n">check</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">signal_event</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Even though it&rsquo;s an instance of the same general problem, it&rsquo;s worth
calling out specially, because it can be solved in different ways. In
fact, we&rsquo;ve accepted <a href="https://github.com/rust-lang/rfcs/pull/2229">RFC #2229</a>, which proposes to change the closure
desugaring. In this case, the closure would only capture
<code>self.counter</code> and <code>self.listener</code>, avoiding the problem.</p>
<h3 id="extending-the-language-to-solve-this-problem">Extending the language to solve this problem</h3>
<p>There has been discussion on and off about how to solve this problem.
Clearly, there is a need to permit methods to expose information about
which fields they access and how they access those fields, but it&rsquo;s not
clear what the best way to do this is. There are a number of tradeoffs at play:</p>
<ul>
<li>Adding more concepts to the surface language.</li>
<li>Core complexity; this probably involves extending the base borrow checker rules.</li>
<li>Annotation burden.</li>
<li>Semver considerations (see below).</li>
</ul>
<p>There is some discussion of the view idea in <a href="https://internals.rust-lang.org/t/having-mutability-in-several-views-of-a-struct/6882/2">this internals
thread</a>;
I&rsquo;ve also tinkered with the idea of merging views and traits, as
<a href="https://internals.rust-lang.org/t/fields-in-traits/6933/12">described in this internals
post</a>. I&rsquo;ve
also toyed with the idea of trying to infer some of this information
for private functions (or perhaps even crate-private functions), but I
think it&rsquo;d be best to start with some form of explicit syntax.</p>
<p><strong>Semver considerations.</strong> One of the things you&rsquo;ll notice about all
of the solutions to the problem is that they are all ways of exposing
information to the compiler about which fields will be used in
<code>signal_event</code> or (in the case of view structs) how they will be
used. This has semver implications: imagine you have a public function
<code>fn get(&amp;self) -&gt; &amp;Foo</code> that returns a reference to something in
<code>self</code>. If we now permit your clients to invoke other methods while
that borrow is live (because we know somehow that they won&rsquo;t
interfere), that is a semver commitment. The current version, where
your struct is considered an atomic unit, gives you maximal freedom to
change your implementation in the future, because it is maximally
conservative with respect to what your clients can do.</p>
<h3 id="conclusion">Conclusion</h3>
<p>The general problem here, I think, is being able to identify which
fields are used by a method (or set of methods) and how. I&rsquo;ve shown a
number of workarounds you can use today. I&rsquo;m interested to hear,
however, how often this problem affects you, and which (if any) of the
workarounds might have helped you. (As noted, I would break out
closures into their own subcategory of this problem, and one for which
we will hopefully have a solution sooner.)</p>
<p>To discuss this, I <a href="https://users.rust-lang.org/t/blog-post-series-after-nll-whats-next-for-borrowing-and-lifetimes/21864">have opened a thread on
<code>users.rust-lang.org</code></a>. Once
the &ldquo;What&rsquo;s next?&rdquo; series is done, I will also open a survey to gather
more quantitative feedback.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>This message is actually a bit confusing.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">MIR-based borrowck is almost here</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/10/31/mir-based-borrowck-is-almost-here/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/10/31/mir-based-borrowck-is-almost-here/</id><published>2018-10-31T00:00:00+00:00</published><updated>2018-10-31T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Now that <a href="https://blog.rust-lang.org/2018/10/30/help-test-rust-2018.html">the final Rust 2018 Release Candidate has
shipped</a>, I thought it would be a good idea to do another
update on the state of the MIR-based borrow check (aka NLL). The <a href="https://smallcultfollowing.com/babysteps/blog/2018/06/15/mir-based-borrow-check-nll-status-update/">last
update</a> was in June, when we were still hard at work on getting
things to work.</p>
<h2 id="rust-2018-will-use-nll-now">Rust 2018 will use NLL now</h2>
<p>Let&rsquo;s get the highlights out of the way. Most importantly, <strong>Rust 2018
crates will use NLL by default</strong>. Once the Rust 2018 release candidate
becomes stable, <strong>we plan to switch Rust 2015 crates to use NLL as
well</strong>, but we&rsquo;re holding off until we have some more experience with
people using it in the wild.</p>
<h2 id="nll-is-awesome">NLL is awesome</h2>
<p>I&rsquo;ve been using NLL in practice for quite some time now, and I can&rsquo;t
imagine going back. Recently I&rsquo;ve been working in my spare time on
<a href="https://github.com/salsa-rs/salsa">the salsa crate</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, which uses Rust 2018, and I&rsquo;ve really
noticed how NLL makes a lot of &ldquo;complex&rdquo; borrowing interactions work
out quite smoothly. These are all instances of the <a href="https://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/">problem cases #1
and #2</a> I highlighted way back when<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, but they interact in
interesting ways I did not fully anticipate.</p>
<p>Let me give you a hypothetical example. Imagine I am writing some bit
of code that routes messages, which look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">Message</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Letter</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">recipient</span>: <span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">data</span>: <span class="nb">String</span> <span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ... maybe other cases here ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When I receive a letter, I want to inspect its recipient. If that matches my name,
I will process the data using <code>process</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="n">data</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>but otherwise I&rsquo;ll forward it along to the next person in the
chain. Using NLL, I can write this code like so (<a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2018&amp;gist=b8dfafd14113f2933c1b5127c861df44">playground</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">router</span><span class="p">(</span><span class="n">me</span>: <span class="kp">&amp;</span><span class="kt">str</span><span class="p">,</span><span class="w"> </span><span class="n">rx</span>: <span class="nc">Receiver</span><span class="o">&lt;</span><span class="n">Message</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">tx</span>: <span class="nc">Sender</span><span class="o">&lt;</span><span class="n">Message</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">for</span><span class="w"> </span><span class="n">message</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">rx</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="o">&amp;</span><span class="n">message</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">Message</span>::<span class="n">Letter</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">recipient</span><span class="p">,</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">recipient</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="n">me</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="n">tx</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">message</span><span class="p">).</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="n">process</span><span class="p">(</span><span class="n">data</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c1">// ... maybe other cases here ...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What&rsquo;s interesting about this code is how uninteresting it is &ndash; it
basically just does what you expect, and didn&rsquo;t require any special
action to please the borrow checker<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. But the borrowing
patterns are actually sort of complex: the borrow starts as we enter the match
(<code>match &amp;message</code>) and continues into the match arm. In the <code>else</code>
branch of the <code>if</code>, the borrow is still in use (in the form of the
<code>data</code> variable), but in the <code>if</code> branch itself, it is not (and hence we can
call <code>tx.send(message)</code> and move the message). Before NLL, this would
have required some significant contortions to achieve (<a href="https://play.rust-lang.org/?version=nightly&amp;mode=debug&amp;edition=2015&amp;gist=ee86bacf163aab324692f0297fc05eee">try it
yourself if you
like</a>
&ndash; that&rsquo;s a link to the same code, but with Rust 2015 edition set).</p>
<h3 id="diagnostics-migration-and-performance">Diagnostics, migration, and performance</h3>
<p>We&rsquo;ve also put a lot of effort into NLL diagnostics and I think that
by and large they are even better than those of the old borrow checker (which
were already quite good). This is particularly true for the &rsquo;lifetime
error messages&rsquo;.  Unfortunately, you won&rsquo;t see <em>all</em> of those
improvements yet on Rust 2018 &ndash; the reason has to do with
<strong>migration</strong>.</p>
<p>What is this migration you ask? Well, it&rsquo;s our way of dealing with the
fact that the new MIR-based borrow checker has fixed a ton of
soundness bugs from the old checker. Unfortunately, in practice, that
means that some existing code will not compile anymore (because it
never should have compiled in the first place!). To give people time
to make that transition, we are running the NLL code in &ldquo;migration
mode&rdquo;, which means that if you have code that used to compile, but no
longer does, we issue <strong>warnings</strong> instead of <strong>errors</strong>. This
migration mode will eventually change to issue <strong>hard errors</strong> instead
(probably in a few releases, but that depends a bit on what we find in
the wild).</p>
<p>One downside of migration mode is that it requires keeping around the
older code. In some cases, this older code can produce errors that
wind up masking the newer, nicer errors that are produced by the
MIR-based checker. The good news is that, once we finish the migration,
the errors will just get better.</p>
<p><a name="what-next"></a></p>
<p>Finally, those of you who read the previous posts may remember that
compilation times when using the NLL checker were a big stumbling
block. I&rsquo;m happy to report that the performance issues were largely
addressed: there remains some slight overhead to using NLL, but it is
largely not noticeable in practice, and I expect we&rsquo;ll continue to
improve it over time.</p>
<h3 id="what-next">What next?</h3>
<p>So, now that NLL is shipping, what is next for ownership and borrowing
in Rust? That&rsquo;s a big question, and it has a few different answers,
depending on the &ldquo;scale&rdquo; of time we are looking at. The <strong>immediate
answer</strong> is that we&rsquo;ve still got some bugs to nail down (small ones)
and of course we expect that once more people start banging on the new
code, they&rsquo;ll encounter new problems that have to be fixed. In
addition, we&rsquo;ve got to put some energy into writing up documentation
for how the new checker works and similar things (we wound up
deviating from the RFC analysis in various ways, and it&rsquo;d be nice to
document those).</p>
<p>In the <strong>medium term</strong>, the plan is to push more on the <a href="https://github.com/rust-lang-nursery/polonius/">Polonius</a>
formulation of NLL that <a href="https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/">I described here</a>. In addition
to offering a crisp formalization of our analysis, Polonius promises
to fix the <a href="https://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/#problem-case-3-conditional-control-flow-across-functions">Problem Case #3</a> that I identified in the
original NLL introduction, along with some other cases where the
current analysis falls short.</p>
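<p>To make Problem Case #3 a bit more concrete, here is a minimal sketch of its usual shape. The function name <code>get_default</code> and the concrete <code>HashMap&lt;u32, String&gt;</code> type are chosen just for illustration. The &ldquo;natural&rdquo; version, shown in the comment, is rejected by the current analysis because the borrow returned on the <code>Some</code> path is treated as live on the <code>None</code> path too; the body below is one common workaround, at the cost of an extra lookup:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::collections::HashMap;

// The version we would like to write, which the current checker rejects:
//
//     match map.get_mut(&amp;key) {
//         Some(value) =&gt; value,
//         None =&gt; {
//             map.insert(key, String::new()); // error: `*map` is still borrowed
//             map.get_mut(&amp;key).unwrap()
//         }
//     }
//
// Workaround: test for the key first, so that no borrow spans the insert.
fn get_default(map: &amp;mut HashMap&lt;u32, String&gt;, key: u32) -&gt; &amp;mut String {
    if !map.contains_key(&amp;key) {
        map.insert(key, String::new());
    }
    map.get_mut(&amp;key).unwrap()
}
</code></pre></div>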
<p>In the <strong>longer term</strong>, well, that&rsquo;s an open question, and one where I
would like to hear from you, dear reader. Over the next week or so, I
am planning to write up a series of blog posts. Each will describe
what I consider to be a common &ldquo;tricky scenario&rdquo; where people hit
problems with the borrow checker, and which NLL does not solve.
I&rsquo;ll also describe the current fixes required. Then I hope to do a
survey, trying to get a picture of which of these challenges cause the
most problems for folks, so that we can try to decide how to
prioritize future improvements to Rust.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Did you see how smoothly I worked in that plug for <a href="https://github.com/salsa-rs/salsa">salsa</a>? I&rsquo;ll write a post about it soon, I promise.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Note that the current NLL implementation does not solve Problem Case #3. See <a href="#what-next">the &ldquo;What Next?&rdquo; section</a> for more.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Interestingly, I remember an example almost exactly like this being shown to me by a Servo intern &ndash; I forget which one &ndash; many years ago. At the time, it didn&rsquo;t seem like a big deal to do the workarounds, but I realize now I was wrong about that. Ah well.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">October Office Hour Slots</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/09/27/october-office-hour-slots/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/09/27/october-office-hour-slots/</id><published>2018-09-27T00:00:00+00:00</published><updated>2018-09-27T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Just a quick note that <a href="https://github.com/nikomatsakis/office-hours/blob/master/2018/10.md">the October 2018 office hour slots</a> are
now posted. If you&rsquo;re having a problem with Rust, or have something
you&rsquo;d like to talk out, please sign up!</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/officehours" term="officehours" label="OfficeHours"/></entry><entry><title type="html">Office Hours #1: Cyclic services</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/09/24/office-hours-1-cyclic-services/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/09/24/office-hours-1-cyclic-services/</id><published>2018-09-24T00:00:00+00:00</published><updated>2018-09-24T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This is a report on the second <a href="https://github.com/nikomatsakis/office-hours">&ldquo;office hours&rdquo;</a>, in which we
discussed how to set up a series of services or actors that communicate
with one another. This is a classic kind of problem in Rust: how to
deal with cyclic data. Usually, the answer is that the cycle is not
necessary (as in this case).</p>
<h3 id="the-setup">The setup</h3>
<p>To start, let&rsquo;s imagine that we were working in a GC&rsquo;d language, like
JavaScript. We want to have various &ldquo;services&rdquo;, each represented by an
object. These services may need to communicate with one another, so we
also create a <strong>directory</strong>, which stores pointers to all the
services. As each service is created, it adds itself to the
directory; when it&rsquo;s all set up, each service can access all other
services. The setup might look something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-js" data-lang="js"><span class="line"><span class="cl"><span class="kd">function</span> <span class="nx">setup</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="kd">var</span> <span class="nx">directory</span> <span class="o">=</span> <span class="p">{};</span>
</span></span><span class="line"><span class="cl">  <span class="kd">var</span> <span class="nx">service1</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Service1</span><span class="p">(</span><span class="nx">directory</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">  <span class="kd">var</span> <span class="nx">service2</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Service2</span><span class="p">(</span><span class="nx">directory</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">  <span class="k">return</span> <span class="nx">directory</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">function</span> <span class="nx">Service1</span><span class="p">(</span><span class="nx">directory</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="k">this</span><span class="p">.</span><span class="nx">directory</span> <span class="o">=</span> <span class="nx">directory</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="nx">directory</span><span class="p">.</span><span class="nx">service1</span> <span class="o">=</span> <span class="k">this</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="kd">function</span> <span class="nx">Service2</span><span class="p">(</span><span class="nx">directory</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="k">this</span><span class="p">.</span><span class="nx">directory</span> <span class="o">=</span> <span class="nx">directory</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="nx">directory</span><span class="p">.</span><span class="nx">service2</span> <span class="o">=</span> <span class="k">this</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">  <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><h3 id="transliterating-the-setup-to-rust-directly">&ldquo;Transliterating&rdquo; the setup to Rust directly</h3>
<p>If you try to translate this to Rust, you will run into a big mess.
For one thing, Rust really prefers for you to have all the pieces of
your data structure ready when you create it, but in this case when we
make the directory, the services don&rsquo;t exist. So we&rsquo;d have to make the
struct use <code>Option</code>, sort of like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Directory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">service1</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Service1</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">service2</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">Service2</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is annoying though because, once the directory is initialized, these
fields will never be <code>None</code>.</p>
<p>And of course there is a deeper problem: who is the &ldquo;owner&rdquo; in this
cyclic setup? How are we going to manage the memory? With a GC, there
is no firm answer to this question: the entire cycle will be collected
at the end, but until then each service keeps every other service
alive.</p>
<p>You <em>could</em> set up something with <a href="https://doc.rust-lang.org/std/sync/struct.Arc.html"><code>Arc</code></a> (atomic reference counting)
in Rust that has a similar flavor. For example, the directory might
have an <a href="https://doc.rust-lang.org/std/sync/struct.Arc.html"><code>Arc</code></a> to each service and the services might have weak refs
back to the directory. But <a href="https://doc.rust-lang.org/std/sync/struct.Arc.html"><code>Arc</code></a> really works best when the data is
immutable, and we want services to have state. We could solve <em>that</em>
with <a href="https://doc.rust-lang.org/std/sync/atomic/struct.AtomicU32.html">atomics</a> and/or <a href="https://doc.rust-lang.org/std/sync/struct.RwLock.html">locks</a>, but at this point we might want to step
back and see if there is a better way. Turns out, there is!</p>
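<p>For the curious, here is a rough sketch of what that <code>Arc</code>-based variant could look like, with a lock around the service state. The field names and the <code>requests</code> counter are made up for illustration, and <code>Arc::new_cyclic</code> (available in newer versions of the standard library) is just one convenient way to wire up the weak back-reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::sync::{Arc, Mutex, Weak};

struct Directory {
    // The directory owns the service; the lock lets us mutate its state.
    service1: Arc&lt;Mutex&lt;Service1&gt;&gt;,
}

struct Service1 {
    // Weak back-reference, so that the cycle does not keep itself alive.
    directory: Weak&lt;Directory&gt;,
    requests: u32, // hypothetical piece of service state
}

fn setup() -&gt; Arc&lt;Directory&gt; {
    // `new_cyclic` hands us a `Weak` to the directory while it is being built.
    Arc::new_cyclic(|weak_directory| Directory {
        service1: Arc::new(Mutex::new(Service1 {
            directory: weak_directory.clone(),
            requests: 0,
        })),
    })
}
</code></pre></div>
<p>Every access to the service state now goes through <code>lock()</code>, and every trip back to the directory goes through <code>upgrade()</code>, which gives a hint of how noisy this approach gets.</p>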
<h3 id="translating-the-setup-to-rust-without-cycles">Translating the setup to Rust without cycles</h3>
<p>Our base assumption was that each service in the system needed access
to one another, since they will be communicating. But is that really
true? These services are actually going to be running on different
threads: all they really need to be able to do is to <strong>send each other
messages</strong>. In particular, they don&rsquo;t need access to the private bits
of state that belong to each service.</p>
<p>In other words, we could rework our directory so that &ndash; instead of
having a handle to each <strong>service</strong> &ndash; it only has a handle to a
<strong>mailbox</strong> for each service. It might look something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[derive(Clone)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Directory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">service1</span>: <span class="nc">Sender</span><span class="o">&lt;</span><span class="n">Message1</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">service2</span>: <span class="nc">Sender</span><span class="o">&lt;</span><span class="n">Message2</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Whatever kind of message service1 expects.
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Message1</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="sd">/// Whatever kind of message service2 expects.
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Message2</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What is this <a href="https://doc.rust-lang.org/std/sync/mpsc/struct.Sender.html"><code>Sender</code></a> type? It is part of the channels that ship in
Rust&rsquo;s standard library. The idea of a channel is that when you create
it, you get back two &ldquo;entangled&rdquo; values: a <a href="https://doc.rust-lang.org/std/sync/mpsc/struct.Sender.html"><code>Sender</code></a> and a <a href="https://doc.rust-lang.org/std/sync/mpsc/struct.Receiver.html"><code>Receiver</code></a>. You
send values on the sender and then you read them from the receiver;
moreover, the sender can be cloned many times (the receiver cannot).</p>
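<p>As a quick illustration, here is a small sketch using the standard library channel (the function name <code>collect_from_two_senders</code> is invented for this example). Two threads each get their own handle to the sender, and the single receiver sees both values; iteration ends once every sender has been dropped:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::sync::mpsc::channel;
use std::thread;

fn collect_from_two_senders() -&gt; Vec&lt;u32&gt; {
    let (sender, receiver) = channel();

    // Cloning the sender gives a second handle to the same receiver.
    let sender2 = sender.clone();

    // Each thread takes ownership of one sender and drops it when done.
    let t1 = thread::spawn(move || sender.send(1).unwrap());
    let t2 = thread::spawn(move || sender2.send(2).unwrap());
    t1.join().unwrap();
    t2.join().unwrap();

    // All senders are gone, so this iteration terminates.
    let mut values: Vec&lt;u32&gt; = receiver.iter().collect();
    values.sort(); // arrival order depends on thread scheduling
    values
}
</code></pre></div>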
<p>The idea here is that, when you start your actor, you create a channel
to communicate with it. The actor takes the <a href="https://doc.rust-lang.org/std/sync/mpsc/struct.Receiver.html"><code>Receiver</code></a> and the
<a href="https://doc.rust-lang.org/std/sync/mpsc/struct.Sender.html"><code>Sender</code></a> goes into the directory for other services to use.</p>
<p>Using channels, we can refactor our setup. We begin by making the
channels for each actor. Then we create the directory, once we have
all the pieces it needs. Finally, we can start the actors themselves:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">make_directory</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">sync</span>::<span class="n">mpsc</span>::<span class="n">channel</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Create the channels
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">sender1</span><span class="p">,</span><span class="w"> </span><span class="n">receiver1</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">sender2</span><span class="p">,</span><span class="w"> </span><span class="n">receiver2</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">channel</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Create the directory
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">directory</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Directory</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">service1</span>: <span class="nc">sender1</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">service2</span>: <span class="nc">sender2</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Start the actors
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">start_service1</span><span class="p">(</span><span class="o">&amp;</span><span class="n">directory</span><span class="p">,</span><span class="w"> </span><span class="n">receiver1</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">start_service2</span><span class="p">(</span><span class="o">&amp;</span><span class="n">directory</span><span class="p">,</span><span class="w"> </span><span class="n">receiver2</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Starting a service looks kind of like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">start_service1</span><span class="p">(</span><span class="n">directory</span>: <span class="kp">&amp;</span><span class="nc">Directory</span><span class="p">,</span><span class="w"> </span><span class="n">receiver</span>: <span class="nc">Receiver</span><span class="o">&lt;</span><span class="n">Message1</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Get a handle to the directory for ourselves.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Note that cloning a sender just produces a second handle
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// to the same receiver.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">directory</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">directory</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">std</span>::<span class="n">thread</span>::<span class="n">spawn</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// For each message received on `receiver`...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">message</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">receiver</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c1">// ... process the message. Along the way,
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c1">// we might send a message to another service:
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">match</span><span class="w"> </span><span class="n">directory</span><span class="p">.</span><span class="n">service2</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="n">Message2</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">})</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Ok</span><span class="p">(())</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="cm">/* message successfully sent */</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Err</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="cm">/* service2 thread has crashed or otherwise stopped */</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This example also shows off how Rust channels know when their
counterparts are valid (they use ref-counting internally to manage
this). So, for example, we can iterate over a <code>Receiver</code> to get every
incoming message: once all senders are gone, we will stop
iterating. Beware, though: in this case, the directory itself holds one of
the senders, so we need some sort of explicit message to stop the actor.</p>
<p>Similarly, when you send a message on a Rust channel, it knows if the
receiver has gone away. If so, <code>send</code> will return an <code>Err</code> value, so
you can recover (e.g., maybe by restarting the service).</p>
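<p>Here is a small sketch that puts both points together. The <code>CounterMessage</code> enum, with its explicit <code>Shutdown</code> variant, and the <code>start_counter</code> and <code>demo</code> functions are hypothetical names, not part of the example above. The actor stops when it is told to, even though a sender is still alive; after that, the receiver is gone and <code>send</code> reports an error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

enum CounterMessage {
    Add(u32),
    Shutdown, // explicit "please stop" message
}

fn start_counter() -&gt; (Sender&lt;CounterMessage&gt;, JoinHandle&lt;u32&gt;) {
    let (sender, receiver) = channel();
    let handle = thread::spawn(move || {
        let mut total = 0;
        for message in receiver {
            match message {
                CounterMessage::Add(n) =&gt; total += n,
                // Leaving the loop drops the receiver.
                CounterMessage::Shutdown =&gt; break,
            }
        }
        total
    });
    (sender, handle)
}

fn demo() -&gt; (u32, bool) {
    let (sender, handle) = start_counter();
    sender.send(CounterMessage::Add(2)).unwrap();
    sender.send(CounterMessage::Add(3)).unwrap();
    sender.send(CounterMessage::Shutdown).unwrap();
    let total = handle.join().unwrap();
    // The actor has exited, so its receiver is gone and `send` fails.
    let send_failed = sender.send(CounterMessage::Add(1)).is_err();
    (total, send_failed)
}
</code></pre></div>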
<h3 id="implementing-our-own-very-simple-channels">Implementing our own (very simple) channels</h3>
<p>Maybe it&rsquo;s interesting to peer &ldquo;beneath the hood&rdquo; a bit into channels.
It also gives some insight into how to generalize what we just did
into a pattern. Let&rsquo;s implement a <strong>very</strong> simple channel, one with a fixed
length of 1 and without all the error recovery business of counting
channels and so forth.</p>
<p>Note: If you&rsquo;d like to just view the code, <a href="https://play.rust-lang.org/?gist=9fc3d90b50e8af1470a0d488fb3993b9&amp;version=stable&amp;mode=debug&amp;edition=2015">click here to view the
complete example on the Rust playground</a>.</p>
<p>To start with, we need to create our <code>Sender</code> and <code>Receiver</code> types.
We see that each of them holds onto a <code>shared</code> value, which contains
the actual state (guarded by a mutex):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">sync</span>::<span class="p">{</span><span class="n">Arc</span><span class="p">,</span><span class="w"> </span><span class="n">Condvar</span><span class="p">,</span><span class="w"> </span><span class="n">Mutex</span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Sender</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">shared</span>: <span class="nc">Arc</span><span class="o">&lt;</span><span class="n">SharedState</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Receiver</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">shared</span>: <span class="nc">Arc</span><span class="o">&lt;</span><span class="n">SharedState</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Hidden shared state, not exposed
</span></span></span><span class="line"><span class="cl"><span class="c1">// to end-users
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">SharedState</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">value</span>: <span class="nc">Mutex</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">condvar</span>: <span class="nc">Condvar</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>To create a channel, we make the shared state, and then give the
sender and receiver access to it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">channel</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="n">Sender</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">Receiver</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">shared</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Arc</span>::<span class="n">new</span><span class="p">(</span><span class="n">SharedState</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="nc">Mutex</span>::<span class="n">new</span><span class="p">(</span><span class="nb">None</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">condvar</span>: <span class="nc">Condvar</span>::<span class="n">new</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">sender</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Sender</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">shared</span>: <span class="nc">shared</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">receiver</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Receiver</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">shared</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">(</span><span class="n">sender</span><span class="p">,</span><span class="w"> </span><span class="n">receiver</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Next, we can implement <code>send</code> on the <code>Sender</code>. It will try to
store the value into the mutex, blocking so long as the mutex already holds <code>Some</code> (that is, until the receiver has taken the previous value):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Sender</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">send</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">shared_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">shared</span><span class="p">.</span><span class="n">value</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">if</span><span class="w"> </span><span class="n">shared_value</span><span class="p">.</span><span class="n">is_none</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">*</span><span class="n">shared_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">shared</span><span class="p">.</span><span class="n">condvar</span><span class="p">.</span><span class="n">notify_all</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c1">// wait until the receiver reads
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">shared_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">shared</span><span class="p">.</span><span class="n">condvar</span><span class="p">.</span><span class="n">wait</span><span class="p">(</span><span class="n">shared_value</span><span class="p">).</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Finally, we can implement <code>receive</code> on the <code>Receiver</code>. This just waits
until the <code>shared.value</code> field is <code>Some</code>, in which case it overwrites
it with <code>None</code> and returns the inner value:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Send</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Receiver</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">receive</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">shared_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">shared</span><span class="p">.</span><span class="n">value</span><span class="p">.</span><span class="n">lock</span><span class="p">().</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">shared_value</span><span class="p">.</span><span class="n">take</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">shared</span><span class="p">.</span><span class="n">condvar</span><span class="p">.</span><span class="n">notify_all</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="n">value</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c1">// wait until the sender sends
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">shared_value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">shared</span><span class="p">.</span><span class="n">condvar</span><span class="p">.</span><span class="n">wait</span><span class="p">(</span><span class="n">shared_value</span><span class="p">).</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Again, <a href="https://play.rust-lang.org/?gist=9fc3d90b50e8af1470a0d488fb3993b9&amp;version=stable&amp;mode=debug&amp;edition=2015">here is a link to the complete example on the Rust playground</a>.</p>
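<p>To see the pieces working together, here is a self-contained sketch of the same one-slot channel. The struct definitions are inferred from how the snippets above use them, so treat them as assumptions. As a variation, the explicit <code>loop</code> is replaced with the standard library&rsquo;s <code>Condvar::wait_while</code>, and <code>round_trip</code> is a hypothetical helper added only to exercise the channel from a second thread.</p>

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

// Assumed definitions, inferred from how the snippets above use them.
struct SharedState<T> {
    value: Mutex<Option<T>>,
    condvar: Condvar,
}

pub struct Sender<T> {
    shared: Arc<SharedState<T>>,
}

pub struct Receiver<T> {
    shared: Arc<SharedState<T>>,
}

pub fn channel<T: Send>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(SharedState {
        value: Mutex::new(None),
        condvar: Condvar::new(),
    });
    let sender = Sender { shared: shared.clone() };
    let receiver = Receiver { shared };
    (sender, receiver)
}

impl<T: Send> Sender<T> {
    pub fn send(&self, value: T) {
        let guard = self.shared.value.lock().unwrap();
        // Block while the slot is still occupied by an unread value.
        let mut slot = self
            .shared
            .condvar
            .wait_while(guard, |slot| slot.is_some())
            .unwrap();
        *slot = Some(value);
        self.shared.condvar.notify_all();
    }
}

impl<T: Send> Receiver<T> {
    pub fn receive(&self) -> T {
        let guard = self.shared.value.lock().unwrap();
        // Block while the slot is empty.
        let mut slot = self
            .shared
            .condvar
            .wait_while(guard, |slot| slot.is_none())
            .unwrap();
        let value = slot.take().expect("slot is Some once wait_while returns");
        self.shared.condvar.notify_all();
        value
    }
}

// Hypothetical helper: sends `values` from a second thread and collects
// them on the calling thread, in order.
pub fn round_trip(values: Vec<u32>) -> Vec<u32> {
    let (sender, receiver) = channel();
    let count = values.len();
    let producer = thread::spawn(move || {
        for v in values {
            sender.send(v);
        }
    });
    let received = (0..count).map(|_| receiver.receive()).collect();
    producer.join().unwrap();
    received
}

fn main() {
    println!("{:?}", round_trip(vec![1, 2, 3]));
}
```

<p>Because the slot holds at most one value, each <code>send</code> after the first has to wait for a matching <code>receive</code>, which is why the values come out in the order they went in.</p>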
<h3 id="dynamic-set-of-services">Dynamic set of services</h3>
<p>In our example thus far we used a static <code>Directory</code> struct with
fields. We might like to change to a more flexible setup, in which the
set of services grows and/or changes dynamically. To do that, I would
expect us to replace the directory with a <code>HashMap</code> mapping from some kind
of service name to a <code>Sender</code> for that service. We might even want to
put that directory behind a mutex, so that if one service panics, we
can replace the <code>Sender</code> with a new one. But at that point we&rsquo;re
building up an entire actor infrastructure, and that&rsquo;s too much for
one post, so I&rsquo;ll stop here. =)</p>
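<p>For the curious, a minimal sketch of that dynamic directory might look like the following. It assumes the standard library&rsquo;s <code>std::sync::mpsc</code> channel rather than the hand-rolled one, and the names <code>Message</code>, <code>Directory</code>, <code>register</code>, and <code>send</code> are invented for illustration; this is one possible shape, not a full actor framework.</p>

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};

// Hypothetical message type, just for this sketch.
#[derive(Debug, PartialEq)]
pub enum Message {
    Ping(u32),
}

// A cloneable handle to a shared, mutable map from service name to sender.
#[derive(Clone, Default)]
pub struct Directory {
    services: Arc<Mutex<HashMap<String, Sender<Message>>>>,
}

impl Directory {
    /// Registers (or replaces) the service called `name` and returns the
    /// receiving end that the service should read from.
    pub fn register(&self, name: &str) -> Receiver<Message> {
        let (tx, rx) = channel();
        self.services.lock().unwrap().insert(name.to_string(), tx);
        rx
    }

    /// Sends `msg` to the named service. Returns `false` if the name is
    /// unknown or the service's receiver has gone away.
    pub fn send(&self, name: &str, msg: Message) -> bool {
        // Clone the sender so the lock is released before we send.
        let sender = self.services.lock().unwrap().get(name).cloned();
        match sender {
            Some(tx) => tx.send(msg).is_ok(),
            None => false,
        }
    }
}

fn main() {
    let directory = Directory::default();
    let rx = directory.register("auth");
    directory.send("auth", Message::Ping(1));
    println!("{:?}", rx.recv().unwrap());

    // Simulate a crashed service: registering again swaps in a fresh sender.
    drop(rx);
    let rx = directory.register("auth");
    directory.send("auth", Message::Ping(2));
    println!("{:?}", rx.recv().unwrap());
}
```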
<h3 id="generalizing-the-pattern">Generalizing the pattern</h3>
<p>So what was the general lesson here? It often happens that, when
writing in a GC&rsquo;d language, we get accustomed to lumping all
kinds of data together, and then knowing what data we should and
should not touch. In our original JS example, all the services had a
pointer to the complete state of one another &ndash; but we expected them
to just leave messages and not to mutate the internal variables of
other services. Rust is not so trusting.</p>
<p>In Rust, it often pays to separate out the &ldquo;one big struct&rdquo; into
smaller pieces. In this case, we separated out the &ldquo;message
processing&rdquo; part of a service from the rest of the service state. Note
that when we implemented this message processing &ndash; e.g., our channel
impl &ndash; we still had to use some caution. We had to guard the data
with a lock, for example. But because we&rsquo;ve separated the rest of the
service&rsquo;s state out, we don&rsquo;t need to use locks for that, because no
other service can reach it.</p>
<p>This case had the added complication of a cycle and the associated
memory management headaches. It&rsquo;s worth pointing out that even in our
actor implementation, the cycle hasn&rsquo;t gone away. It&rsquo;s just reduced in
scope. Each service has a reference to the directory, and the
directory has a reference to the <code>Sender</code> for each service. As an example
of where you can see this, if you have your service iterate over all
the messages from its receiver (as we did):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">for</span><span class="w"> </span><span class="n">msg</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">receiver</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This loop will continue until all of the senders associated with this
<code>Receiver</code> go away. But the service itself has a reference to the
directory, and that directory contains a <code>Sender</code> for this receiver,
so this loop will never terminate &ndash; unless we explicitly
<code>break</code>. This isn&rsquo;t too big a surprise: Actor lifetimes tend to
require &ldquo;active management&rdquo;. Similar problems arise in GC systems when
you have big cycles of objects, as they can easily create leaks.</p>
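<p>The termination rule is easy to check in isolation. The sketch below assumes <code>std::sync::mpsc</code>; the function name <code>drain_after_drop</code> and the variable <code>directory_copy</code> are made up to mirror the situation described above. The <code>for</code> loop only finishes because every <code>Sender</code>, including the clone standing in for the directory&rsquo;s copy, is dropped first.</p>

```rust
use std::sync::mpsc::channel;

// Hypothetical helper: returns the messages seen before the loop ends.
pub fn drain_after_drop() -> Vec<u32> {
    let (sender, receiver) = channel();
    // Stand-in for the `Sender` that the directory would hold.
    let directory_copy = sender.clone();

    sender.send(1).unwrap();
    directory_copy.send(2).unwrap();

    // If either of these drops were removed, the loop below would block
    // forever waiting for more messages.
    drop(sender);
    drop(directory_copy);

    let mut seen = Vec::new();
    for msg in receiver {
        seen.push(msg);
    }
    seen
}

fn main() {
    println!("{:?}", drain_after_drop());
}
```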
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/officehours" term="officehours" label="OfficeHours"/></entry><entry><title type="html">Office Hours #0: Debugging with GDB</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/09/21/office-hours-0-debugging-with-gdb/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/09/21/office-hours-0-debugging-with-gdb/</id><published>2018-09-21T00:00:00+00:00</published><updated>2018-09-21T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This is a report on the first <a href="https://github.com/nikomatsakis/office-hours">&ldquo;office hours&rdquo;</a>, in which we
discussed debugging Rust programs with gdb. I&rsquo;m very grateful to
Ramana Venkata for suggesting the topic, and to Tom Tromey, who joined
in. (Tom has been doing a lot of the work of integrating rustc into
gdb and lldb lately.)</p>
<p>This blog post is just going to be a quick summary of the basic
workflow of using Rust with gdb on the command line. I&rsquo;m assuming you
are using Linux here, since I think otherwise you would prefer a
different debugger. There are probably also nifty graphical tools you
can use and maybe even IDE integrations; I&rsquo;m not sure.</p>
<h3 id="the-setting">The setting</h3>
<p>We specifically wanted to debug some test failures in a cargo project
(<a href="https://github.com/vramana/esprit">esprit</a>).  When running <code>cargo test</code>, some of the tests would panic,
and we wanted to track down why. This particular crate is also nightly
only.</p>
<h3 id="how-to-launch-gdb">How to launch gdb</h3>
<p>The first step is to find the executable that runs the tests. This can be
done by running <code>cargo test -v</code> and looking in the output for the
final <code>Running</code> line. In this particular project (<a href="https://github.com/vramana/esprit">esprit</a>), we needed
to use nightly, so the command was something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">&gt; cargo +nightly <span class="nb">test</span> -v
</span></span><span class="line"><span class="cl">...
</span></span><span class="line"><span class="cl">     Running <span class="sb">`</span>/home/espirit/target/debug/deps/prettier_rs-7c95ceaface142a9<span class="sb">`</span>
</span></span></code></pre></div><p>Then one can invoke gdb with that executable. Note also that you need to be running
a version of gdb that is somewhat recent in order to get good Rust
support (ideally in the 8.x series). You can test your version of gdb
by running <code>gdb -v</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">&gt; gdb -v
</span></span><span class="line"><span class="cl">GNU gdb <span class="o">(</span>GDB<span class="o">)</span> Fedora 8.1-15.fc28
</span></span><span class="line"><span class="cl">...
</span></span></code></pre></div><p>To run gdb, it is recommended that you use the <code>rust-gdb</code> wrapper,
which adds some Rust-specific pretty printers and other
configuration. This is installed by rustup, and hence it respects the
<code>+nightly</code> flag. In this case, we want to invoke it with the test
executable.  We are also going to set the environment variable
<code>RUST_TEST_THREADS</code> to <code>1</code>; this prevents the test runner from using
multiple threads, since that complicates the process of stepping
through the binary:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">&gt; <span class="nv">RUST_TEST_THREADS</span><span class="o">=</span><span class="m">1</span> rust-gdb target/debug/deps/prettier_rs-7c95ceaface142a9
</span></span></code></pre></div><h3 id="once-you-are-in-gdb">Once you are in gdb</h3>
<p>Once you are in gdb, you can run the program by typing <code>run</code> (or just
<code>r</code>). But in this case it will just run, find the test failure, and
then exit, which isn&rsquo;t exactly what we wanted: we wanted execution to
stop when the <code>panic!</code> occurs and let us inspect what&rsquo;s going on. To
do that, you will need to set a <strong>breakpoint</strong>. In this case, we want
to set it on the special function <code>rust_panic</code>, which is defined in
libstd for this exact purpose. We can do that with the <code>break</code>
command, as shown below. After setting the break, <em>then</em> we can run:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">&gt; <span class="nb">break</span> rust_panic
</span></span><span class="line"><span class="cl">Breakpoint <span class="m">1</span> at 0x55555564e273: file libstd/panicking.rs, line 525.
</span></span><span class="line"><span class="cl">&gt; run
</span></span></code></pre></div><p>Now when the panic occurs, we will trigger the breakpoint, and gdb
gives us back control. At this point, you can use the <code>bt</code> command to
get a backtrace, and the <code>up</code> command to move up and inspect the
callers&rsquo; state. You may also enjoy the <a href="https://sourceware.org/gdb/onlinedocs/gdb/TUI.html">&ldquo;TUI mode&rdquo;</a>. Anyway, I&rsquo;m
not really going to try to teach GDB here; I&rsquo;m sure there are much
better tutorials available.</p>
<p>One thing I did not know: gdb even supports the ability to use a
limited subset of Rust expressions from within the debugger, so you
can do things like <code>p foo.0</code> to access the first field of a tuple. You
can even call functions and methods, but not through traits.</p>
<h3 id="final-note-use-rr">Final note: use rr</h3>
<p>Another option that is worth emphasizing is that you can use the <a href="https://rr-project.org/"><code>rr</code>
tool</a> to get <strong>reversible debugging</strong>. <code>rr</code> basically extends gdb
but allows you to not only step and move <strong>forward</strong> through your
program, but also <strong>backward</strong>. So &ndash; for example &ndash; after we break on
<code>rust_panic</code>, we could execute backwards and see what happened that
led us there. Using <code>rr</code> is pretty straightforward and is <a href="https://github.com/mozilla/rr/wiki/Usage">explained
here</a>.  (There is also <a href="https://huonw.github.io/blog/2015/10/rreverse-debugging/">Huon&rsquo;s old blog post</a>, which
still seems fairly accurate.)  I could not, however, figure out how to
use <code>rust-gdb</code> with <code>rr replay</code>, but even just plain old gdb works ok
&ndash; I filed <a href="https://github.com/rust-lang/rust/issues/54433">#54433</a> about using <code>rust-gdb</code> and <code>rr replay</code>, so maybe
the answer is in there.</p>
<h3 id="ideas-for-the-future">Ideas for the future</h3>
<p>gdb support works pretty well. There were some rough edges we
encountered:</p>
<ul>
<li>Dumping hashmaps and btree-maps doesn&rsquo;t give useful output. It just shows their
internal representation, which you don&rsquo;t care about.</li>
<li>It&rsquo;d be nice to be able to do <code>cargo test --gdb</code> (or, even better,
<code>cargo test --rr</code>) and have it handle all the details of getting you
into the debugger.</li>
</ul>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/officehours" term="officehours" label="OfficeHours"/></entry><entry><title type="html">Rust office hours</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/09/12/rust-office-hours/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/09/12/rust-office-hours/</id><published>2018-09-12T00:00:00+00:00</published><updated>2018-09-12T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello, all! Beginning this Friday (in two days)<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, I&rsquo;m going
to start an experiment that I call <strong>Rust office hours</strong>. The idea is
simple: I&rsquo;ve set aside a few slots per week to help people work
through problems they are having learning or using Rust. My goal here
is both to be of service but also to gain more insight into the kinds
of things people have trouble with. <strong>No problem is too big or too
small!</strong><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>To start, I&rsquo;m running this through my <a href="https://github.com/nikomatsakis/office-hours"><code>office-hours</code> GitHub
repository</a>.  All you have to do to sign up for a slot is to open
a pull request adding your name; I will try to resolve things on a
first come, first served basis.</p>
<p>I&rsquo;m starting small: I&rsquo;ve reserved two 30 minute slots per week for the
rest of September. One of those slots is reserved for beginner folks,
the other is for anybody. If this is a success, I&rsquo;ll extend to October
and beyond, and possibly add more slots.</p>
<p>So please, come check out <a href="https://github.com/nikomatsakis/office-hours">the <code>office-hours</code> repository</a>!</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Uh, I meant to post this blog post earlier. But I forgot.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>OK, some problems may be too big. I&rsquo;m not <em>that</em> clever, and it&rsquo;s only a 30 minute slot.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/officehours" term="officehours" label="OfficeHours"/></entry><entry><title type="html">Rust pattern: Iterating an over a Rc&lt;Vec&lt;T>></title><link href="https://smallcultfollowing.com/babysteps/blog/2018/09/02/rust-pattern-iterating-an-over-a-rc-vec-t/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/09/02/rust-pattern-iterating-an-over-a-rc-vec-t/</id><published>2018-09-02T00:00:00+00:00</published><updated>2018-09-02T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This post examines a particular, seemingly simple problem: given
ownership of a <code>Rc&lt;Vec&lt;u32&gt;&gt;</code>, can we write a function that returns an
<code>impl Iterator&lt;Item = u32&gt;</code>? It turns out that this is a bit harder
than it might at first appear &ndash; and, as we&rsquo;ll see, for good
reason. I&rsquo;ll dig into what&rsquo;s going on, how you can fix it, and how we
might extend the language in the future to try and get past this
challenge.</p>
<h3 id="the-goal">The goal</h3>
<p>To set the scene, let&rsquo;s take a look at a rather artificial function
signature. For whatever reason, this function has to take
ownership of an <code>Rc&lt;Vec&lt;u32&gt;&gt;</code> and it wants to return an <code>impl Iterator&lt;Item = u32&gt;</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> that iterates over that vector.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">iterate</span><span class="p">(</span><span class="n">data</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// what we want to write!
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>(This post was inspired by a problem we hit in the NLL working group.
The details of that problem were different &ndash; for example, the vector
in question was not given as an argument but instead cloned from another
location &ndash; but this post uses a simplified example so as to focus on
interesting questions and not get lost in other details.)</p>
<h3 id="first-draft">First draft</h3>
<p>The first thing to notice is that our function takes ownership of a
<code>Rc&lt;Vec&lt;u32&gt;&gt;</code> &ndash; that is, a reference counted<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> vector of
integers. Presumably, this vector is reference counted because it is
shared amongst many places.</p>
<p><strong>The fact that we have ownership of a <code>Rc&lt;Vec&lt;u32&gt;&gt;</code> is precisely
what makes our problem challenging.</strong> If the function were taking a
<code>Vec&lt;u32&gt;</code>, it would be rather trivial to write: we could invoke
<a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.into_iter"><code>data.into_iter()</code></a> and be done with it (<a href="https://play.rust-lang.org/?gist=e5474c80b2f7fa290917b1bf3f522c30&amp;version=stable&amp;mode=debug&amp;edition=2015">try it on
play</a>).</p>
<p>Alternatively, if the function took a borrowed vector of type
<code>&amp;Vec&lt;u32&gt;</code>, there would still be an easy solution. In that case, we
couldn&rsquo;t use <code>into_iter</code>, because that requires ownership of the
vector. But we could write <code>data.iter().cloned()</code> &ndash;
<a href="https://doc.rust-lang.org/std/primitive.slice.html#method.iter"><code>data.iter()</code></a> gives us back references (<code>&amp;u32</code>) and <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.cloned">the
<code>cloned()</code> adapter</a> then &ldquo;clones&rdquo; them to give us back a <code>u32</code>
(<a href="https://play.rust-lang.org/?gist=e5474c80b2f7fa290917b1bf3f522c30&amp;version=stable&amp;mode=debug&amp;edition=2015">try it on play</a>).</p>
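<p>For reference, the two easy cases can be sketched side by side. The names <code>iterate_owned</code> and <code>iterate_borrowed</code> are invented for this illustration, and the <code>+ '_</code> bound on the borrowed version is an assumption about how one would spell the signature so that it compiles across recent editions.</p>

```rust
// Owned vector: `into_iter` consumes the vector and yields each `u32`.
pub fn iterate_owned(data: Vec<u32>) -> impl Iterator<Item = u32> {
    data.into_iter()
}

// Borrowed vector: `iter` yields `&u32`, and `cloned` copies each one out.
// The returned iterator borrows from `data`, hence the `'_` bound.
pub fn iterate_borrowed(data: &Vec<u32>) -> impl Iterator<Item = u32> + '_ {
    data.iter().cloned()
}

fn main() {
    let v = vec![1, 2, 3];
    let doubled: Vec<u32> = iterate_borrowed(&v).map(|x| x * 2).collect();
    let total: u32 = iterate_owned(v).sum();
    println!("{:?} {}", doubled, total);
}
```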
<p>But we have a <code>Rc&lt;Vec&lt;u32&gt;&gt;</code>, so what can we do? We can&rsquo;t invoke
<a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.into_iter"><code>into_iter</code></a>, since that requires <strong>complete</strong> ownership
of the vector, and we only have <strong>partial</strong> ownership (we share this
same vector with whoever else has an <code>Rc</code> handle). So let&rsquo;s try using
<code>.iter().cloned()</code>, like we did with the shared reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// First draft
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">iterate</span><span class="p">(</span><span class="n">data</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">iter</span><span class="p">().</span><span class="n">cloned</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If you <a href="https://play.rust-lang.org/?gist=dbf25e623505ebbb9a118b9155107fbc&amp;version=stable&amp;mode=debug&amp;edition=2015">try that on playground</a>, you&rsquo;ll find you get this error:</p>
<pre tabindex="0"><code>error[E0597]: `data` would be dropped while still borrowed
 --&gt; src/main.rs:4:5
   |
 4 |     data.iter().cloned()
   |     ^^^^ borrowed value does not live long enough
 5 | }
   | - borrowed value only lives until here
   |
   = note: borrowed value must be valid for the static lifetime...
</code></pre><p>This error is one of those frustrating error messages &ndash; it says
<em>exactly</em> what the problem is, but it&rsquo;s pretty hard to understand.
(I&rsquo;ve filed <a href="https://github.com/rust-lang/rust/issues/53882">#53882</a> to improve it, though I&rsquo;m not yet sure what I
think it should say.) So let&rsquo;s dig in to what is going on.</p>
<h3 id="iter-borrows-the-collection-it-is-iterating-over">iter() borrows the collection it is iterating over</h3>
<p>Fundamentally, the problem here is that when we invoke <code>iter</code>,
it borrows the variable <code>data</code> to create a reference (of type <code>&amp;[u32]</code>).
That reference is then part of the iterator that is getting returned.
The problem is that the memory that this reference refers to is owned
by the <code>iterate</code> function, and when <code>iterate</code> returns, that memory will
be freed. Therefore, the iterator we give back to the caller will refer
to invalid memory.</p>
<p>If we kind of &lsquo;inlined&rsquo; the <code>iter</code> call a bit, what&rsquo;s going on would look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">iterate</span><span class="p">(</span><span class="n">data</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">iterator</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Iterator</span>::<span class="n">new</span><span class="p">(</span><span class="o">&amp;</span><span class="n">data</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;-- call to iter() returns this
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">cloned_iterator</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ClonedIterator</span>::<span class="n">new</span><span class="p">(</span><span class="n">iterator</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;-- call to cloned()
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">cloned_iterator</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here you can more clearly see that <code>data</code> is being borrowed in the
first line.</p>
<h3 id="drops-in-rust-are-deterministic">Drops in Rust are deterministic</h3>
<p>Another crucial ingredient is that the local variable <code>data</code> will be
&ldquo;dropped&rdquo; when <code>iterate</code> returns. &ldquo;Dropping&rdquo; a local variable means
two things:</p>
<ul>
<li>We run the destructor, if any, on the value within.</li>
<li>We free the memory on the stack where the local variable is stored.</li>
</ul>
<p>Dropping in Rust proceeds at a fixed point. <code>data</code> is a local variable,
so &ndash; unless it was moved before that point &ndash; it will be dropped when
we exit its scope. (In the case of temporary values, we use a set of
syntactic rules to decide their scope.) In this case, <code>data</code> is a
parameter to the function <code>iterate</code>, so it is going to be dropped when
<code>iterate</code> returns.</p>
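<p>To make the timing concrete, here is a minimal sketch. The <code>Noisy</code> type, the <code>Log</code> alias, and both function names are hypothetical, invented just for this illustration. It records when each destructor runs, so you can see that a function&rsquo;s locals and then its parameters are dropped as the function returns:</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical shared log used to record the order of events.
type Log = Rc<RefCell<Vec<String>>>;

// Hypothetical type whose destructor records that it ran.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// `_param` is owned by this function, just as `data` is owned by `iterate`.
fn takes_param(_param: Noisy, log: &Log) {
    let _local = Noisy { name: "local", log: Rc::clone(log) };
    log.borrow_mut().push("body done".to_string());
    // On exit: `_local` is dropped first, then the parameter `_param`.
}

fn drop_order() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let param = Noisy { name: "param", log: Rc::clone(&log) };
    takes_param(param, &log); // `param` is moved, so it is not dropped here
    log.borrow_mut().push("returned".to_string());
    let events = log.borrow().clone();
    events
}
```

<p>Calling <code>drop_order()</code> should yield the events &ldquo;body done&rdquo;, &ldquo;drop local&rdquo;, &ldquo;drop param&rdquo;, &ldquo;returned&rdquo;, in that order: both destructors run before control gets back to the caller.</p>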
<p>Another key thing to understand is that the borrow checker does not
&ldquo;control&rdquo; when drops happen &ndash; that is controlled entirely by the
syntactic structure of the code.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> The borrow checker then comes after
and looks to see what could go wrong if that code were executed. In
this case, it sees that we have a reference to <code>data</code> that will be
returned, but &ndash; during the lifetime of that reference &ndash; <code>data</code> will
be dropped. That is bad, so it gives an error.</p>
<h3 id="what-is-the-fundamental-problem-here">What is the fundamental problem here?</h3>
<p>This is actually a bit of a tricky problem to fix. The problem here is
that <code>Rc&lt;Vec&lt;u32&gt;&gt;</code> only has <strong>shared</strong> ownership of the <code>Vec&lt;u32&gt;</code>
within &ndash; therefore, it does not offer any API that will return you a
<code>Vec&lt;u32&gt;</code> value. You can only get back <code>&amp;Vec&lt;u32&gt;</code> values &ndash; that is,
references to the vector inside.</p>
<p><strong>Furthermore, the references you get back will never be able to
outlive the <code>Rc&lt;Vec&lt;u32&gt;&gt;</code> value they came from!</strong> That is, they will
never be able to outlive <code>data</code>. The reason for this is simple: once
<code>data</code> gets dropped, those references might be invalid.</p>
<p>So what all of this says is that we will never be able to return an
iterator over <code>data</code> unless we can somehow <strong>transfer ownership of
<code>data</code> back to our caller</strong>.</p>
<p>It is interesting to compare this example with the alternative signatures
we looked at early on:</p>
<ul>
<li>If <code>iterate</code> took a <code>Vec&lt;u32&gt;</code>, then it would have full ownership of
the vector. It can use <code>into_iter</code> to transfer that ownership into
an iterator and return the iterator. Therefore, <strong>ownership was
given back to the caller</strong>.</li>
<li>If <code>iterate</code> took a <code>&amp;Vec&lt;u32&gt;</code>, it never owned the vector to begin
with! It can use <code>iter</code> to create an iterator that references into
that vector.  We can return that iterator to the caller without
incident because <strong>the data it refers to is owned by the caller, not
us</strong>.</li>
</ul>
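<p>As a rough sketch, the two alternative signatures could look like this. The names <code>iterate_owned</code> and <code>iterate_borrowed</code> are made up here to keep the two variants apart, and the <code>+ '_</code> bound is one way to say that the returned iterator borrows from the argument:</p>

```rust
// Takes full ownership of the vector; `into_iter` moves that ownership
// into the iterator, which is then handed back to the caller.
fn iterate_owned(data: Vec<u32>) -> impl Iterator<Item = u32> {
    data.into_iter()
}

// Never owns the vector; the returned iterator borrows from the
// caller's vector, so it cannot outlive that vector.
fn iterate_borrowed(data: &Vec<u32>) -> impl Iterator<Item = u32> + '_ {
    data.iter().cloned()
}
```

<p>In the second case the caller can keep using its vector once the iterator is gone, since the vector was only ever borrowed.</p>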
<h3 id="how-can-we-fix-it">How can we fix it?</h3>
<p>As we just saw, to write this function we need to find some way to
give ownership of <code>data</code> back to the caller, while still yielding up
an iterator. One way to do it is by using a <code>move</code> closure, like so
(<a href="https://play.rust-lang.org/?gist=2fc90fb310e8fac9298d7c34a67e9a21&amp;version=stable&amp;mode=debug&amp;edition=2015">playground</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">iterate</span><span class="p">(</span><span class="n">data</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="n">len</span><span class="p">).</span><span class="n">map</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">|</span><span class="n">i</span><span class="o">|</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So why does this work? In the first line, we just read out the length
of the <code>data</code> vector &ndash; note that, in Rust, any vector stored in a
<code>Rc</code> is also immutable (only a full owner can mutate a vector), so we
know that this length can never change. Now that we have the length
<code>len</code>, we can create an iterator <code>0..len</code> over the integers from <code>0</code>
up to (but not including) <code>len</code>. Then we can map from each index <code>i</code> to the data using
<code>data[i]</code> &ndash; since the data inside is just an integer, it gets copied
out.</p>
<p>In terms of ownership, the key point is that here the closure is
taking ownership of <code>data</code>. The closure is then placed into the
iterator, and the iterator is returned. <strong>So indeed ownership of the
vector <em>is</em> passing back to the caller as part of the iterator.</strong></p>
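<p>One way to observe this transfer is to watch the reference count. The sketch below repeats the <code>iterate</code> function from above; the <code>strong_counts</code> helper is hypothetical and exists only to take the measurements:</p>

```rust
use std::rc::Rc;

// Same function as in the text above.
fn iterate(data: Rc<Vec<u32>>) -> impl Iterator<Item = u32> {
    let len = data.len();
    (0..len).map(move |i| data[i])
}

// Hypothetical helper: returns the strong count before the iterator
// exists, while it is alive, and after it has been consumed.
fn strong_counts() -> (usize, usize, usize) {
    let data = Rc::new(vec![1, 2, 3]);
    let before = Rc::strong_count(&data);

    // The iterator (via its closure) now owns a second handle.
    let iter = iterate(Rc::clone(&data));
    let during = Rc::strong_count(&data);

    // `collect` consumes the iterator, dropping the closure and its handle.
    let items: Vec<u32> = iter.collect();
    assert_eq!(items, [1, 2, 3]);
    let after = Rc::strong_count(&data);

    (before, during, after)
}
```

<p>The count goes from 1 to 2 while the iterator is alive and back to 1 once it has been consumed, which is what you would expect if the closure owns its <code>Rc</code> handle.</p>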
<h3 id="what-about-if-i-dont-have-integers">What about if I don&rsquo;t have integers?</h3>
<p>You could use the same trick to return an iterator of any type, but
you must be able to clone it. For example, you could iterate over
strings (<a href="https://play.rust-lang.org/?gist=ab0595b0cdbacd30a9d19493281fca52&amp;version=stable&amp;mode=debug&amp;edition=2015">playground</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">iterate</span><span class="p">(</span><span class="n">data</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="n">len</span><span class="p">).</span><span class="n">map</span><span class="p">(</span><span class="k">move</span><span class="w"> </span><span class="o">|</span><span class="n">i</span><span class="o">|</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Why is it important that we clone it? Why can&rsquo;t we return references?
This falls out from how the <code>Iterator</code> trait is designed. If you look
at the definition of iterator, it states that it <strong>gives ownership</strong>
of each item that it iterates over:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="o">&lt;</span><span class="na">&#39;s</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;s</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           ^^ This would normally be written
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           `&amp;mut self`, but I&#39;m giving it a name
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           so I can refer to it below.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In particular, the <code>next</code> function borrows <code>self</code> <strong>only for the
duration of the call to <code>next</code></strong>. <code>Self::Item</code>, the return type, does
not mention the lifetime <code>'s</code> of the self reference, so it cannot
borrow from <code>self</code>. This means that I can write generic code where we
extract an item, drop the iterator, and then go on using the item:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">dump_first</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">some_iter</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I</span><span class="o">&gt;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I</span>: <span class="nc">Debug</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Get an item from the iterator.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">some_iter</span><span class="p">.</span><span class="n">next</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Drop the iterator early.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">std</span>::<span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="n">some_iter</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Keep using the item.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{:?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, imagine what would happen if we permitted the closure to
return <code>move |i| &amp;data[i]</code> and we then passed the resulting iterator
to <code>dump_first</code>:</p>
<ol>
<li>We would first extract a reference into <code>data</code> and store it in <code>item</code>.</li>
<li>We would then drop the iterator, which in turn would drop <code>data</code>,
potentially freeing the vector (if this is the last <code>Rc</code> handle).</li>
<li>Finally, we would then go on to use <code>item</code>, which has a reference
into the (now possibly freed) vector.</li>
</ol>
<p>So, the lesson is: <strong>if you want to return an iterator over borrowed
data, per the design of the <code>Iterator</code> trait, you must be iterating
over a borrowed reference to begin with</strong> (i.e., <code>iterate</code> would need
to take a <code>&amp;Rc&lt;Vec&lt;u32&gt;&gt;</code>, <code>&amp;Vec&lt;u32&gt;</code>, or <code>&amp;[u32]</code>).</p>
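<p>For comparison, here is a minimal sketch of the borrowed variant, assuming <code>iterate</code> takes a <code>&amp;Rc&lt;Vec&lt;u32&gt;&gt;</code>. The <code>first_then_drop_iter</code> helper is hypothetical; it mirrors <code>dump_first</code> by dropping the iterator before the item is used:</p>

```rust
use std::rc::Rc;

// The iterator yields references that borrow from the caller's `Rc`,
// not from anything owned by `iterate` itself.
fn iterate(data: &Rc<Vec<u32>>) -> impl Iterator<Item = &u32> {
    data.iter()
}

// Hypothetical helper: take one item, drop the iterator, keep the item.
fn first_then_drop_iter(data: &Rc<Vec<u32>>) -> Option<&u32> {
    let mut iter = iterate(data);
    let item = iter.next();
    drop(iter); // fine: `item` borrows from `data`, not from `iter`
    item
}
```

<p>This is accepted because the references are tied to the lifetime of the caller&rsquo;s borrow, so dropping the iterator early cannot invalidate them.</p>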
<h3 id="how-could-we-extend-the-language-to-help-here">How could we extend the language to help here?</h3>
<h4 id="self-references">Self references</h4>
<p>This is an interesting question. If we focus just on the original
problem &ndash; that is, how to return an <code>impl Iterator&lt;Item = u32&gt;</code> &ndash;
then the most obvious thing is the idea of extending the lifetime system
to permit &ldquo;self-references&rdquo; &ndash; for example, it would be nice if you
could have a struct that owns some data (e.g., our <code>Rc&lt;Vec&lt;u32&gt;&gt;</code>) and
also had a reference into that data (e.g., the result of invoking
<code>iter</code>). This might allow us a nicer way of writing the solution to
our original problem (returning an <code>impl Iterator&lt;Item = u32&gt;</code>). In
particular, what we effectively did in our solution was to use an
integer as a kind of &ldquo;reference&rdquo; into the vector &ndash; each step, we
index again. Since indexing is very cheap, this is fine for iterating
over a vector, but it wouldn&rsquo;t work with (say) a <code>Rc&lt;HashMap&lt;K, V&gt;&gt;</code>.</p>
<p>My personal hope is that once we wrap up work on the MIR
borrow-checker (NLL) &ndash; and we are starting to get close! &ndash; we can
start to think about self-references and how to model them in
Rust. I&rsquo;d like to transition to
<a href="https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/">a Polonius-based system</a>
first, though.</p>
<h4 id="auxiliary-values">Auxiliary values</h4>
<p>Another possible direction that has been kicked around is having some
way for a function to return data that its caller must store, which
can then be referenced by the &ldquo;real&rdquo; return value. The idea would be
that <code>iterate</code> would somehow &ldquo;store&rdquo; the <code>Rc&lt;Vec&lt;u32&gt;&gt;</code> into its
caller&rsquo;s stack frame, and then return an iterator over
that. Ultimately, this is very similar to the &ldquo;self-reference&rdquo;
concept: the difference is that, with self-references, <code>iterate</code> has
to return one value that stores both the <code>Rc&lt;Vec&lt;u32&gt;&gt;</code> and the
iterator over it. With this &ldquo;store data in caller&rdquo; approach, <code>iterate</code>
would return just the iterator, but would specify that the iterator
borrows from this other value (the <code>Rc&lt;Vec&lt;u32&gt;&gt;</code>) which is returned
in a separate channel.</p>
<p>Interestingly, this idea of returning &ldquo;auxiliary&rdquo; values might permit
us to return an iterator that gives back references &ndash; even though I
said that was impossible, per the design of the <code>Iterator</code> trait. How
could that work? Well, the problem fundamentally is that we <em>want</em> a
signature like this, where the iterator yields up <code>&amp;T</code> references:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">iterate</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">data</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">impl</span><span class="w"> </span><span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">T</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>Right now, we can&rsquo;t have this signature, because we have no lifetime
to assign to the <code>&amp;T</code> type. In particular, the answer to the question
&ldquo;where are those references borrowing from?&rdquo; is that they are
borrowing from the function <code>iterate</code> itself, which won&rsquo;t work (as
we&rsquo;ve seen).</p>
<p>But if we had some &ldquo;auxiliary&rdquo; slot of data that we could fill and then reference,
we might be able to give it a lifetime &ndash; let&rsquo;s call it <code>'aux</code>. Then we could
return <code>impl Iterator&lt;Item = &amp;'aux T&gt;</code>.</p>
<p>Anyway, this is just wild, irresponsible speculation. I don&rsquo;t have
concrete ideas for how this would work<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>. But it&rsquo;s an interesting
thought.</p>
<h3 id="discussion">Discussion</h3>
<p>I&rsquo;ve opened <a href="https://users.rust-lang.org/t/blog-post-series-rust-patterns/20080">a users
thread</a>
to discuss this blog post (along with other Rust pattern blog posts).</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>This just means it wants to return &ldquo;some iterator that yields up <code>u32</code> values&rdquo;.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Also worth noting: in Rust, reference counted data is typically immutable.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>In other words, lifetime inference doesn&rsquo;t affect execution order. This is crucial &ndash; for example, it is the reason we can move to <a href="https://rust-lang.github.io/rfcs/2094-nll.html">NLL</a> without breaking backwards compatibility.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>In terms of the underlying semantics, though, I imagine it could be a kind of sugar atop either self-references or <a href="https://internals.rust-lang.org/t/thoughts-about-additional-built-in-pointer-types/959">out pointers</a>. But that&rsquo;s sort of as far as I got. =)&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/rustpattern" term="rustpattern" label="RustPattern"/></entry><entry><title type="html">Never patterns, exhaustive matching, and uninhabited types (oh my!)</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/</id><published>2018-08-13T00:00:00+00:00</published><updated>2018-08-13T00:00:00+00:00</updated><content type="html"><![CDATA[<p>One of the long-standing issues that we&rsquo;ve been wrestling with in Rust
is how to integrate the concept of an &ldquo;uninhabited type&rdquo; &ndash; that is, a
type which has no values at all. Uninhabited types are useful to
represent the &ldquo;result&rdquo; of some computation you know will never execute
&ndash; for example, if you have to define an error type for some
computation, but this particular computation can never fail, you might
use an uninhabited type.</p>
<p><a href="https://github.com/rust-lang/rfcs/pull/1216">RFC 1216</a> introduced <code>!</code>
as the sort of &ldquo;canonical&rdquo; uninhabited type in Rust, but actually one
can readily make an uninhabited type of your very own just by declaring
an enum with no variants (e.g., <code>enum Void { }</code>). Since such an enum
can never be instantiated, the type cannot have any values. Done.</p>
<p>However, ever since the introduction of <code>!</code>, we&rsquo;ve wrestled with some
of its implications, particularly around <em>exhaustiveness checking</em> &ndash;
that is, the checks the compiler does to ensure that when you write a
<code>match</code>, you have covered every possibility. As we&rsquo;ll see a bit later,
there are some annoying tensions &ndash; particularly between the needs of
&ldquo;safe&rdquo; and &ldquo;unsafe&rdquo; code &ndash; that are tricky to resolve.</p>
<p>Recently, though, Ralf Jung and I were having a chat and we came up
with an interesting idea I wanted to write about. This idea offers a
possibility for a &ldquo;third way&rdquo; that lets us resolve some of these
tensions, I believe.</p>
<h3 id="the-idea--patterns">The idea: <code>!</code> patterns</h3>
<p>Traditionally, when one has an uninhabited type, one &ldquo;matches against
it&rdquo; by not writing any patterns at all. So, for example, consider the
<code>enum Void { }</code> case I had talked about. Today in Rust <a href="https://play.rust-lang.org/?gist=a9d9a47db5496de43ccc4b8bea225413&amp;version=stable&amp;mode=debug&amp;edition=2015">you can match
against such an enum with an empty match
statement</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">Void</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">v</span>: <span class="nc">Void</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">match</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In effect, this match serves as a kind of assertion. You are saying
&ldquo;because <code>v</code> can never be instantiated, <code>foo</code> could never actually be
called, and therefore &ndash; when I match against it &ndash; this <code>match</code> must
be dead code&rdquo;.  Since the match is dead code, you don&rsquo;t need to give
any match arms: there is nowhere for execution to flow.</p>
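<p>As a small sketch of how this empty match gets used today on stable Rust (without any <code>!</code> pattern), consider unwrapping a <code>Result</code> whose error type is <code>Void</code>. The function name <code>unwrap_infallible</code> is hypothetical, chosen for this illustration:</p>

```rust
// An enum with no variants: no value of this type can ever exist.
enum Void {}

// Hypothetical helper: extracts the `String` from a result that
// cannot actually be an `Err`.
fn unwrap_infallible(result: Result<String, Void>) -> String {
    match result {
        Ok(s) => s,
        // The empty match asserts that this arm is dead code. It has no
        // arms, so it type-checks as producing a `String`.
        Err(never) => match never {},
    }
}
```

<p>The inner <code>match never {}</code> is exactly the &ldquo;assertion by writing nothing&rdquo; described above: it compiles only because <code>Void</code> has no variants.</p>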
<p>The funny thing is that you made this assertion &ndash; that the match is
dead code &ndash; by <strong>not writing anything at all</strong>. We&rsquo;ll see later that
this can be problematic around unsafe code. The idea that Ralf and I
had was to introduce a new kind of pattern, a <code>!</code> pattern (pronounced
a &ldquo;never&rdquo; pattern). <strong>This <code>!</code> pattern matches against any enum with
no variants</strong> &ndash; it is an explicit way to talk about impossible cases.
Note that the <code>!</code> pattern <em>can</em> be used with the <code>!</code> type, but it can
also be used with other types, like <code>Void</code>.</p>
<p>Now we can consider the <code>match v { }</code> above as a kind of shorthand for
a use of the <code>!</code> pattern:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">v</span>: <span class="nc">Void</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">match</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">!</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Note that since <code>!</code> explicitly represents an unreachable pattern, we
don&rsquo;t need to give a &ldquo;body&rdquo; to the match arm either.</p>
<p>We can use <code>!</code> to cover more complex cases as well. Consider something
like a <code>Result</code> that uses <code>Void</code> as the error case. If we want, we can
use the <code>!</code> pattern to explicitly say that the <code>Err</code> case is
impossible:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">v</span>: <span class="nb">Result</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="n">Void</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">match</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Ok</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Err</span><span class="p">(</span><span class="o">!</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Same for matching a &ldquo;reference to nothing&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="o">!</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">match</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&amp;!</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="auto-never-transformation">Auto-never transformation</h3>
<p>As I noted initially, the Rust compiler currently accepts &ldquo;empty
match&rdquo; statements when dealing with uninhabited types. So clearly the
use of the <code>!</code> pattern cannot be mandatory &ndash; and anyway that would be
unergonomic. The idea is that before we check exhaustiveness and so
forth we have an &ldquo;auto-never&rdquo; step that automatically adds <code>!</code>
patterns into your match as needed.</p>
<p>There are two ways you can be missing cases:</p>
<ul>
<li>If you are matching against an <code>enum</code>, you might cover <em>some</em> of the enum variants
but not all. e.g., <code>match foo { Ok(_) =&gt; ... }</code>  is missing the <code>Err</code> case.</li>
<li>If you are matching against other kinds of values, you might be missing an arm
altogether. This occurs most often with an empty match like <code>match v { }</code>.</li>
</ul>
<p>The idea is that &ndash; when you omit a case &ndash; the compiler will attempt
to insert <code>!</code> patterns to cover that case. In effect, to try and prove
on your behalf that this case is impossible. If that fails, you&rsquo;ll get
an error.</p>
<p>The auto-never rules that I would initially propose are as
follows. The idea is that we define the auto-never rules based on the
<em>type</em> that is being matched:</p>
<ul>
<li>When matching a tuple or struct (a &ldquo;product type&rdquo;), we will &ldquo;auto-never&rdquo;
<em>all</em> of the fields.
<ul>
<li>So e.g. if matching a <code>(!, !)</code> tuple, we would auto-never a <code>(!, !)</code> pattern.</li>
<li>But if matching a <code>(u32, !)</code> tuple, auto-never would fail. You would have
to explicitly write <code>(_, !)</code> as a pattern &ndash; we&rsquo;ll cover this case when we
talk about unsafe code below.</li>
</ul>
</li>
<li>When matching a reference whose referent type is uninhabited, we will generate a <code>&amp;</code> pattern
and auto-never the referent.
<ul>
<li>So e.g. if matching a <code>&amp;!</code>, we would generate a <code>&amp;!</code> pattern.</li>
<li><strong>But</strong> there will be a lint for this case that fires &ldquo;around unsafe code&rdquo;,
as we discuss below.</li>
</ul>
</li>
<li>When matching an enum, then the &ldquo;auto-never&rdquo; would add all missing variants
to that enum and then recursively auto-never those variants&rsquo; arguments.
<ul>
<li>e.g., if you write <code>match x { None =&gt; ... }</code> where <code>x: Option&lt;T&gt;</code>, then we will attempt to insert <code>Some(P)</code> where the
pattern <code>P</code> is the result of &ldquo;auto-nevering&rdquo; the type <code>T</code>.</li>
</ul>
</li>
</ul>
<p>Note that these rules compose. So for example if you are matching a
value of type <code>&amp;(&amp;!, &amp;&amp;Void)</code>, we would &ldquo;auto-never&rdquo; a pattern like
<code>&amp;(&amp;!, &amp;&amp;!)</code>.</p>
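<p>To make the composition concrete, here is a minimal sketch of these rules as a recursive function over a toy model of types. Everything in it is hypothetical and for illustration only: <code>Ty</code>, <code>auto_never</code>, and <code>auto_never_variant</code> are names invented for this sketch, not anything in the compiler, and it covers only the cases described above (with enum variants simplified to a single payload).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">/// Toy model of the types discussed above (hypothetical; not rustc's representation).
#[derive(Debug)]
enum Ty {
    Never,          // the `!` type
    Void,           // an empty enum such as `enum Void {}`
    U32,            // stands in for any inhabited type
    Tuple(Vec&lt;Ty&gt;), // a product type
    Ref(Box&lt;Ty&gt;),   // `&amp;T`
}

/// Returns the pattern that auto-never would insert for a value of type `ty`,
/// or `None` if auto-never fails and the user must write a pattern by hand.
fn auto_never(ty: &amp;Ty) -&gt; Option&lt;String&gt; {
    match ty {
        Ty::Never | Ty::Void =&gt; Some("!".to_string()),
        Ty::U32 =&gt; None,
        // The unit type `()` is inhabited, so an empty tuple never qualifies.
        Ty::Tuple(fields) if fields.is_empty() =&gt; None,
        Ty::Tuple(fields) =&gt; {
            // *All* fields must auto-never; one failure makes the whole tuple fail.
            let pats: Option&lt;Vec&lt;String&gt;&gt; = fields.iter().map(auto_never).collect();
            Some(format!("({})", pats?.join(", ")))
        }
        // A reference becomes a `&amp;` pattern around the auto-nevered referent.
        Ty::Ref(inner) =&gt; Some(format!("&amp;{}", auto_never(inner)?)),
    }
}

/// Pattern inserted for a missing enum variant `name` with a single payload type.
fn auto_never_variant(name: &amp;str, payload: &amp;Ty) -&gt; Option&lt;String&gt; {
    Some(format!("{}({})", name, auto_never(payload)?))
}
</code></pre></div><p>Under this model, a <code>(!, !)</code> tuple yields the pattern <code>(!, !)</code>, a <code>(u32, !)</code> tuple yields <code>None</code>, and a missing <code>Err</code> variant with payload <code>&amp;!</code> yields <code>Err(&amp;!)</code>.</p>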
<h3 id="implications-for-safe-code">Implications for safe code</h3>
<p>One of the main use cases for uninhabited types like <code>!</code> is to be able
to write generic code that works with <code>Result</code> but have that <code>Result</code>
be optimized away when errors are impossible. So the generic code
might have a <code>Result&lt;String, E&gt;</code>, but when <code>E</code> happens to be <code>!</code>, that
is represented in memory the same as <code>String</code> &ndash; <em>and</em> the compiler
can see that anything working with <code>Err</code> variants must be dead-code.</p>
<p>Similarly, when you get a result from such a generic function and you
know that <code>E</code> is <code>!</code>, you should be able to painlessly &lsquo;unwrap&rsquo; the
result.  So if I have a value <code>result</code> of type <code>Result&lt;String, !&gt;</code>, I
would like to be able to use a <code>let</code> to extract the <code>String</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">result</span>: <span class="nb">Result</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="o">!&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">result</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>and extract the <code>Ok</code> value <code>value</code>. Similarly, I might like to extract
a reference to the inner value as well, doing something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">result</span>: <span class="nb">Result</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="o">!&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">result</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Here, `value: &amp;String`.
</span></span></span></code></pre></div><p>or &ndash; equivalently &ndash; by using the <code>as_ref</code> method:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">result</span>: <span class="nb">Result</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="o">!&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="nb">Ok</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">result</span><span class="p">.</span><span class="n">as_ref</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Here, `value: &amp;String`.
</span></span></span></code></pre></div><p>All of these cases should work out just fine under this proposal. The
auto-never transformation would effectively add <code>Err(!)</code> or <code>Err(&amp;!)</code>
patterns &ndash; so the final example would be equivalent to:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">result</span><span class="p">.</span><span class="n">as_ref</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nb">Ok</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">v</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nb">Err</span><span class="p">(</span><span class="o">&amp;!</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><h3 id="unsafe-code-and-access-based-models">Unsafe code and access-based models</h3>
<p>Around safe code, the ideas of <code>!</code> patterns and auto-never don&rsquo;t seem
that useful: it&rsquo;s maybe just an interesting way to make it a bit more
explicit what is happening. Where they really start to shine, however,
is when you start thinking carefully about <em>unsafe</em> code &ndash; and in
particular when we think about how matches interact with access-based
models of undefined behavior.</p>
<h4 id="what-data-does-a-match-access">What data does a match &ldquo;access&rdquo;?</h4>
<p>While the details of our model around unsafe code are still being
worked out (in part by this post!), there is a general consensus that
we want an &ldquo;access-based&rdquo; model. For more background on this, see
Ralf&rsquo;s lovely recent blog post on <a href="https://www.ralfj.de/blog/2018/08/07/stacked-borrows.html">Stacked Borrows</a>, and in
particular the first section of it. In general, in an access-based
model, the user asserts that data is valid by accessing it &ndash; and in
particular, they need not access <strong>all</strong> of it.</p>
<p>So how do access-based models relate to matches? The Rust match is a
very powerful construct that can do a lot of things! For example, it
can extract fields from structs and tuples:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="mi">22</span><span class="p">,</span><span class="w"> </span><span class="mi">44</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">match</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">(</span><span class="n">v</span><span class="p">,</span><span class="w"> </span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.,</span><span class="w"> </span><span class="c1">// reads the `x.0` field
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">(</span><span class="n">_</span><span class="p">,</span><span class="w"> </span><span class="n">w</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.,</span><span class="w"> </span><span class="c1">// reads the `x.1` field
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>It can test which enum variant you have:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="mi">22</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">match</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nb">Some</span><span class="p">(</span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And it can dereference a reference and read the data
that it points at:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">match</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">&amp;</span><span class="n">w</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="o">..</span><span class="p">.,</span><span class="w"> </span><span class="c1">// Equivalent to `let w = *x;`
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>So how do we decide which data a match looks at?</strong> The idea is that
you should be able to figure that out by looking at the patterns in
the match arms and seeing what data they touch:</p>
<ul>
<li>If you have a pattern with an enum variant like <code>Some(_)</code>, then it
must access the discriminant of the enum being matched.</li>
<li>If you have a <code>&amp;</code>-pattern, then it must dereference the reference
being matched.</li>
<li>If you have a binding, then it must copy out the data that is
bound (e.g., the <code>v</code> in <code>(v, _)</code>).</li>
</ul>
<p>This seems obvious enough. But what about when dealing with an
uninhabited type? If I have <code>match x { }</code>, there are no arms at all,
so what data does <em>that</em> access?</p>
<p>The key here is to think about the matches <strong>after</strong> the auto-never
transformation has been done. In that case, we will never have an
&ldquo;empty match&rdquo;, but rather a <code>!</code> pattern &ndash; possibly wrapped in some
other patterns.  Just like any other enum pattern, this <code>!</code> pattern is
logically a kind of &ldquo;discriminant read&rdquo; &ndash; but in this case we are
reading from a discriminant that cannot exist (and hence we can
conclude the code is dead).</p>
<p>So, for example, if we had a &ldquo;reference-to-never&rdquo; situation, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span>: <span class="kp">&amp;</span><span class="o">!</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">match</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>then this would be desugared into</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span>: <span class="kp">&amp;</span><span class="o">!</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">match</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;!</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Looking at this elaborated form, the presence of the <code>&amp;</code> pattern makes
it clear that the match will access <code>*x</code>, and hence that the reference
<code>x</code> must be valid (or else we have UB) &ndash; and since no valid reference
to <code>!</code> can exist, we can conclude that this match is dead code.</p>
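<p>The access rules listed earlier, together with the reading of <code>!</code> as a discriminant read, can be sketched as a walk over a toy pattern type. This is only an illustration under stated assumptions: <code>Pat</code>, <code>accesses</code>, and <code>collect</code> are hypothetical names, and the strings they produce are informal descriptions rather than anything the compiler emits.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">/// Toy model of patterns (hypothetical; not rustc's representation).
enum Pat {
    Wild,                                // `_`: touches nothing
    Binding,                             // `v`: copies out the bound data
    Variant(&amp;'static str, Vec&lt;Pat&gt;), // `Some(p)`: reads the discriminant
    Tuple(Vec&lt;Pat&gt;),                 // `(p, q)`: visits each field
    Deref(Box&lt;Pat&gt;),                 // `&amp;p`: dereferences the reference
    Never,                               // `!`: a discriminant read that cannot succeed
}

/// Lists the accesses performed when matching `place` against `pat`.
fn accesses(pat: &amp;Pat, place: &amp;str) -&gt; Vec&lt;String&gt; {
    let mut out = Vec::new();
    collect(pat, place, &amp;mut out);
    out
}

fn collect(pat: &amp;Pat, place: &amp;str, out: &amp;mut Vec&lt;String&gt;) {
    match pat {
        Pat::Wild =&gt; {}
        Pat::Binding =&gt; out.push(format!("read {}", place)),
        Pat::Never =&gt; out.push(format!("read discriminant of {}", place)),
        Pat::Variant(name, args) =&gt; {
            out.push(format!("read discriminant of {}", place));
            for (i, arg) in args.iter().enumerate() {
                collect(arg, &amp;format!("({} as {}).{}", place, name, i), out);
            }
        }
        Pat::Tuple(fields) =&gt; {
            for (i, field) in fields.iter().enumerate() {
                collect(field, &amp;format!("{}.{}", place, i), out);
            }
        }
        Pat::Deref(inner) =&gt; {
            out.push(format!("dereference {}", place));
            collect(inner, &amp;format!("*{}", place), out);
        }
    }
}
</code></pre></div><p>For the pattern <code>(v, _)</code> on <code>x</code> this reports a single read of <code>x.0</code>; for the elaborated <code>&amp;!</code> pattern it reports a dereference of <code>x</code> followed by a discriminant read of <code>*x</code>, which is exactly the access that makes the match dead code.</p>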
<h3 id="devil-is-in-the-details">Devil is in the details</h3>
<p>Now that we&rsquo;ve introduced the idea of unsafe code and so forth, there
are two particular interactions between the auto-never rules and unsafe
code that I want to revisit:</p>
<ul>
<li><strong>Uninitialized memory</strong>, which explains why &ndash; when we auto-never a tuple type &ndash;
we require <em>all</em> fields of the tuple to have uninhabited type, instead
of just one.</li>
<li><strong>References</strong>, which require some special care. In the auto-never
rules as I proposed them earlier, we used a lint to try and thread
the needle here.</li>
</ul>
<h4 id="auto-never-of-tuple-types-and-uninitialized-memory">Auto-never of tuple types and uninitialized memory</h4>
<p>In the auto-never rules, I wrote the following:</p>
<blockquote>
<ul>
<li>When matching a tuple or struct (a &ldquo;product type&rdquo;), we will &ldquo;auto-never&rdquo;
<em>all</em> of the fields.
<ul>
<li>So e.g. if matching a <code>(!, !)</code> tuple, we would auto-never a <code>(!, !)</code> pattern.</li>
<li>But if matching a <code>(u32, !)</code> tuple, auto-never would fail. You would have
to explicitly write <code>(_, !)</code> as a pattern &ndash; we&rsquo;ll cover this case when we
talk about unsafe code below.</li>
</ul>
</li>
</ul>
</blockquote>
<p>You might think that this is stricter than necessary. After all, you
can&rsquo;t possibly construct an instance of a tuple type like <code>(u32, !)</code>,
since you can&rsquo;t produce a <code>!</code> value for the second half. So why
require that <em>all</em> fields be uninhabited?</p>
<p>The answer is that, using unsafe code, it is possible to <em>partially</em>
initialize a value like <code>(u32, !)</code>. In other words, you could create
code that just uses the first field, and ignores the second one. In
fact, this is even quite reasonable!  To see what I mean, consider a
type like <code>Uninit</code>, which allows one to manipulate values that are
possibly uninitialized (similar to the one introduced in <a href="https://github.com/rust-lang/rfcs/pull/1892">RFC 1892</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">union</span> <span class="nc">Uninit</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">value</span>: <span class="nc">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">uninit</span>: <span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Note that the contents of a <code>union</code> are generally only known to be
valid when the fields are actually accessed (in general, unions may
have fields of more than one type, and the compiler doesn&rsquo;t know
which one is the correct type at any given time &ndash; hopefully the
programmer does).</p>
<p>Now let&rsquo;s consider a function <code>foo</code> that uses <code>Uninit</code>. <code>foo</code> is
generic over some type <code>T</code>; this type gets constructed by invoking the
closure <code>op</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">op</span>: <span class="nc">impl</span><span class="w"> </span><span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="nc">Uninit</span><span class="o">&lt;</span><span class="p">(</span><span class="kt">u32</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Uninit</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">uninit</span>: <span class="p">()</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="p">.</span><span class="n">value</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w"> </span><span class="c1">// initialize first part of the tuple
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">value</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">(</span><span class="n">v</span><span class="p">,</span><span class="w"> </span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// access only first part of the tuple
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="p">.</span><span class="n">value</span><span class="p">.</span><span class="mi">1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">op</span><span class="p">();</span><span class="w"> </span><span class="c1">// initialize the rest of the tuple
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>For some reason, in this code, we need to combine the result
of this closure (of type <code>T</code>) with a <code>u32</code>, and we need to
manipulate that <code>u32</code> before we have invoked the closure (but probably
after too). So we create an <strong>uninitialized</strong> <code>(u32, T)</code> value,
using <code>Uninit</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="nc">Uninit</span><span class="o">&lt;</span><span class="p">(</span><span class="kt">u32</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Uninit</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">uninit</span>: <span class="p">()</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>Then we initialize <em>just</em> the <code>x.value.0</code> part of the tuple:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="p">.</span><span class="n">value</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w"> </span><span class="c1">// initialize first part of the tuple
</span></span></span></code></pre></div><p>Finally, we can use operations like <code>match</code> (or just direct
field access) to pull out parts of that tuple. In so doing, we are
careful to ignore (using <code>_</code>) the parts that are not yet initialized:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">value</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">(</span><span class="n">v</span><span class="p">,</span><span class="w"> </span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// access only first part of the tuple
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, everything here is hunky-dory, right? Well, now what happens if I
invoke <code>foo</code> with a closure <code>op</code> that never returns? That closure
might have the return value <code>!</code> &ndash; and now <code>x</code> has the type
<code>Uninit&lt;(u32, !)&gt;</code>. This tuple <code>(u32, !)</code> is supposed to be
uninhabited, and yet here we are initializing it (well, the first
half) and accessing it (well, the first half). Is that ok?</p>
<p>In fact, when we first enabled full exhaustiveness checking and so
forth, <a href="https://internals.rust-lang.org/t/recent-change-to-make-exhaustiveness-and-uninhabited-types-play-nicer-together/4602">we hit code using <strong>exactly</strong> this kind of pattern</a>.
(Only that code wasn&rsquo;t yet using a <code>union</code> like <code>Uninit</code> &ndash; it was
using <code>mem::uninitialized</code>, which creates problems of its own.)</p>
<p>In general, a goal for the auto-never rules was that they would only
apply when there is <strong>no matchable data</strong> accessible from the value.
In the case of a type like <code>(u32, !)</code>, it may be (as we have seen)
that there is usable data (the <code>u32</code>); so if we accepted <code>match x { }</code>
that would mean that one could still add a pattern like <code>(x, _)</code> which
would (a) extract data, (b) not be dead code, and (c) not be
UB. Seems bad.</p>
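<p>For readers who want to try the partial-initialization pattern in today&rsquo;s Rust, here is a minimal sketch using <code>std::mem::MaybeUninit</code>, the standard-library type that came out of RFC 1892, in place of the hand-written <code>Uninit</code> union. The function name <code>first_then_rest</code> and the choice to pass the <code>u32</code> into the closure are assumptions made for this example, not part of the original <code>foo</code>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Initializes the `u32` half of a `(u32, T)` first, uses it, and only then
/// runs `op` to produce the `T` half. (Hypothetical helper for illustration.)
fn first_then_rest&lt;T&gt;(op: impl FnOnce(u32) -&gt; T) -&gt; (u32, T) {
    let mut slot: MaybeUninit&lt;(u32, T)&gt; = MaybeUninit::uninit();
    let p = slot.as_mut_ptr();
    unsafe {
        // Initialize only the first field; the second is still uninitialized.
        addr_of_mut!((*p).0).write(22);
        // Access only the first field, never touching the second.
        let first = addr_of_mut!((*p).0).read();
        // If `T` is uninhabited, `op` never returns and this write never happens.
        addr_of_mut!((*p).1).write(op(first));
        // Both fields are initialized now, so the whole tuple is valid.
        slot.assume_init()
    }
}
</code></pre></div><p>Calling <code>first_then_rest(|n| n + 1)</code> returns <code>(22, 23)</code>. If the closure diverges instead, the function has still read and written the <code>u32</code> field of a tuple whose type is uninhabited, which is the situation the strict tuple rule is meant to respect.</p>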
<h4 id="reference-patterns-and-linting">Reference patterns and linting</h4>
<p>Now that we are armed with this idea of <code>!</code> and the auto-never
transformation, we can examine the problem of reference types, which
turns out to be the primary case where the needs of safe and unsafe
code come into conflict.</p>
<p>Throughout this post, I&rsquo;ve been assuming that we want to treat values
of types like <code>&amp;!</code> as effectively &ldquo;uninhabited&rdquo; &ndash; this follows from
the fact that we want <code>Result&lt;String, !&gt;</code> to be something that you can
work with ergonomically in safe code. Since a common thing to do is to
use <code>as_ref()</code> to transform a <code>&amp;Result&lt;String, !&gt;</code> into a
<code>Result&lt;&amp;String, &amp;!&gt;</code>, I think we would still want the compiler to
understand that the <code>Err</code> variant ought to be treated as <em>impossible</em>
in such a type.</p>
<p>Unfortunately, when it comes to unsafe code, there is a general desire
to treat any reference <code>&amp;T</code> &ldquo;with suspicion&rdquo;. Specifically, we don&rsquo;t
want to make the assumption that this is a reference to valid,
initialized memory <strong>unless we see an explicit dereference by the
user</strong>. This is really the heart of the &ldquo;access-based&rdquo; philosophy.</p>
<p>But that implies that a value of type <code>&amp;!</code> ought not be considered
uninhabited &ndash; it might be a reference to uninitialized memory, for
example, that is never intended to be used.</p>
<p>If we indeed permit you to treat <code>&amp;!</code> values as uninhabited, then we
are making it so that match statements can &ldquo;invisibly&rdquo; insert
dereferences for you that you might not expect. That seems worrisome.</p>
<p>Auto-never patterns give us a way to resolve this impasse. For
example, when matching on a <code>&amp;!</code> value, we can insert the <code>&amp;!</code> pattern
automatically &ndash; but lint if that occurs in an <code>unsafe</code> function or a
function that contains an unsafe block (or perhaps a function that
manipulates raw pointers). Users can then silence the lint by writing
out a <code>&amp;!</code> pattern explicitly. Effectively, the lint would enforce the
rule that &ldquo;in and around unsafe code, you should write out <code>&amp;!</code> patterns
explicitly, but in safe code, you don&rsquo;t have to&rdquo;.</p>
<p>Alternatively, we could limit the auto-never transformation so that
<code>&amp;T</code> types do not &ldquo;auto-never&rdquo; &ndash; but that imposes an ergonomic tax on
safe code.</p>
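<p>As a point of comparison, the explicit style that such a lint would encourage can be approximated in stable Rust today by writing the dereference yourself and then matching on the uninhabited place. The sketch below uses an empty enum and <code>std::convert::Infallible</code> as stand-ins for <code>!</code>; the names <code>Void</code>, <code>absurd_ref</code>, and <code>unwrap_ref</code> are hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::convert::Infallible;

/// An empty enum, standing in for `!` on stable Rust.
enum Void {}

/// The explicit `*x` plays the role of the `&amp;` in a `&amp;!` pattern:
/// the dereference is visible in the source rather than inserted for you.
fn absurd_ref(x: &amp;Void) -&gt; ! {
    match *x {}
}

/// Hand-written equivalent of `let Ok(value) = result.as_ref();`.
fn unwrap_ref(result: &amp;Result&lt;String, Infallible&gt;) -&gt; &amp;String {
    match result.as_ref() {
        Ok(value) =&gt; value,
        // `never: &amp;Infallible`; dereference it, then match with no arms.
        Err(never) =&gt; match *never {},
    }
}
</code></pre></div><p>Here the reader can see exactly where the reference is dereferenced, which is the property the explicit <code>&amp;!</code> pattern is meant to provide.</p>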
<h3 id="conclusion">Conclusion</h3>
<p>This post describes the idea of a &ldquo;never pattern&rdquo; (written <code>!</code>) that
matches against the <code>!</code> type or any other &ldquo;empty enum&rdquo; type. It also
describes an auto-never transformation that inserts such patterns into
matches. As a result &ndash; in the desugared case, at least &ndash; we no
longer use the <strong>absence</strong> of a match arm to designate matches against
uninhabited types.</p>
<p>Explicit <code>!</code> patterns make it easier to define what data a match will
access. They also give us a way to use lints to help bridge the needs
of safe and unsafe code: we can encourage unsafe code to write
explicit <code>!</code> patterns where they might help document subtle points of
the semantics, without imposing that burden on safe code.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Proposal for a staged RFC process</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/06/20/proposal-for-a-staged-rfc-process/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/06/20/proposal-for-a-staged-rfc-process/</id><published>2018-06-20T00:00:00+00:00</published><updated>2018-06-20T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I consider Rust&rsquo;s RFC process one of our great accomplishments, but
it&rsquo;s no secret that it has a few flaws. At its best, the RFC offers an
opportunity for collaborative design that is really exciting to be a
part of. At its worst, it can devolve into bickering without any real
motion towards consensus. If you&rsquo;ve not done so already, I strongly
recommend reading aturon&rsquo;s <a href="http://aturon.github.io/2018/05/25/listening-part-1/">excellent</a> <a href="http://aturon.github.io/2018/06/02/listening-part-2/">blog</a> <a href="http://aturon.github.io/2018/06/18/listening-part-3/">posts</a> on
this topic.</p>
<p>The RFC process has also evolved somewhat organically over time. What
began as &ldquo;just open a pull request on GitHub&rdquo; has moved into a process
with a number of formal and informal stages (described below). I think
it&rsquo;s a good time for us to take a step back and see if we can refine
those stages into something that works better for everyone.</p>
<p>This blog post describes a proposal that arose over some discussions
at the Mozilla All Hands. This proposal represents an alternate take
on the RFC process, drawing on some ideas from <a href="https://tc39.github.io/process-document/">the TC39
process</a>, but adapting them to Rust&rsquo;s needs. I&rsquo;m pretty excited
about it.</p>
<p><strong>Important:</strong> This blog post is meant to advertise a <strong>proposal</strong>
about the RFC process, not a final decision. I&rsquo;d love to get feedback
on this proposal and I expect further iteration on the details. In any
case, until the Rust 2018 Edition work is complete, we don&rsquo;t really
have the bandwidth to make a change like this. (And, indeed, most of
my personal attention remains on NLL at the moment.) If you&rsquo;d like to
discuss the ideas here, <a href="https://internals.rust-lang.org/t/blog-post-proposal-for-a-staged-rfc-process/7766">I opened an internals thread</a>.</p>
<h2 id="tldr">TL;DR</h2>
<p>The TL;DR of the proposal is as follows:</p>
<ul>
<li><strong>Explicit RFC stages.</strong> Each proposal moves through a <a href="https://docs.google.com/drawings/d/11KtHLYsqJzi2_Y3mOBz2FbXeG3verSHz-PFBuiwYIQw/edit?usp=sharing">series of
explicit stages</a>.</li>
<li><strong>Each RFC gets its own repository.</strong> These are automatically
created by a bot. This permits us to use GitHub issues and pull
requests to split up conversation. It also permits an RFC to have
multiple documents (e.g., a FAQ).</li>
<li><strong>The repository tracks the proposal from the early days until
stabilization.</strong> Right now, discussions about a particular proposal
are scattered across internals, RFC pull requests, and the Rust
issue tracker. Under this new proposal, a single repository would
serve as the home for the proposal. In the case of more complex
proposals, such as <code>impl Trait</code>, the repository could even serve as
the home for multiple layered RFCs.</li>
<li><strong>Prioritization is now an explicit part of the process.</strong> The new
process includes an explicit step to move from the &ldquo;spitballing&rdquo;
stage (roughly &ldquo;Pre-RFC&rdquo; today) to the &ldquo;designing&rdquo; stage (roughly
&ldquo;RFC&rdquo; today). This step requires both a team champion, who agrees to
work on moving the proposal through implementation and towards
stabilization, and general agreement from the team. The aim here is
two-fold.  First, the teams get a chance to provide early feedback
and introduce key constraints (e.g., &ldquo;this may interact with feature
X&rdquo;). Second, it provides room for a discussion about prioritization:
there are often RFCs which are <em>good ideas</em>, but which are not a
good idea <em>right now</em>, and the current process doesn&rsquo;t give us a way
to specify that.</li>
<li><strong>There is more room for feedback on the final, implemented
design.</strong> In the new process, once implementation is complete, there
is another phase where we (a) write an explainer describing how the
feature works and (b) issue a general call for evaluation. We&rsquo;ve
done this before &ndash; such as cramertj&rsquo;s <a href="https://internals.rust-lang.org/t/help-test-impl-trait/6516">call for feedback on <code>impl Trait</code></a>,
aturon&rsquo;s call to <a href="https://internals.rust-lang.org/t/help-us-benchmark-incremental-compilation/6153">benchmark incremental
compilation</a>,
or alexcrichton&rsquo;s <a href="https://internals.rust-lang.org/t/help-stabilize-a-subset-of-macros-2-0/7252">push to stabilize some subset of procedural
macros</a>
&ndash; but each of those was an informal effort, rather than an explicit
part of the RFC process.</li>
</ul>
<h2 id="the-current-process">The current process</h2>
<p>Before diving into the new process, I want to give my view of the
<em>current</em> process by which an idea becomes a stable feature. This goes
beyond just the RFC itself. In fact, there are a number of stages,
though some of them are informal or sometimes skipped:</p>
<ul>
<li><strong>Pre-RFC (informal):</strong> Discussions take place &ndash; often on internals &ndash;
about the shape of the problem to be solved and possible proposals.</li>
<li><strong>RFC:</strong> A specific proposal is written and debated. It may be changed during
this debate as a result of points that are raised.
<ul>
<li><strong>Steady state:</strong> At some point, the discussion reaches a &ldquo;steady
state&rdquo;. This implies a kind of consensus &ndash; not necessarily a
consensus about what <strong>to do</strong>, but a consensus on the pros and
cons of the feature and the various alternatives.
<ul>
<li>Note that reaching a steady state does not imply that no new comments
are being posted. It just implies that the <strong>content</strong> of those comments
is not new.</li>
</ul>
</li>
<li><strong>Move to merge:</strong> Once the steady state is reached, the relevant team(s) can
move to <strong>merge</strong> the RFC. This begins with a bunch of checkboxes, where
each team member indicates that they agree that the RFC should be merged;
in some cases, blocking concerns are raised (and resolved) during this
process.</li>
<li><strong>FCP:</strong> Finally, once the team has assented to the merge, the RFC
enters the Final Comment Period (FCP). This means that we wait for
10 days to give time for any final arguments to arise.</li>
</ul>
</li>
<li><strong>Implementation:</strong> At this point, a tracking issue on the Rust repo
is created. This will be the new home for discussion about the
feature. We can also start writing code, which lands under a feature
gate.
<ul>
<li><strong>Refinement:</strong> Sometimes, after implementing the feature, we
find that the original design was inconsistent, in which case we
might opt to alter the spec. Such alterations are discussed on the
tracking issue &ndash; for significant changes, we will typically open a
dedicated issue and do an FCP process, just like with the original
RFC. A similar procedure happens for resolving unresolved questions.</li>
</ul>
</li>
<li><strong>Stabilization:</strong> The final step is to move to stabilize. This is
always an FCP decision, though the precise protocol varies. What I
consider Best Practice is to create a dedicated issue for the
stabilization: this issue should describe what is being stabilized,
with an emphasis on (a) what has changed since the RFC, (b) tests
that show the behavior in practice, and (c) what remains to be
stabilized. (An example of such an issue is <a href="https://github.com/rust-lang/rust/issues/48453">#48453</a>, which
proposed to stabilize the <code>?</code> in main feature.)</li>
</ul>
<h2 id="proposal-for-a-new-process">Proposal for a new process</h2>
<p>The heart of the new proposal is that each proposal should go through
a series of explicit stages, depicted graphically here (you can also
view this <a href="https://docs.google.com/drawings/d/11KtHLYsqJzi2_Y3mOBz2FbXeG3verSHz-PFBuiwYIQw/edit?usp=sharing">directly on Google Drawings</a>, where the
oh-so-important emojis work better):</p>
<div>
<img src="https://smallcultfollowing.com/babysteps/assets/2018-06-20-rfc-stages.svg" width="893" height="760"/>
</div>
<p>You&rsquo;ll notice that the stages are divided into two groups. <strong>The
stages on the left represent phases where significant work is being
done</strong>: they are given &ldquo;active&rdquo; names that end in &ldquo;ing&rdquo;, like
spitballing, designing, etc. The bullet points below describe the work
that is to be done. As will be described shortly, this work is done on
a dedicated repository, by the community at large, in conjunction with
at least one team champion.</p>
<p><strong>The stages on the right represent decision points, where the
relevant team(s) must decide whether to advance the RFC to the next
stage.</strong> The bullet points below represent the questions that the team
must answer. If the answer is Yes, then the RFC can proceed to the
next stage &ndash; note that sometimes the RFC can proceed, but unresolved
questions are added as well, to be addressed at a later stage.</p>
<h3 id="repository-per-rfc">Repository per RFC</h3>
<p>Today, the &ldquo;home&rdquo; for an RFC changes over the course of the
process. It may start in an internals thread, then move to the RFC
repo, then to a tracking issue, etc. Under the new process, we would
instead create a <strong>dedicated repository for each RFC</strong>. Once created,
the repository would serve as the &ldquo;main home&rdquo; for the new proposal from start
to finish.</p>
<p>The repositories will live in the <code>rust-rfcs</code> organization. There will
be a convenient webpage for creating them; it will create a repo that
has an appropriate template and which is owned by the appropriate Rust
team, with the creator also having full permissions. These
repositories would naturally be subject to Rust&rsquo;s Code of Conduct and
other guidelines.</p>
<p><strong>Note that you do not have to seek approval from the team to create an
RFC repository.</strong> Just like opening a PR, creating a repository is
something that anyone can do. The expectation is that the team will be
tracking new repositories that are created (as well as those seeing a
lot of discussion) and that members of the team will get involved when
the time is right.</p>
<p>The goal here is to create the repository early &ndash; even before the RFC
text is drafted, and perhaps before there exists a specific
proposal. This allows joint authorship of RFCs and iteration in the
repository.</p>
<p>In addition to creating a &ldquo;single home&rdquo; for each proposal, having a
dedicated repository allows for a number of new patterns to emerge:</p>
<ul>
<li>One can create a <code>FAQ.md</code> that answers common questions and summarizes
points that have already reached consensus.</li>
<li>One can create an <code>explainer.md</code> that documents the feature and
explains how it works &ndash; in fact, creating such docs is mandatory
during the &ldquo;implementing&rdquo; phase of the process.</li>
<li>We can put more than one RFC into a single repository. Often, there
are complex features with inter-related (but distinct) aspects, and
this allows those different parts to move through the stabilization
process at a different pace.</li>
</ul>
<h3 id="the-main-rfc-repository">The main RFC repository</h3>
<p>The main RFC repository (named <code>rust-rfcs/rfcs</code> or something like that)
would no longer contain content on its own, except possibly the final
draft of each RFC text. Instead, it would primarily serve as an index
into the other repositories, organized by stage (similar to <a href="https://github.com/tc39/proposals">the TC39
<code>proposals</code> repository</a>).</p>
<p>The purpose of this repository is to make it easy to see &ldquo;what&rsquo;s
coming&rdquo; when it comes to Rust. I also hope it can serve as a kind of
&ldquo;jumping off point&rdquo; for people contributing to Rust, whether that be
through design input, implementation work, or other things.</p>
<h3 id="team-champions-and-the-mechanics-of-moving-an-rfc-between-stages">Team champions and the mechanics of moving an RFC between stages</h3>
<p>One crucial role in the new process is that of the <strong>team
champion</strong>. The team champion is someone from the Rust team who is
working to drive this RFC to completion. Procedurally speaking, the
team champion has two main jobs. First, they will give periodic
updates to the Rust team at large on the latest developments, which
will hopefully identify conflicts or concerns early on.</p>
<p>The second job is that <strong>team champions decide when to try and move the
RFC between stages</strong>. The idea is that it is time to move between stages
when two conditions are met:</p>
<ul>
<li>The discussion on the repository has reached a &ldquo;steady state&rdquo;,
meaning that there do not seem to be new arguments or
counterarguments emerging. This sometimes also implies a general
consensus on the design, but not always: it does however imply
general agreement on the contours of the design space and the
trade-offs involved.</li>
<li>There are good answers to the questions listed for that stage.</li>
</ul>
<p>The actual mechanics of moving an RFC between stages are as
follows. First, although not strictly required, the team champion
should open an issue on the RFC repository proposing that it is time
to move between stages. This issue should contain a draft of the
report that will be given to the team at large, which should include
a summary of the key points (pro and con) around the design. Think of
it like a <a href="https://github.com/rust-lang/rfcs/pull/1909#issuecomment-327565150">summary comment</a> today. This issue can go through an FCP
period in the same way as today (though without the need for
checkmarks) to give people a chance to review the summary.</p>
<p>At that point, the team champion will open a PR on the <strong>main
repository</strong> (<code>rust-rfcs/rfcs</code>).  This PR itself will not have a lot
of content: it will mostly edit the index, moving the RFC to a new
stage, and &ndash; where appropriate &ndash; linking to a specific revision of
the text in the RFC repository (this revision then serves as &ldquo;the
draft&rdquo; that was accepted, though of course further edits can and will
occur). It should also link to the issue where the champion proposed
moving to the next stage, so that the team can review the comments
found there.</p>
<p>The PRs that move an RFC between stages are primarily intended for the
Rust team to discuss &ndash; they are not meant to be the source of
significant discussion, which ought to be taking place on the
repository. If one looks at the current RFC process, they might
consist of roughly the set of comments that typically occur once FCP
is proposed. The teams should ensure that a decision (yea or nay) is
reached in a timely fashion.</p>
<p>Finding the best way for teams to govern themselves to ensure prompt
feedback remains a work in progress. The TC39 process is all based
around regular meetings, but we are hoping to achieve something more
asynchronous, in part so that we can be more friendly to people from
all time zones, and to ease language barriers. But there is still a
need to ensure that progress is made. I expect that weekly meetings will
continue to play a role here, if only to nag people.</p>
<!--

### How stages affect conversation

One of the things I am excited about in this proposal is that we can
use the explicit stage to help focus conversations. For example,
during the spitballing phase, the goal is to explore the motivation
and unearth constraints. Similarly, it often happens we come across
quandries that are hard to resolve until after we have gained more
experience using the feature -- choosing a default behavior can have
this character, for example. The staged process lets us explicitly
revisit those concerns at the right time.

However, one concern that has arisen in the TC39 process is that
stages can also make it hard to object to a feature on "global" or
"cross-cutting" grounds. For example, it may be that there are two
features which are individually acceptable but which -- taken together
-- seem to blow the language complexity budget. How do you decide
between them and when does this decision get made?

In the current proposal, I think that the answer is *most likely* at
the Proposal stage. More generally, we aim to address these sorts of
concerns of controlling scope in a few ways:

- By ensuring that features are tied to the roadmap, which should
  ensure they have solid (and timely) motivation.
- By requiring a Team Champion to advance through the process, which
  should generally ensure that there is enough interest in a proposal
  and bandwidth to see it through.
- By having frequent check-ins with teams, who are charged to care for
  cross-cutting concerns.

Overall, though, I think this is an area where we will continue
iterating -- we might want some more dedicated way of tracking the
"overall budget" for Rust as a whole.

-->
<h3 id="making-implicit-stages-explicit">Making implicit stages explicit</h3>
<p>There are two new points in the process that I want to highlight.
Both of these represent an attempt to take &ldquo;implicit&rdquo; decision points
that we used to have and make them more explicit and observable.</p>
<h4 id="the-proposal-point-and-the-change-from-spitballing-to-designing">The Proposal point and the change from Spitballing to Designing</h4>
<p>The very first transition in the RFC process is going from the Spitballing phase to
the Designing phase &ndash; this is done by presenting a <strong>Proposal</strong>. One
crucial point is that <strong>there doesn&rsquo;t have to be a primary design in
order to present a proposal</strong>. It is ok to say &ldquo;here are two or three
designs that all seem to have advantages, and further design is needed
to find the best approach&rdquo; (often, that approach will be some form of
synthesis of those designs anyway).</p>
<p>The main questions to be answered at the Proposal point have to do with
<strong>motivation and prioritization</strong>. There are a few questions to answer:</p>
<ul>
<li>Is this a problem we want to solve?
<ul>
<li>And, specifically, is this a problem we want to solve <strong>now</strong>?</li>
</ul>
</li>
<li>Do we think we have some realistic ideas for solving it?
<ul>
<li>Are there major things that we ought to dig into?</li>
</ul>
</li>
<li>Are there cross-cutting concerns and interactions with other features?
<ul>
<li>It may be that two features are individually quite good, but
&ndash; taken together &ndash; blow the language complexity budget.
We should always try to judge how a new feature might affect the
language (or libraries) as a whole.</li>
<li>We may want to extend the process in other ways to make identification
of such &ldquo;cross-cutting&rdquo; or &ldquo;global&rdquo; concerns more first class.</li>
</ul>
</li>
</ul>
<p>The expectation is that all major proposals need to be connected to
the roadmap. This should help to keep us focused on the work we are
supposed to be doing. (I think it is possible for RFCs to advance that
are not connected to the roadmap, but they need to be simple
extensions that could effectively work at any time.)</p>
<p>There is another way that having an explicit Proposal step addresses
problems around prioritization. Creating a Proposal requires a Team
Champion, which implies that there is enough team bandwidth to see the
project through to the end (presuming that people don&rsquo;t become
champions for more than a few projects at a time). If we find that
there aren&rsquo;t enough champions to go around (and there aren&rsquo;t), then
this is a sign we need to grow the teams (something we&rsquo;ve been trying
hard to do).</p>
<p>The Proposal point also offers a chance for other team members to
point out constraints that may have been overlooked. These constraints
don&rsquo;t necessarily have to derail the proposal, they may just add new
points to be addressed during the Designing phase.</p>
<h4 id="the-candidate-point-and-the-evaluating-phase">The Candidate point and the Evaluating phase</h4>
<p>Another new addition to the process here is the Evaluating phase. The idea here
is that, once implementation is complete, we should do two things:</p>
<ul>
<li>Write up an explainer that describes how the feature works in terms
suitable for end users. This is a kind of &ldquo;preliminary&rdquo;
documentation for the feature.  It should explain how to enable the
feature, what it&rsquo;s good for, and give some examples of how to use
it.
<ul>
<li>For libraries, the explainer may not be needed, as the API docs serve
the same purpose.</li>
<li>We should in particular cover points where the design has changed
significantly since the &ldquo;Draft&rdquo; phase.</li>
</ul>
</li>
<li>Propose the RFC for <strong>Candidate</strong> status. If accepted, we will also
issue a general call for evaluation. This serves as a kind of
&ldquo;pre-stabilization&rdquo; notice.  It means that people should go take the
new feature for a spin, kick the tires, etc.  This will hopefully
uncover bugs, but also surprising failure modes, ergonomic hazards,
or other pitfalls with the design. If any significant problems are
found, we can correct them, update the explainer, and repeat until
we are satisfied (or until we decide the idea isn&rsquo;t going to work
out).</li>
</ul>
<p>As I noted earlier, we&rsquo;ve done this before, but always informally:</p>
<ul>
<li>cramertj&rsquo;s <a href="https://internals.rust-lang.org/t/help-test-impl-trait/6516">call for feedback on <code>impl Trait</code></a>;</li>
<li>aturon&rsquo;s call to <a href="https://internals.rust-lang.org/t/help-us-benchmark-incremental-compilation/6153">benchmark incremental compilation</a>;</li>
<li>alexcrichton&rsquo;s <a href="https://internals.rust-lang.org/t/help-stabilize-a-subset-of-macros-2-0/7252">push to stabilize some subset of procedural macros</a>.</li>
</ul>
<p>Once the evaluation phase seems to have reached a conclusion, we would
move to <strong>stabilize</strong> the feature. The explainer docs would then
become the preliminary documentation and be added to a kind of
addendum in the Rust book. The docs team would be expected to integrate the
docs into the book in smoother form sometime after stabilization.</p>
<h2 id="conclusion">Conclusion</h2>
<p>As I wrote before, this is only a preliminary proposal, and I fully
expect us to make changes to it. Timing wise, I don&rsquo;t think it makes
sense to pursue this change immediately anyway: we&rsquo;ve too much going
on with the edition. But I&rsquo;m pretty excited about revamping our RFC
processes both by making stages explicit and adding explicit
repositories.</p>
<p>I have hopes that we will find ways to use explicit repositories to
drive discussions towards consensus faster. It seems that having the
ability, for example, to maintain &ldquo;auxiliary&rdquo; documents, such as lists
of constraints and rationale, can help to ensure that people&rsquo;s
concerns are both heard and met.</p>
<p>In general, I would also like to start trying to foster a culture of
&ldquo;joint ownership&rdquo; of in-progress RFCs. Maintaining a good RFC
repository is going to be a fair amount of work, which is a great
opportunity for people at large to pitch in. This can then serve as a
kind of &ldquo;mentoring on ramp&rdquo; getting people more involved in the lang
team. Similarly, I think that having a list of RFCs that are in the
&ldquo;implementation&rdquo; phase might be a way to help engage people who&rsquo;d like
to hack on the compiler.</p>
<h3 id="comments">Comments?</h3>
<p>Please leave comments in <a href="https://internals.rust-lang.org/t/blog-post-proposal-for-a-staged-rfc-process/7766">the internals thread for this post</a>.</p>
<h3 id="credit-where-credit-is-due">Credit where credit is due</h3>
<p>This proposal is heavily shaped by <a href="https://tc39.github.io/process-document/">the TC39 process</a>. This
particular version was largely drafted in a big group discussion with
<a href="https://twitter.com/wycats">wycats</a>, <a href="https://github.com/aturon">aturon</a>, <a href="https://twitter.com/ag_dubs/">ag_dubs</a>, <a href="https://github.com/steveklabnik/">steveklabnik</a>, <a href="https://github.com/nrc/">nrc</a>, <a href="https://twitter.com/jntrnr/">jntrnr</a>,
<a href="https://github.com/erickt/">erickt</a>, and <a href="https://github.com/oli-obk/">oli-obk</a>, though earlier proposals also involved a few
others.</p>
<h3 id="updates">Updates</h3>
<p>(I made various simplifications shortly after publishing, aiming to
keep the length of this blog post under control and remove what seemed
to be somewhat duplicated content.)</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">MIR-based borrow check (NLL) status update</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/06/15/mir-based-borrow-check-nll-status-update/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/06/15/mir-based-borrow-check-nll-status-update/</id><published>2018-06-15T00:00:00+00:00</published><updated>2018-06-15T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve been getting a lot of questions about the status of &ldquo;Non-lexical
lifetimes&rdquo; (NLL) &ndash; or, as I prefer to call it these days, the
MIR-based borrow checker &ndash; so I wanted to post a status
update.</p>
<p><strong>The single most important fact is that the MIR-based borrow check is
feature complete and available on nightly. What this means is that
the behavior of <code>#![feature(nll)]</code> is roughly what we intend to ship
for &ldquo;version 1&rdquo;, except that (a) the performance needs work and (b) we
are still improving the diagnostics.</strong> (More on those points later.)</p>
<p>The MIR-based borrow check as currently implemented represents a huge
step forward from the existing borrow checker, for two reasons.
First, it eliminates a ton of borrow check errors, resulting in a much
smoother compilation experience. Second, it has a lot fewer bugs. More
on this point later too.</p>
<p>You may be wondering how this all relates to the &ldquo;alias-based borrow
check&rdquo; that I outlined in <a href="https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/">my previous post</a>, which we have since
dubbed <a href="https://github.com/rust-lang-nursery/polonius/">Polonius</a>. We have implemented that analysis and solved the
performance hurdles that it used to have, but it will still take some
effort to get it fully ready to ship. The plan is to defer that work
and ultimately ship Polonius as a second step: it will basically be a
&ldquo;MIR-based borrow check 2.0&rdquo;, offering even fewer errors.</p>
<h3 id="would-you-like-to-help">Would you like to help?</h3>
<p>If you&rsquo;d like to be involved, we&rsquo;d love to have you! The NLL working
group hangs out <a href="https://rust-lang.zulipchat.com/#narrow/stream/122657-wg-nll">on the <code>#wg-nll</code> stream in Zulip</a>. We have
weekly meetings on Tuesdays (3:30pm Eastern time) where we discuss the
priorities for the week and try to dole out tasks. If that time
doesn&rsquo;t work for you, you can of course pop in any time and
communicate asynchronously. You can also always go look for work to do
amongst <a href="https://github.com/rust-lang/rust/issues?utf8=%E2%9C%93&amp;q=is%3Aopen+label%3AWG-compiler-nll+-label%3ANLL-deferred">the list of GitHub issues</a> &ndash; probably the <a href="https://github.com/rust-lang/rust/labels/NLL-diagnostics">diagnostics
issues</a> are the best place to start.</p>
<h3 id="transition-period">Transition period</h3>
<p>As I mentioned earlier, the MIR-based borrow checker <a href="https://github.com/rust-lang/rust/labels/NLL-fixed-by-NLL">fixes a lot of
bugs</a> &ndash; this is largely a side effect of making the check operate
over the <a href="https://blog.rust-lang.org/2016/04/19/MIR.html">MIR</a>. This is great! However, as a result, we can&rsquo;t just
&ldquo;flip the switch&rdquo; and enable the MIR-based borrow checker by default,
since that would break existing crates (I don&rsquo;t really know how many
yet). The plan therefore is to have a transition period.</p>
<p>During the transition period, we will issue warnings if your program
<em>used</em> to compile with the old borrow checker but doesn&rsquo;t with the new
checker (because we fixed a bug in the borrow check). The way we do
this is to run <em>both</em> the old and the new borrow checker. If the new
checker would report an error, we first check if the old check would
<em>also</em> report an error. If so, we can issue the error as normal. If
not, we issue only a warning, since that represents a case that used
to compile but no longer does.</p>
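<p>As a rough sketch of that decision rule, here is one way the two verdicts could be combined. The <code>Outcome</code> enum and the <code>transition_outcome</code> function are hypothetical names chosen for illustration; this is not how rustc actually structures the check.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Hypothetical sketch: these names are illustrative and are not rustc's API.

/// What gets reported for one program during the transition period.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    /// The new (MIR-based) checker found no problem.
    Accepted,
    /// Only the new checker rejects: the code used to compile, so just warn.
    Warning,
    /// Both checkers reject: report the error as normal.
    Error,
}

/// Combine the verdicts of the old and the new borrow checker.
fn transition_outcome(old_rejects: bool, new_rejects: bool) -> Outcome {
    match (new_rejects, old_rejects) {
        // The old checker is only consulted when the new one reports an error.
        (false, _) => Outcome::Accepted,
        (true, true) => Outcome::Error,
        (true, false) => Outcome::Warning,
    }
}
</code></pre>
<p>Under these assumptions, <code>transition_outcome(false, true)</code> evaluates to <code>Outcome::Warning</code>, which is the &ldquo;used to compile but no longer does&rdquo; case.</p>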
<p>The good news is that while the MIR-based checker fixes a lot of bugs,
it also accepts a lot more code. This lessens the overall impact. That
is, there is a lot of code which ought to have gotten errors from the
old borrow check (but never did), but most of that code won&rsquo;t get any
errors at all under the new check. No harm, no foul. =)</p>
<h3 id="performance">Performance</h3>
<p>One of the main things we are working on is the performance of the
MIR-based checker, since enabling the MIR-based borrow checker
currently implies significant overhead during compilation. Take a look
at this chart, which plots rustc build times for the <a href="https://crates.io/crates/clap"><code>clap</code>
crate</a>:</p>
<p><img src="https://i.imgur.com/kyqmx4I.png" alt="clap-rs performance"></p>
<p>The black line (&ldquo;clean&rdquo;) represents the &ldquo;from scratch&rdquo; build time with
rustc today. The orange line (&ldquo;nll&rdquo;) represents &ldquo;from scratch&rdquo; build
times when NLL is enabled. (The other lines represent incremental
build times in various combinations.) You can see we&rsquo;ve come a long
way, but there is still plenty of work to do.</p>
<p>The biggest problem at this point is that we effectively have to
&ldquo;re-run&rdquo; the type check a second time on the MIR, in order to compute
all the lifetimes. This means we are doing two type-checks, and that
is expensive.  However, this second type check can be significantly
simpler than the original: most of the &ldquo;heavy lifting&rdquo; has been
done. Moreover, there are lots of opportunities to cache work between
them so that it only has to be done once. So I&rsquo;m confident we&rsquo;ll make
big strides here. (For example, I&rsquo;ve got a <a href="https://github.com/rust-lang/rust/pull/51460">PR up right now</a>
that <a href="http://perf.rust-lang.org/compare.html?start=61d88318aa66669fba061e9af529365172d63cd0&amp;end=757cd050fc1ef84d7235d6f4d9228189eed878cc&amp;stat=instructions%3Au">adds some simple memoization for a 20% win</a>, and I&rsquo;m
working on follow-ups that add much more aggressive memoization.)</p>
<p>(There is an interesting corollary to this: after the transition
period, the first type check will have no need to consider lifetimes
<em>at all</em>, which I think means we should be able to make it run quite a
bit faster as well, which should mean a shorter &ldquo;time till first
error&rdquo; and also help things like computing autocompletion information
for the RLS.)</p>
<h3 id="diagnostics">Diagnostics</h3>
<p>It&rsquo;s not enough to point out problems in the code; we also have to
explain the error in an understandable way. We&rsquo;ve put a lot of effort
into our existing borrow checker&rsquo;s error messages. In some cases, the
MIR-based borrow checker actually does better here.  It has access to
more information, which means it can be more specific than the older
checker. As an example<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, consider this error that the old borrow
checker gives:</p>
<pre tabindex="0"><code>error[E0597]: `json` does not live long enough
  --&gt; src\main.rs:38:17
   |
38 |         let v = json[&#34;data&#34;][&#34;search&#34;][&#34;edges&#34;].as_array();
   |                 ^^^^ borrowed value does not live long enough
...
52 |     }
   |     - `json` dropped here while still borrowed
...
90 | }
   | - borrowed value needs to live until here
</code></pre><p>The error isn&rsquo;t bad, but you&rsquo;ll note that while it says &ldquo;borrowed
value needs to live until here&rdquo; it doesn&rsquo;t tell you <em>why</em> the borrowed
value needs to live that long &ndash; only that it does. Compare that to the
new error you get from the same code:</p>
<pre tabindex="0"><code>error[E0597]: `json` does not live long enough
  --&gt; src\main.rs:39:17
   |
39 |         let v = json[&#34;data&#34;][&#34;search&#34;][&#34;edges&#34;].as_array();
   |                 ^^^^ borrowed value does not live long enough
...
53 |     }
   |     - borrowed value only lives until here
...
70 |             &#34;, last_cursor))
   |                ----------- borrow later used here
</code></pre><p>Rather than telling you &ldquo;how long&rdquo; the borrow must last, the new
error points to a concrete use. That&rsquo;s great.</p>
<p>Other times, though, the errors from the new checker are not as good.
This is particularly true when it comes to suggestions and tips for
how to fix things. We&rsquo;ve gone through all of our internal diagnostic
tests and drawn up a <a href="https://github.com/rust-lang/rust/labels/NLL-diagnostics">list of about 37
issues</a>,
documenting each point where the new checker&rsquo;s message is not as good as
the old one, and we&rsquo;re now working through this list.</p>
<h3 id="polonius">Polonius</h3>
<p>In my <a href="https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/">previous blog post</a>, I described a new version of the
borrow check, which we have since dubbed <a href="https://github.com/rust-lang-nursery/polonius/">Polonius</a>. That analysis
further improves on the MIR-based borrow check that is in Nightly
now. The most significant improvement that Polonius brings has to do
with &ldquo;conditional returns&rdquo;.  Consider this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">vec</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">vec</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">if</span><span class="w"> </span><span class="n">some_condition</span><span class="p">(</span><span class="n">r</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">return</span><span class="w"> </span><span class="n">r</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Question: can we mutate `vec` here? On Nightly,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// you get an error, because a reference that is returned (like `r`)
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// is considered to be in scope until the end of the function,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// even if that return only happens conditionally. Polonius can
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// accept this code.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">vec</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In this example, <code>vec</code> is borrowed to produce <code>r</code>, and <code>r</code> is then
returned &ndash; but only <em>sometimes</em>. In the MIR borrowck on Nightly, this
will give an error &ndash; when <code>r</code> is returned, the borrow is forced to
last until the end of <code>foo</code>, no matter what path we take. The Polonius
analysis is more precise, and understands that, outside of the <code>if</code>,
<code>vec</code> is no longer referenced by any live references.</p>
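<p>As a rough sketch (an illustration, not code from the compiler or its test suite), here is one way a function of this shape can be restructured so that the MIR-based checker accepts it without Polonius. The borrow that gets returned is created inside the <code>if</code>, so no loan is live by the time <code>push</code> runs. The name <code>first_positive_or_push</code>, the element type <code>i32</code>, and the &ldquo;is positive&rdquo; condition are all assumptions made for the example.</p>

```rust
/// Hypothetical variant of `foo`: returns the first element if it is
/// positive, otherwise pushes a new element and returns that one.
fn first_positive_or_push(vec: &mut Vec<i32>) -> &i32 {
    // The borrow used for the comparison is temporary and ends with the
    // condition, so it does not constrain the rest of the function.
    if !vec.is_empty() && vec[0] > 0 {
        // The returned borrow is created on this path only.
        return &vec[0];
    }

    // No reference derived from `vec` is live here, so mutation is allowed.
    vec.push(1);
    &vec[vec.len() - 1]
}
```

<p>The cost of this workaround is that <code>vec[0]</code> is indexed twice on the early-return path; the appeal of the Polonius analysis is that the original, more direct formulation could be accepted as written.</p>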
<p>We originally intended for NLL to accept examples like this: in <a href="https://rust-lang.github.io/rfcs/2094-nll.html">the
RFC</a>, this was called <a href="https://rust-lang.github.io/rfcs/2094-nll.html#problem-case-3-conditional-control-flow-across-functions">Problem Case #3</a>. However, we had to
remove that support because it was simply killing compilation times,
and there were also cases where it wasn&rsquo;t as precise as we wanted.  Of
course, some of you may recall that in my <a href="https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/">previous post about
Polonius</a> I wrote:</p>
<blockquote>
<p>&hellip;the performance has a long way to go ([Polonius] is currently
slower than existing analysis).</p>
</blockquote>
<p>I&rsquo;m happy to report that this problem is basically solved. Despite the
increased precision, the Polonius analysis is now easily as fast as
the existing Nightly analysis, thanks to some smarter encoding of the
rules as well as the move to use
<a href="https://github.com/frankmcsherry/blog/blob/master/posts/2018-05-19.md">datafrog</a>.
We&rsquo;ve not done detailed comparisons yet, though.</p>
<p>If you&rsquo;d like, you can try Polonius today using the <code>-Zpolonius</code>
switch to Nightly. However, keep in mind that it is in a
&lsquo;pre-alpha&rsquo; state: there are still some known bugs that we have not
prioritized fixing and so forth.</p>
<h3 id="conclusion">Conclusion</h3>
<p>The key take-aways here:</p>
<ul>
<li>NLL is in a &ldquo;feature complete&rdquo; state on Nightly.</li>
<li>We are doing a focused push on diagnostics and performance, primarily.</li>
<li>Even once it ships, we can expect further improvements in the
future, as we bring in the Polonius analysis.</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Hat tip to steveklabnik for providing this example!&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">An alias-based formulation of the borrow checker</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/</id><published>2018-04-27T00:00:00+00:00</published><updated>2018-04-27T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Ever since the Rust All Hands, I&rsquo;ve been experimenting with an
alternative formulation of the Rust borrow checker. The goal is to
find a formulation that overcomes some shortcomings of the current
proposal while hopefully also being faster to compute. I have
implemented a prototype for this analysis. It passes the full NLL test
suite and also handles a few cases &ndash; such as <a href="https://github.com/rust-lang/rust/issues/47680#issuecomment-363131420">#47680</a> &ndash; that the
current NLL analysis cannot handle. However, the performance has a
long way to go (it is currently slower than the existing analysis). That
said, I haven&rsquo;t even begun to optimize yet, and I know I am doing some
naive and inefficient things that can definitely be done better; so I
am still optimistic we&rsquo;ll be able to make big strides there.</p>
<p>Also, it was pointed out to me that yesterday, April 26, was the sixth
&ldquo;birthday&rdquo; of the borrow check &ndash; it&rsquo;s fun to look at <a href="https://github.com/rust-lang/rust/commit/50a3dd40ae8ae6494e55d5cfc29eafdb4172af52">my commit from
that time</a>, which gives a good picture of what Rust was like then.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<h3 id="end-users-dont-have-to-care">End-users don&rsquo;t have to care</h3>
<p>The first thing to note is that this proposal <strong>makes no difference
from the point of view of an end-user of Rust</strong>. That is, the borrow
checker ought to work the same as it would have under the NLL
proposal, more or less.</p>
<p>However, there are some subtle shifts in this proposal in terms of how
the compiler thinks about your program, and that could potentially
affect future language features.</p>
<h3 id="our-first-example">Our first example</h3>
<p>The analysis works on MIR, but I&rsquo;m going to explain it in terms of
simple Rust examples. Here is the first example, which I will call
example A. The example should not compile, as you can see:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">x</span><span class="p">;</span><span class="w"> </span><span class="c1">// 1. `x` is borrowed here to create `p`
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">r</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">        </span><span class="c1">// 2. `p` is stored into `v`, but through `r`
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">           </span><span class="c1">// &lt;-- Error! can&#39;t mutate `x` while borrowed
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">take</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">          </span><span class="c1">// 3. the reference to `x` is later used here
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">p</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="regions-are-sets-of-loans">Regions are sets of loans</h3>
<p>The biggest shift in this new approach is that when you have a type
like <code>&amp;'a i32</code>, the meaning of <code>'a</code> changes:</p>
<ul>
<li>In the system described in the NLL RFC, <code>'a</code> &ndash; called a lifetime &ndash;
ultimately corresponded to some portion of the source program or
control-flow graph.</li>
<li>Under <em>this</em> proposal, <code>'a</code> &ndash; which I will be calling a region<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> &ndash;
instead corresponds to a set of <strong>loans</strong> &ndash; that is, a set of
borrow expressions, like <code>&amp;x</code> or <code>&amp;mut v</code> in Example A. The idea is
that if a reference <code>r</code> has type <code>&amp;'a i32</code> then invalidating the <strong>terms
of any of the loans</strong> in <code>'a</code> would invalidate <code>r</code>.</li>
</ul>
<p>Invalidating the <strong>terms of a loan</strong> means to perform an illegal
access of the path borrowed by the loan. So for example if you have a
mutable loan like <code>r = &amp;mut v</code>, then you can only access the value <code>v</code>
through the reference <code>r</code>. Accessing <code>v</code> directly in any way &ndash; read,
write, or move &ndash; would invalidate the loan. For a shared loan like <code>p = &amp;x</code>, reading through <code>x</code> (or <code>p</code>) is allowed, but writing or
mutating <code>x</code> would invalidate the terms of the loan (and writing
through <code>p</code> is also not possible).</p>
<p>The subtyping rules for references work a bit differently now that a
region is a set of loans and not program points. Whereas with points,
you can approximate a reference by shortening the lifetime, with sets
of loans you can approximate by enlarging the set. In other words:</p>
<pre tabindex="0"><code>&#39;a ⊆ &#39;b
------------------
&amp;&#39;a u32 &lt;: &amp;&#39;b u32
</code></pre><p>In Rust syntax, <code>'a ⊆ 'b</code> corresponds to the notation <code>'a: 'b</code>, and
that is what I will use for the rest of the post. We have
traditionally called this an <em>outlives relationship</em>, but I am going
to call it a <em>subset relationship</em> instead, as befits the new meaning
of regions<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>.</p>
<p>To gain a better intuition for the idea of regions as sets of loans, consider
this program:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="mi">2</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">random</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">&amp;</span><span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w"> </span><span class="c1">// Loan L0
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">&amp;</span><span class="n">x</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="w"> </span><span class="c1">// Loan L1
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>Here, the region <code>'a</code> would correspond to the set <code>{L0, L1}</code>, since it
may refer to data produced by the loan L0, but it may also refer to
data from the loan L1.</p>
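<p>The &ldquo;regions are sets of loans&rdquo; idea and the subtyping rule above can be sketched in a few lines of Rust. This is only a toy model: representing a loan as a string label and a region as a <code>BTreeSet</code> is an assumption made for illustration, as are the helper names <code>region</code>, <code>is_subregion</code>, and <code>join</code>; none of this is how the prototype stores its data.</p>

```rust
use std::collections::BTreeSet;

/// A loan, identified by a label such as "L0" (hypothetical representation).
type Loan = &'static str;

/// Under the alias-based view, a region is just a set of loans.
type Region = BTreeSet<Loan>;

/// Builds a region from a list of loan labels.
fn region(loans: &[Loan]) -> Region {
    loans.iter().copied().collect()
}

/// `&'a T <: &'b T` holds when `'a ⊆ 'b`: approximating a reference means
/// enlarging the set of loans it may depend on.
fn is_subregion(a: &Region, b: &Region) -> bool {
    a.is_subset(b)
}

/// The region of a value that may come from either of two sources is the
/// union of their loans, as with `p` in the `if random()` example.
fn join(a: &Region, b: &Region) -> Region {
    a.union(b).cloned().collect()
}
```

<p>With these definitions, joining <code>{L0}</code> and <code>{L1}</code> gives <code>{L0, L1}</code>, and each arm&rsquo;s region is a subregion of the result but not the other way around.</p>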
<h3 id="datalog">Datalog</h3>
<p>Throughout this post, I&rsquo;m going to be defining the analysis by using
<a href="https://en.wikipedia.org/wiki/Datalog">Datalog</a> rules. Datalog is &ndash; in some sense &ndash; a subset of Prolog
designed for efficient execution. A Datalog program basically consists of rules
like this (using the syntax from the <a href="https://github.com/oracle/souffle/wiki">Souffle</a> project):</p>
<pre tabindex="0"><code>.decl cfg_edge(P:point, Q:point)
.input cfg_edge

.decl reachable(P:point, Q:point)
reachable(P, Q) :- cfg_edge(P, Q).
reachable(P, R) :- reachable(P, Q), cfg_edge(Q, R).
</code></pre><p>As you can see here, Datalog programs define relations between things;
here those relations are declared with <code>.decl</code><sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>. Some relations
are <strong>inputs</strong>, declared with <code>.input</code>, which means that their values
are given up-front by the user (these are also called facts). In this
program, that is <code>cfg_edge</code>. Other relations, like <code>reachable</code>, are
defined via rules which synthesize new things from those facts. As in
Prolog, upper-case identifiers are variables, and whenever a variable
appears twice, it must have the same value.</p>
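<p>To get a feel for what evaluating such rules involves, here is a minimal sketch that computes <code>reachable</code> by naive bottom-up iteration: start from the facts and keep applying the rules until nothing new appears. It is written by hand for this one program and is not how Souffle or any real Datalog engine works internally; the <code>Point</code> alias and the function signature are assumptions made for the example.</p>

```rust
use std::collections::BTreeSet;

/// A control-flow point, identified by an index (hypothetical representation).
type Point = u32;

/// Computes the `reachable` relation from the `cfg_edge` facts by applying
/// the two rules repeatedly until a fixed point is reached.
fn reachable(cfg_edge: &[(Point, Point)]) -> BTreeSet<(Point, Point)> {
    // Rule 1: reachable(P, Q) :- cfg_edge(P, Q).
    let mut result: BTreeSet<(Point, Point)> = cfg_edge.iter().copied().collect();

    loop {
        let mut new_tuples = Vec::new();

        // Rule 2: reachable(P, R) :- reachable(P, Q), cfg_edge(Q, R).
        // The shared variable Q becomes the equality test `q == q2`.
        for &(p, q) in &result {
            for &(q2, r) in cfg_edge {
                if q == q2 && !result.contains(&(p, r)) {
                    new_tuples.push((p, r));
                }
            }
        }

        // No rule produced anything new: this is the fixed point.
        if new_tuples.is_empty() {
            return result;
        }
        result.extend(new_tuples);
    }
}
```

<p>Because the set of possible tuples over a finite set of points is itself finite, the loop must eventually stop adding tuples, even when the graph contains cycles.</p>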
<p>Note that, because it is a subset, Datalog avoids a lot of Prolog&rsquo;s
more &lsquo;programming language&rsquo;-like properties. For example, Datalog
programs always terminate when executed on a finite set of facts (even
when they recurse, like the one above). Also, it is fine to use
negative reasoning in a Datalog program: because Datalog disallows negative
cycles, there are no subtle concerns about the distinction between
&ldquo;logical not&rdquo; and &ldquo;negation as failure&rdquo;.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></p>
<p>To implement these rules, I&rsquo;ve been using Frank McSherry&rsquo;s awesome
<a href="https://crates.io/crates/differential-dataflow">differential-dataflow</a> crate. This has been a pretty great
experience: once you get the hang of it, you can translate Datalog
rules in a very straightforward way, which means that I&rsquo;ve been able
to rapidly prototype new designs in just an hour or two. Moreover, the
resulting execution is quite fast (though I&rsquo;ve not measured
performance too much on the latest design).</p>
<h3 id="region-variables">Region variables</h3>
<p>Now that we&rsquo;ve described regions as sets of loans, I want you to throw
all of that away. The analysis as I&rsquo;ve defined it doesn&rsquo;t directly
manipulate those sets, at least not initially. Instead, it uses
&ldquo;region variables&rdquo; to represent all the regions in the program. I&rsquo;ll
denote these as &ldquo;numbered&rdquo; regions like <code>'0</code>, <code>'1</code>, etc.</p>
<p>If we then rewrite our program to use these abstract regions
(basically, to have a numbered region everywhere that MIR would have
one), it looks like the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">2</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">3</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">5</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">4</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">r</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">take</span>::<span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">6</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">p</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>These abstract regions will appear throughout our Datalog rules; I&rsquo;ll
denote them with <code>R</code> for &ldquo;region&rdquo;.</p>
<h3 id="relations-between-regions">Relations between regions</h3>
<p>The abstract regions we saw before don&rsquo;t have any meaning just
yet. What happens next is that we walk through and apply the type
system rules in the standard way. This will result in &ldquo;subset&rdquo;
relationships between regions, as we saw before. So for example
consider the following line from Example A:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">5</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">4</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>Here, the expression <code>&amp;'4 x</code> produces a value of type <code>&amp;'4 i32</code>. This
type must be a subtype of the type of <code>p</code>, <code>&amp;'5 i32</code>, so we get:</p>
<pre><code>&amp;'4 i32 &lt;: &amp;'5 i32
</code></pre>
<p>which in turn requires <code>'4: '5</code>. If we look at the program, we&rsquo;ll see
a number of subtype relationships emerge. I&rsquo;ll write down each one
along with the resulting subset relationships.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">2</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">3</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires: &amp;&#39;3 mut Vec&lt;&amp;&#39;0 i32&gt; &lt;: &amp;&#39;1 mut Vec&lt;&amp;&#39;2 i32&gt;
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//        =&gt; &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">5</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">4</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires: &amp;&#39;4 i32 &lt;: &amp;&#39;5 i32
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//        =&gt; &#39;4: &#39;5
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">r</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires: &amp;&#39;5 i32 &lt;: &amp;&#39;2 i32
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//        =&gt; &#39;5: &#39;2
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">take</span>::<span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">6</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires: Vec&lt;&amp;&#39;0 i32&gt; &lt;: Vec&lt;&amp;&#39;6 i32&gt;
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//        =&gt; &#39;0: &#39;6
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">p</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Ultimately, these subset relationships become input facts into the
system. For reasons that will become clear later on, I call these the
&ldquo;base subset&rdquo; relations:</p>
<pre tabindex="0"><code>.decl base_subset(R1:region, R2:region, P:point)
.input base_subset
</code></pre><p>In other words, <code>base_subset(R1, R2, P)</code> means <code>R1: R2</code> was required
to be true at the point <code>P</code>.</p>
<p>We&rsquo;ll see in a second that this <code>base_subset</code> input is only the
starting point &ndash; it tells you which relations were directly required
to begin with, but it doesn&rsquo;t tell you the full set of relations at
any point; this is because the subset relations &ldquo;accumulate&rdquo; as you
move through the control-flow graph, so each point must satisfy both the older relations <em>and</em> the newer
ones. We&rsquo;re going to define a more complete <code>subset</code> relation that
includes both, but before we can get there, we have to look at how we
define the control-flow graph.</p>
<h3 id="points-in-the-control-flow-graph">Points in the control-flow graph</h3>
<p>The control-flow graph used by this analysis is defined based on the
MIR. We define the points in the flow-graph as follows:</p>
<pre tabindex="0"><code>Point = Start(Statement) | Mid(Statement)
Statement = BBi &#39;/&#39; j
</code></pre><p>Here, the <code>Statement</code> identifies a particular statement (the <code>j</code>th
statement from the <code>i</code>th basic block). We then distinguish the <strong>start
point</strong> of a statement from the <strong>mid point</strong>. The start point is
basically &ldquo;before it has done anything&rdquo;, and the &ldquo;mid point&rdquo; is the
place where the statement is executing. As such, all the base-subset
relationships from the previous section are defined to occur at the
mid-point of their corresponding statements.</p>
<p>We define the flow in the graph using a <code>cfg_edge</code> input:</p>
<pre tabindex="0"><code>.decl cfg_edge(P:point, Q:point)
.input cfg_edge
</code></pre><p>Naturally, every start point has an edge to its corresponding mid
point.  Mid points have an edge to the start of the next statement or,
in the case of a terminator, to the start of the basic blocks that
follow.</p>
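<p>To make the point structure concrete, here is a minimal Rust sketch. It is not the actual implementation: the names <code>Statement</code>, <code>Point</code>, and <code>block_edges</code> are illustrative assumptions, and the sketch only builds the edges inside a single basic block, leaving out the edges from a terminator to its successor blocks.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Illustrative sketch only: these names are assumptions, not the real implementation.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Statement {
    block: usize, // the `i` in `BBi/j`
    index: usize, // the `j` in `BBi/j`
}

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Point {
    Start(Statement),
    Mid(Statement),
}

/// Builds the `cfg_edge` facts within one basic block holding `len` statements:
/// Start(s) -&gt; Mid(s), and Mid(s) -&gt; Start(next statement).
/// Edges leaving a terminator to successor blocks are omitted here.
fn block_edges(block: usize, len: usize) -&gt; Vec&lt;(Point, Point)&gt; {
    let mut edges = Vec::new();
    for index in 0..len {
        let s = Statement { block, index };
        edges.push((Point::Start(s), Point::Mid(s)));
        if index + 1 &lt; len {
            let next = Statement { block, index: index + 1 };
            edges.push((Point::Mid(s), Point::Start(next)));
        }
    }
    edges
}
</code></pre><p>For a block with two statements, <code>block_edges(0, 2)</code> yields three edges: start to mid for the first statement, mid of the first to start of the second, and start to mid for the second.</p>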
<p>(For the most part, you can ignore mid-points for now, but they become
very important later on as we integrate notions of liveness.)</p>
<h3 id="tracking-subset-relationships-across-the-graph">Tracking subset relationships across the graph</h3>
<p>Now we come to the most interesting part of the analysis: computing
the subset relations. In the interest of building intuitions, I&rsquo;m
going to start by presenting a simpler form of this than the final
analysis; then we&rsquo;ll come back and make it a bit more complex.</p>
<p>The key idea here is that the analysis doesn&rsquo;t directly compute the
values of each region variable. Instead, it computes the <strong>subset
relationships</strong> that have to hold between them at each point in the
control-flow graph. These relationships are introduced by the &ldquo;base
subset&rdquo; relationships that result from the type-check, but they are
then propagated across control-flow edges, according to the following
rule:</p>
<ul>
<li>Once a base subset relationship is introduced between two regions <code>'a: 'b</code>, it must remain true.</li>
</ul>
<p>We can define this in datalog like so. We start with a relation <code>subset</code>:</p>
<pre tabindex="0"><code>.decl subset(R1:region, R2:region, P:point)
</code></pre><p>The idea is that if <code>subset(R1, R2, P)</code> is defined, then <code>R1: R2</code> must
hold at the point <code>P</code>. We can start with the &ldquo;base subset&rdquo; relations
that are supplied by the type checker:</p>
<pre tabindex="0"><code>// Rule subset1
subset(R1, R2, P) :- base_subset(R1, R2, P).
</code></pre><p>Subset is transitive, so we can define that too:</p>
<pre tabindex="0"><code>// Rule subset2
subset(R1, R3, P) :- subset(R1, R2, P), subset(R2, R3, P).
</code></pre><p>Finally, we define a rule that propagates subset relationships across
the control-flow graph edges:<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<pre tabindex="0"><code>// Rule subset3 (version 1)
subset(R1, R2, Q) :- subset(R1, R2, P), cfg_edge(P, Q).
</code></pre><p>Easy peasy, lemon squeezy, as my daughter likes to say. If we apply
these rules to our Example A, we wind up with the following subset
relationships in between each statement (I&rsquo;m only showing the
relationships at each &ldquo;start&rdquo; point here, and I&rsquo;m not showing the full
transitive closure). Note that they just keep growing:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// (none)
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// (none)
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">2</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">3</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">5</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">4</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">r</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;5: &#39;2,
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;5: &#39;2,
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">take</span>::<span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">6</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;5: &#39;2, &#39;0: &#39;6
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">p</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Consider the final set of relationships. Based on this, we can see
some interesting stuff. For example, we can see a relationship between
the region <code>'4</code> (that is, the region from the borrow of <code>x</code>) and the
region <code>'0</code> (that is, the region for the data in the vector <code>v</code>):</p>
<pre><code>'4: '5: '2: '0
</code></pre>
<p>This is basically reflecting the flow of data in your program. If you
think of each region as representing a &ldquo;set of loans&rdquo;, then this is
saying that <code>'0</code> (that is, the vector) may hold references that
are derived from that <code>&amp;x</code> statement. This leads to our next piece of the
analysis.</p>
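<p>The chain above falls out of the transitivity rule (subset2). As a rough sketch, assuming regions are encoded as plain integers so that <code>(4, 5)</code> stands for <code>'4: '5</code>, the closure at a single point can be computed with a naive fixed-point loop. The names <code>transitive_closure</code> and <code>example_a_final</code> are hypothetical, and a real Datalog engine would evaluate this far more efficiently.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::HashSet;

// Illustrative sketch: regions are plain integers, so `(4, 5)` stands for `'4: '5`.
type Region = u32;

/// Applies the transitivity rule (subset2) at a single point until nothing new is derived.
fn transitive_closure(base: &amp;[(Region, Region)]) -&gt; HashSet&lt;(Region, Region)&gt; {
    let mut subset: HashSet&lt;(Region, Region)&gt; = base.iter().copied().collect();
    loop {
        let mut derived = Vec::new();
        for &amp;(r1, r2) in &amp;subset {
            for &amp;(r2b, r3) in &amp;subset {
                if r2 == r2b &amp;&amp; !subset.contains(&amp;(r1, r3)) {
                    derived.push((r1, r3));
                }
            }
        }
        if derived.is_empty() {
            return subset;
        }
        subset.extend(derived);
    }
}

/// The relations that hold after the final statement of Example A.
fn example_a_final() -&gt; Vec&lt;(Region, Region)&gt; {
    vec![(3, 1), (0, 2), (2, 0), (4, 5), (5, 2), (0, 6)]
}
</code></pre><p>Calling <code>transitive_closure(&amp;example_a_final())</code> produces a set that contains <code>(4, 0)</code>, matching the chain <code>'4: '5: '2: '0</code>, but not <code>(3, 0)</code>, since the region of the <code>&amp;mut v</code> borrow never flows into the vector&rsquo;s contents.</p>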
<h3 id="borrow-regions">Borrow regions</h3>
<p>So far, we introduced the <em>subset</em> relation that shows the
relationships between region variables and showed how that can be
extended to the control-flow graph. We&rsquo;re going to do the same now for
tracking which regions depend on which loans.</p>
<p>First off, we introduce a new input, called <code>borrow_region</code>:</p>
<pre tabindex="0"><code>.decl borrow_region(R:region, L:loan, P:point)
.input borrow_region
</code></pre><p>This input is defined for each borrow expression (e.g., <code>&amp;x</code> or <code>&amp;mut v</code>)
in the program. It relates the region from the borrow to the abstract
loan that is created. Here is Example A, annotated with the borrow-regions
that are created at each point:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">2</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">3</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// borrow_region(&#39;3, L0)
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">5</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">4</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// borrow_region(&#39;4, L1)
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">r</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">take</span>::<span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">6</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">p</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Like the <code>base_subset</code> relations, <code>borrow_region</code> facts are created at the
mid-point of the corresponding borrow statement.</p>
<h3 id="live-regions-and-loans">Live regions and loans</h3>
<p>In normal compiler parlance, a variable X is <strong>live</strong> at some point P
in the control-flow graph if <strong>its current value may be used later</strong>
(more formally, if there is some path from P to Q, where Q uses X, and
X is not assigned along that path).</p>
<p>We can make an analogous definition for regions: a region <code>'a</code> is
<strong>live</strong> at some point <code>P</code> if some reference with type <code>&amp;'a i32</code> may be
dereferenced later. For the most part, this just means that there is a
live variable <code>X</code> and that <code>'a</code> appears in the type of <code>X</code>. There is,
however, some subtlety around drops, since we try to be clever and
understand which regions a destructor might use and which it will not
(e.g., we know that a value of type <code>Vec&lt;&amp;'a u32&gt;</code> will not access
<code>'a</code> when it is dropped). I&rsquo;m not going into the details of how that
works here; it&rsquo;s the same as what was defined in the <a href="https://rust-lang.github.io/rfcs/2094-nll.html">NLL RFC</a>.</p>
<p>In terms of the Datalog, we can define an input <code>region_live_at</code> like so:</p>
<pre tabindex="0"><code>.decl region_live_at(R:region, P:point)
.input region_live_at
</code></pre><p>The initial values here are computed just as in the NLL RFC.</p>
<h3 id="the-requires-relation">The &ldquo;requires&rdquo; relation</h3>
<p>Now we can extend the <code>borrow_region</code> relation across the control-flow
graph.  As before, we introduce a new relation, called <code>requires</code>:</p>
<pre tabindex="0"><code>.decl requires(R:region, L:loan, P:point)
</code></pre><p>This can be read as</p>
<blockquote>
<p>The region R requires the terms of the loan L to be enforced at the point P.</p>
</blockquote>
<p>Or, to put it another way:</p>
<blockquote>
<p>If the terms of the loan L are violated at the point P, then the region R is invalidated.</p>
</blockquote>
<p>(I don&rsquo;t love the name &ldquo;requires&rdquo;, but I haven&rsquo;t thought of a better one yet.)</p>
<p>The first rule says that the region for a borrow is always dependent on its
corresponding loan:</p>
<pre tabindex="0"><code>// Rule requires1
requires(R, L, P) :- borrow_region(R, L, P).
</code></pre><p>The next rule says that if <code>R1: R2</code>, then <code>R2</code> depends on any loans that <code>R1</code> depends on:</p>
<pre tabindex="0"><code>// Rule requires2
requires(R2, L, P) :- requires(R1, L, P), subset(R1, R2, P).
</code></pre><p>Finally, we can propagate these requirements across control-flow
edges, just as with subsets. But here, there is a twist:</p>
<pre tabindex="0"><code>// Rule requires3 (version 1)
requires(R, L, Q) :-
  requires(R, L, P),
  !killed(L, P),
  cfg_edge(P, Q).
</code></pre><p>This rule says that if the region <code>R</code> requires the loan <code>L</code> at <code>P</code>,
then it also requires <code>L</code> at the successor <code>Q</code> &ndash; <em>so long as <code>L</code> is
not &ldquo;killed&rdquo; at <code>P</code></em>. So what is this <code>!killed(L, P)</code> rule? The killed
input relation is defined as follows:</p>
<pre tabindex="0"><code>.decl killed(L:loan, P:point)
.input killed
</code></pre><p><code>killed(L, P)</code> is defined when the point <code>P</code> is an assignment that
overwrites one of the references whose referent was borrowed in the
loan <code>L</code>. Imagine you have something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">q</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">44</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">p</span><span class="p">;</span><span class="w"> </span><span class="c1">// `x` points at `p`
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="n">x</span><span class="p">;</span><span class="w"> </span><span class="c1">// Loan L0, `y` points at `p` too
</span></span></span><span class="line"><span class="cl"><span class="c1">// ...
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">q</span><span class="p">;</span><span class="w"> </span><span class="c1">// `x` points at `q`; kills L0
</span></span></span></code></pre></div><p>Here, <code>x</code> initially referenced <code>p</code>, and that is copied into <code>y</code>. At
this point (where we see <code>...</code>), accessing <code>*x</code> is illegal, because
<code>y</code> has borrowed it. But then <code>x</code> is reassigned to point at <code>q</code>
instead &ndash; now accessing <code>*x</code> doesn&rsquo;t alias <code>*y</code> anymore. This is
reflected by <em>killing</em> the loan L0, thus indicating that <code>y</code> would no
longer be invalidated by accessing <code>*x</code>.</p>
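<p>As a hedged illustration of rule requires3 (version 1), the following Rust sketch propagates <code>requires</code> facts along control-flow edges and stops at any point where the loan is killed. Regions, loans, and points are encoded as plain integers, and the name <code>propagate_requires</code> is a hypothetical one chosen for this example; it is not how the real analysis is implemented.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::HashSet;

// Illustrative sketch: regions, loans, and points are plain integers.
type Region = u32;
type Loan = u32;
type Point = u32;

/// Propagates `requires` facts along `cfg_edge`, following rule requires3 (version 1):
/// a fact at `P` flows to a successor `Q` unless the loan is killed at `P`.
fn propagate_requires(
    seed: &amp;[(Region, Loan, Point)],
    cfg_edge: &amp;[(Point, Point)],
    killed: &amp;[(Loan, Point)],
) -&gt; HashSet&lt;(Region, Loan, Point)&gt; {
    let mut requires: HashSet&lt;(Region, Loan, Point)&gt; = seed.iter().copied().collect();
    let mut worklist: Vec&lt;(Region, Loan, Point)&gt; = seed.to_vec();
    while let Some((r, l, p)) = worklist.pop() {
        // A loan killed at `p` does not flow to the successors of `p`.
        if killed.contains(&amp;(l, p)) {
            continue;
        }
        for &amp;(from, to) in cfg_edge {
            if from == p &amp;&amp; requires.insert((r, l, to)) {
                worklist.push((r, l, to));
            }
        }
    }
    requires
}
</code></pre><p>Suppose points 0 through 3 form a straight line, the borrow that creates loan 0 happens at point 0, and the assignment <code>x = &amp;mut q</code> happens at point 2. With <code>killed</code> containing <code>(0, 2)</code>, the loan is still required at points 1 and 2 but no longer at point 3.</p>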
<p>We can now annotate Example A to include both the <code>subset</code> relations
and the <code>requires</code> relations at each point. As before, I&rsquo;m not going
to show the full transitive closure of possibilities, but rather just
the &ldquo;base facts&rdquo;. You can see that they continue to accumulate as we
move through the program:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// (none)
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// (none)
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Loan L0
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">2</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">3</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;3, L0)
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Loan L1
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">5</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">4</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;3, L0)
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;4, L1)
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">r</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;5: &#39;2,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;3, L0)
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;4, L1)
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;5: &#39;2,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;3, L0)
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;4, L1)
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">take</span>::<span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="na">&#39;</span><span class="mi">6</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;5: &#39;2, &#39;0: &#39;6
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;3, L0)
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;4, L1)
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">take</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">p</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In particular, consider the set of facts that hold on entry to the <code>x += 1</code>
statement:</p>
<pre tabindex="0"><code>// &#39;3: &#39;1, &#39;0: &#39;2, &#39;2: &#39;0, &#39;4: &#39;5,
// &#39;5: &#39;2,
// requires(&#39;3, L0)
// requires(&#39;4, L1)
</code></pre><p>Note that the loan L1 is a shared borrow of <code>x</code>, and <code>'4</code> requires
<code>L1</code>. Moreover, the variable <code>v</code> holds references of type <code>&amp;'0 i32</code>, and we can see that <code>'4</code> is a subset of <code>'0</code>:</p>
<pre><code>'4: '5: '2: '0
</code></pre>
<p>This implies that the references in the vector <code>v</code> would be
invalidated by mutating <code>x</code>, since that would invalidate the terms of
L1. Seeing as <code>v</code> is going to be used on the next line, that&rsquo;s a
problem &ndash; and that leads us to the final part of our rules, the
definition of an error.</p>
<h3 id="defining-an-error">Defining an &ldquo;error&rdquo;</h3>
<p>And now finally we can define what a borrow check error is. We define
an input <code>invalidates(P, L)</code>, which indicates that some access or
action at the point P invalidates the terms of the loan L:</p>
<pre tabindex="0"><code>.decl invalidates(P:point, L:loan)
.input invalidates
</code></pre><p>Next, we extend the notion of liveness from regions to <strong>loans</strong>. A
loan L is live at the point P if some live region R requires it:</p>
<pre tabindex="0"><code>.decl loan_live_at(L:loan, P:point)

// Rule loan_live_at1
loan_live_at(L, P) :-
  region_live_at(R, P),
  requires(R, L, P).
</code></pre><p>Finally, it is an error if a point P invalidates a loan L while the
loan L is live:</p>
<pre tabindex="0"><code>.decl error(P:point)

// Rule error1
error(P) :-
  invalidates(P, L),
  loan_live_at(L, P).
</code></pre><h3 id="refining-constraint-propagation-with-liveness">Refining constraint propagation with liveness</h3>
<p>This is <em>almost</em> the analysis that I implemented, except for one
point. We can refine the constraint propagation slightly by taking
liveness into account, which allows us to accept a lot more programs.
Consider this example, annotated with the key facts introduced at each
point (remember, these facts propagate forward through control flow):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">44</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w"> </span><span class="c1">// Loan L0
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;1: &#39;0
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;1, L0)
</span></span></span><span class="line"><span class="cl"><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">3</span><span class="w"> </span><span class="n">y</span><span class="p">;</span><span class="w"> </span><span class="c1">// Loan L1
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// &#39;3: &#39;0
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// requires(&#39;3, L1)
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// invalidates(L0)
</span></span></span><span class="line"><span class="cl"><span class="n">print</span><span class="p">(</span><span class="o">*</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>It would be nice if we could accept this program: although <code>p</code>
initially refers to <code>x</code>, it is later re-assigned to refer to <code>y</code>, so
by the time we execute <code>x += 1</code> the loan L0 could be released. However,
under the rules I&rsquo;ve given thus far, we would reject it, because facts
only ever accumulate as they propagate forward. At the point where
we do <code>x += 1</code>, the facts <code>'1: '0</code> and <code>requires('1, L0)</code> from the
first assignment are still in force, so we can derive <code>requires('0, L0)</code>
quite trivially; and since <code>'0</code> is live there (<code>p</code> is used on the
next line), L0 counts as live and the increment is reported as an error.</p>
<p>The problem arises because we <em>re-assigned</em> an existing variable <code>p</code>
rather than declaring a new one. This re-uses the same region <code>'0</code>.
We <em>could</em> therefore solve this by modifying the program to use a
fresh variable:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">44</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">0</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">1</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="na">&#39;</span><span class="mi">4</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;</span><span class="mi">3</span><span class="w"> </span><span class="n">y</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">print</span><span class="p">(</span><span class="o">*</span><span class="n">q</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>But that&rsquo;s not a very satisfying answer. Another possibility would be
to rewrite using something like SSA form, which would basically
automate the transformation above. That remains an option, but it&rsquo;s
not what I chose to do &ndash; among other things, variables in MIR are
&ldquo;places&rdquo;, and using SSA form kind of complicates that. (That is,
variables can be borrowed and assigned indirectly and so forth.)</p>
<p>What I did instead is to modify the rules that propagate subset and
requires relations between points. Previously, those rules were
defined to propagate indiscriminately. Now we modify them to only
propagate relations for regions that are live at the successor point:</p>
<pre tabindex="0"><code>// Rule subset3 (version 2)
subset(R1, R2, Q) :-
  subset(R1, R2, P),
  cfg_edge(P, Q),
  region_live_at(R1, Q), // new
  region_live_at(R2, Q). // new

// Rule requires3 (version 2)
requires(R, L, Q) :-
  requires(R, L, P),
  !killed(L, P),
  cfg_edge(P, Q),
  region_live_at(R, Q). // new
</code></pre><p>Using these rules, our original program is accepted. The key point is
that on entry to the line <code>p = &amp;y</code>, the variable <code>p</code> is dead (its
value is about to be overwritten), and hence its region <code>'0</code> is also
dead. Therefore, the <code>requires</code> (and <code>subset</code>) constraints that affect
it do not propagate forward.</p>
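<p>To make the effect of the refinement concrete, here is a minimal Rust sketch (not the actual implementation) that evaluates a simplified version of these rules on the <code>p</code> reassignment example. The encoding is an assumption made for illustration: the function name <code>borrow_check</code>, the numbering of regions, loans, and points, and the hand-written liveness facts are all hypothetical, and the <code>subset</code> transitivity rule and the <code>killed</code> check are left out because this example does not need them.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::HashSet;

// Illustrative encoding: regions and loans are numbered as in the example
// (&#39;0, &#39;1, &#39;3 and L0, L1); points 0..=3 stand for its last four statements.
type Region = u32;
type Loan = u32;
type Point = usize;

/// Returns every point that invalidates a loan which is live there.
fn borrow_check(liveness_aware: bool) -&gt; Vec&lt;Point&gt; {
    // Inputs. &#39;0 is live only where `p` holds a value that is read later.
    let live: HashSet&lt;(Region, Point)&gt; = HashSet::from([(0, 2), (0, 3)]);
    let cfg_edges: [(Point, Point); 3] = [(0, 1), (1, 2), (2, 3)];
    let invalidates: [(Point, Loan); 1] = [(2, 0)]; // `x += 1` invalidates L0

    // Base facts: &#39;1: &#39;0 and requires(&#39;1, L0) at point 0,
    // &#39;3: &#39;0 and requires(&#39;3, L1) at point 1.
    let mut subset: HashSet&lt;(Region, Region, Point)&gt; = HashSet::from([(1, 0, 0), (3, 0, 1)]);
    let mut requires: HashSet&lt;(Region, Loan, Point)&gt; = HashSet::from([(1, 0, 0), (3, 1, 1)]);

    // Version 1 of the rules propagates everything; version 2 checks liveness.
    let may_flow = |r: Region, q: Point| !liveness_aware || live.contains(&amp;(r, q));

    loop {
        let before = (subset.len(), requires.len());
        let mut new_subset = Vec::new();
        let mut new_requires = Vec::new();

        // requires2: requires(R2, L, P) :- requires(R1, L, P), subset(R1, R2, P).
        for &amp;(r1, l, p) in &amp;requires {
            for &amp;(s1, s2, sp) in &amp;subset {
                if s1 == r1 &amp;&amp; sp == p {
                    new_requires.push((s2, l, p));
                }
            }
        }
        for &amp;(p, q) in &amp;cfg_edges {
            // subset3: carry subset facts across the edge from P to Q.
            for &amp;(r1, r2, sp) in &amp;subset {
                if sp == p &amp;&amp; may_flow(r1, q) &amp;&amp; may_flow(r2, q) {
                    new_subset.push((r1, r2, q));
                }
            }
            // requires3, without the `killed` check (nothing is killed here).
            for &amp;(r, l, rp) in &amp;requires {
                if rp == p &amp;&amp; may_flow(r, q) {
                    new_requires.push((r, l, q));
                }
            }
        }

        subset.extend(new_subset);
        requires.extend(new_requires);
        if (subset.len(), requires.len()) == before {
            break; // fixpoint reached
        }
    }

    // loan_live_at1 + error1: a live region requires the invalidated loan.
    invalidates
        .iter()
        .filter(|&amp;&amp;(p, l)| {
            requires
                .iter()
                .any(|&amp;(r, rl, rp)| rl == l &amp;&amp; rp == p &amp;&amp; live.contains(&amp;(r, p)))
        })
        .map(|&amp;(p, _)| p)
        .collect()
}
</code></pre>
<p>Under these assumptions, <code>borrow_check(false)</code> reports point 2 (the <code>x += 1</code>), because <code>requires('0, L0)</code> flows all the way there, whereas <code>borrow_check(true)</code> reports nothing: <code>'0</code> is dead on entry to <code>p = &amp;y</code>, so the fact is dropped at that edge.</p>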
<p>This improvement is also crucial to accepting the example from <a href="https://github.com/rust-lang/rust/issues/47680#issuecomment-363131420">#47680</a>,
which is rejected by the current NLL analysis:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Thing</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Thing</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">maybe_next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">temp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">Thing</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">temp</span><span class="p">.</span><span class="n">maybe_next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nb">Some</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">temp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, the problem is that <code>temp.maybe_next()</code> borrows <code>*temp</code>. This
borrow is returned &ndash; sometimes &ndash; through the variable <code>v</code>, and then
stored back into <code>temp</code> (replacing the value of <code>temp</code>). This means,
if you trace it out, that the borrow is indeed live around the
loop. You might think it would be &ldquo;killed&rdquo; because we reassigned <code>temp</code>
(and indeed it <em>should</em> be), but with the current rules it was not,
because when <code>None</code> was returned, <code>temp</code> was not reassigned. Basically,
the analysis was getting tripped up by the loop.</p>
<p>Under the new rules, however, we can see that &ndash; along the <code>Some</code> path
&ndash; the loan gets killed, because <code>temp</code> is reassigned. Meanwhile &ndash;
along the <code>None</code> path &ndash; the <code>requires</code> relation is dropped, because
it is only associated with dead regions at that point. So the program
is accepted.</p>
<h3 id="top-down-vs-bottom-up-and-causal-computation">Top-down vs bottom-up and causal computation</h3>
<p>Of course, in a real compiler, knowing whether or not there are errors
is not enough. We also need to be able to report the error nicely to
the user.  The NLL RFC proposed a technique for reporting errors that
I called <a href="https://rust-lang.github.io/rfcs/2094-nll.html#leveraging-intuition-framing-errors-in-terms-of-points">three-point form</a>:</p>
<blockquote>
<p>To the extent possible, we will try to explain all errors in terms of three points:</p>
<ul>
<li>The point where the borrow occurred (B).</li>
<li>The point where the resulting reference is used (U).</li>
<li>An intervening point that might have invalidated the reference (A).</li>
</ul>
</blockquote>
<p>One of the intriguing byproducts of framing the analysis as a series
of Datalog rules is that we can extract these three points by looking
at the way we derived each error. That is, consider the error in
Example A, where we had an illegal <code>x += 1</code>. If that increment
occurred at the point <code>P</code>, we might have found the error by querying
<code>error(P)</code>. If we were using Prolog, which executes &ldquo;top-down&rdquo; (i.e.,
starting from the goal we are trying to prove), then we might encounter a
proof tree like this:</p>
<pre tabindex="0"><code>error(P) :-
  invalidates(P, L1),        // input fact
  loan_live_at(L1, P) :-     // rule loan_live_at1
    region_live_at(&#39;0, P),   // input fact
    requires(&#39;0, L1, P) :-   // rule requires2
      requires(&#39;4, L1, P) :- // rule requires3
        ...
      subset(&#39;4, &#39;0, P) :-   // rule subset3
        ...
</code></pre><p>If you look over this tree, everything you need to know is in
there. There is an error at point P because (a) P invalidates L1 and
(b) L1 is live. L1 is live because <code>'0</code> is live and <code>'0</code> requires
L1. <code>'0</code> requires L1 because of <code>'4</code>&hellip;  and so on. We just need to
write some heuristics to decide what to extract out.</p>
<p>Traditionally, however, Datalog executes bottom-up &ndash; that is, it
computes all the base facts, then the facts derived from those, and so
on. This can be more efficient, but it can be wasteful if not all of those
facts are ultimately needed. There are techniques for combining
top-down and bottom-up propagation (e.g., <a href="https://www.sciencedirect.com/science/article/pii/S0004370212000562">magic sets</a>); there are
also techniques for getting &ldquo;explanations&rdquo; out of Datalog &ndash;
basically, a minimal set of facts that are needed to derive a given
tuple (like <code>error(P)</code>). <a href="http://www.vldb.org/pvldb/vol9/p1137-chothia.pdf">One such technique</a> was even
<a href="https://github.com/frankmcsherry/explanation">implemented</a> and defined using <a href="https://crates.io/crates/differential-dataflow">differential-dataflow</a>, which
is great.</p>
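<p>As a rough illustration of both ideas (this is a toy, not the technique from the paper linked above, and not what differential-dataflow does), the following Rust sketch evaluates a single rule bottom-up on the facts from Example A and records, for each derived fact, the premise it came from. The names <code>derive_requires</code> and <code>explain</code> and the numeric encoding of regions are hypothetical.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::HashMap;

type Region = u32;

/// Bottom-up evaluation of
///   requires(R2, L1) :- requires(R1, L1), subset(R1, R2).
/// at one fixed point. For each region found to require L1, the map records
/// the premise region it was derived from (`None` marks the input fact).
fn derive_requires(subset: &amp;[(Region, Region)], base: Region) -&gt; HashMap&lt;Region, Option&lt;Region&gt;&gt; {
    let mut why = HashMap::new();
    why.insert(base, None);
    let mut changed = true;
    // Keep applying the rule until a full pass adds nothing new.
    while changed {
        changed = false;
        for &amp;(r1, r2) in subset {
            if why.contains_key(&amp;r1) &amp;&amp; !why.contains_key(&amp;r2) {
                why.insert(r2, Some(r1));
                changed = true;
            }
        }
    }
    why
}

/// Walks the recorded premises from `goal` back to the input fact.
/// Assumes `goal` is a key of `why`.
fn explain(why: &amp;HashMap&lt;Region, Option&lt;Region&gt;&gt;, goal: Region) -&gt; Vec&lt;Region&gt; {
    let mut chain = vec![goal];
    let mut current = goal;
    while let Some(&amp;Some(prev)) = why.get(&amp;current) {
        chain.push(prev);
        current = prev;
    }
    chain
}
</code></pre>
<p>With the subset facts <code>'4: '5</code>, <code>'5: '2</code>, <code>'2: '0</code> and the input fact <code>requires('4, L1)</code>, calling <code>explain</code> for region <code>'0</code> yields the chain <code>'0</code>, <code>'2</code>, <code>'5</code>, <code>'4</code>, which mirrors the chain of reasoning in the proof tree above. A real implementation would have to track every rule and all premises, not just one predecessor per fact.</p>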
<p>I&rsquo;ve not really done much in this direction yet &ndash; I&rsquo;m still trying to ensure
this is the analysis we want &ndash; but it seems clear that if we go this
way we should be able to get good error information out.</p>
<h3 id="questions">Questions?</h3>
<p>I&rsquo;ve opened a <a href="https://internals.rust-lang.org/t/blog-post-an-alias-based-formulation-of-the-borrow-checker/7411">thread on the Rust internals board</a> for discussion.</p>
<h3 id="thanks">Thanks</h3>
<p>I want to take a moment to say thanks to a few people and projects who
influenced this idea. First off, Frank McSherry&rsquo;s awesome
<a href="https://crates.io/crates/differential-dataflow">differential-dataflow</a> crate really did enable me to iterate a lot
faster. Very good stuff.</p>
<p>Second, I have been wondering for some time why the compiler&rsquo;s type
system seemed to operate quite differently from a traditional alias
analysis. Some time ago I had a very interesting conversation with
Lionel Parreaux about an alternative approach to Rust&rsquo;s
borrow checking, where regions were regular expressions over program
paths; then later I was talking with Vytautas Astrauskas and Federico
Poli at the Rust All Hands about their efforts to integrate Rust with
the <a href="http://www.pm.inf.ethz.ch/research/viper.html">Viper static verifier</a>, which required them to re-engineer
alias relationships that are quite similar to the subset relation described
here. Pondering these efforts, I re-read a number of the latest papers
on alias analysis on large C programs. This, combined with a lot of
experimentation and iteration, led me here. So thanks all!</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>We were still using <a href="https://github.com/rust-lang/rust/commit/50a3dd40ae8ae6494e55d5cfc29eafdb4172af52#diff-26be476d05bea7e3cd4e452d6104482dR1758">the <code>alt</code> and <code>ret</code> keywords</a>, and not yet using <code>=&gt;</code> for match arms! Neat. And still kind of inscrutable to me.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>And <a href="https://github.com/rust-lang/rust/commit/50a3dd40ae8ae6494e55d5cfc29eafdb4172af52#diff-20cc6d854aa3f056ddd3c36b7c257765R332">macros were like <code>#foo[..]</code> instead of <code>foo!(..)</code></a>. I remember pcwalton used to complain about &ldquo;squashed spiders&rdquo; all over the code.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Region is the standard term from academia, so I am adopting it by default, but it doesn&rsquo;t necessarily carry the right &ldquo;intuitions&rdquo; here. We should maybe fish about for a better term.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Interestingly, this means that the type <code>&amp;'a u32</code> is <em>covariant</em> with respect to <code>'a</code>, whereas before it was most naturally defined as <em>contravariant</em> &ndash; that is, <em>subtypes</em> correspond to <em>smaller sets</em> (but <em>larger lifetimes</em>). Again, not a thing that really matters to Rust users, but a nice property for those delving into the type system.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>These <code>.foo</code> directives are specific to Soufflé, as far as I know.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>We employ negation in these rules, but only in a particularly trivial way &ndash; negated inputs.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>This subset propagation rule is the rule that we are going to refine later.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Rust pattern: Precise closure capture clauses</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/04/24/rust-pattern-precise-closure-capture-clauses/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/04/24/rust-pattern-precise-closure-capture-clauses/</id><published>2018-04-24T00:00:00+00:00</published><updated>2018-04-24T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This is the <strong>second</strong> in a series of posts about Rust compiler
errors. Each one will talk about a particular error that I got
recently and try to explain (a) why I am getting it and (b) how I
fixed it. The purpose of this series of posts is partly to explain
Rust, but partly just to gain data for myself. I may also write posts
about errors I&rsquo;m not getting &ndash; basically places where I anticipated
an error, and used a pattern to avoid it. I hope that after writing
enough of these posts, I or others will be able to synthesize some of
these facts to make intermediate Rust material, or perhaps to improve
the language itself.</p>
<p>Other posts in this series:</p>
<ul>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2018/04/16/rust-pattern-rooting-an-rc-handle/">Rooting an rc handle</a></li>
</ul>
<h3 id="the-error-closures-capture-too-much">The error: closures capture too much</h3>
<p>In some code I am writing, I have a struct with two fields. One of
them (<code>input</code>) contains some data I am reading from; the other is some
data I am generating (<code>output</code>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">collections</span>::<span class="n">HashMap</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">input</span>: <span class="nc">HashMap</span><span class="o">&lt;</span><span class="nb">String</span><span class="p">,</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">output</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I was writing a loop that would extend the output based on the input.
The exact process isn&rsquo;t terribly important, but basically for each
input value <code>v</code>, we would look it up in the input map and use <code>0</code> if
not present:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">values</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="nb">String</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">output</span><span class="p">.</span><span class="n">extend</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">values</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">v</span><span class="o">|</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">input</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">v</span><span class="p">).</span><span class="n">cloned</span><span class="p">().</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, this code <a href="https://play.rust-lang.org/?gist=62c47ef4198dbb1c8dc2a22ea7c961a0&amp;version=stable">will not compile</a>:</p>
<pre tabindex="0"><code>error[E0502]: cannot borrow `self` as immutable because `*self.output` is also borrowed as mutable
  --&gt; src/main.rs:13:22
     |
  10 |         self.output.extend(
     |         ----------- mutable borrow occurs here
 ...
  13 |                 .map(|v| self.input.get(v).cloned().unwrap_or(0)),
     |                      ^^^ ---- borrow occurs due to use of `self` in closure
     |                      |
     |                      immutable borrow occurs here
  14 |         );
     |         - mutable borrow ends here
</code></pre><p>As the various references to &ldquo;closure&rdquo; in the error may suggest, it
turns out that this error is tied to the closure I am creating in the
iterator. If I rewrite the loop to not use <code>extend</code> and an iterator,
but rather a for loop, <a href="https://play.rust-lang.org/?gist=9d212e98a66a27c4a95790b9b9c3f30d&amp;version=stable">everything builds</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">values</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="nb">String</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">values</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="bp">self</span><span class="p">.</span><span class="n">output</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">input</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">v</span><span class="p">).</span><span class="n">cloned</span><span class="p">().</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What is going on here?</p>
<h3 id="background-the-closure-desugaring">Background: The closure desugaring</h3>
<p>The problem lies in how closures are desugared by the compiler. When
you have a closure expression like this one, it corresponds to
<em>deferred code execution</em>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="o">|</span><span class="n">v</span><span class="o">|</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">input</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">v</span><span class="p">).</span><span class="n">cloned</span><span class="p">().</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>That is, <code>self.input.get(v).cloned().unwrap_or(0)</code> doesn&rsquo;t execute
<em>immediately</em> &ndash; rather, it executes later, each time the closure is
called with some specific <code>v</code>. So the closure expression itself just
corresponds to creating some kind of &ldquo;thunk&rdquo; that will hold on to all
the data it is going to need when it executes &ndash; this &ldquo;thunk&rdquo; is
effectively just a special, anonymous struct. Specifically, it is a struct
with one field for each <strong>local variable</strong> that appears in the closure body;
so, something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">MyThunk</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">this</span>: <span class="kp">&amp;</span><span class="nc">self</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>where <code>MyThunk</code> is a dummy struct name. Then <code>MyThunk</code> implements
the <code>Fn</code> trait with the actual function body, but each place that we
wrote <code>self</code> it will substitute <code>self.this</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyThunk</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">call</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">v</span>: <span class="kp">&amp;</span><span class="nb">String</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">this</span><span class="p">.</span><span class="n">input</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">v</span><span class="p">).</span><span class="n">cloned</span><span class="p">().</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>(Note that you cannot, today, write this impl by hand, and I have
simplified the trait in various ways, but hopefully you get the idea.)</p>
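<p>To make the desugaring concrete, here is a minimal hand-written sketch of it. It is an approximation, not what the compiler actually emits: <code>call</code> is an ordinary inherent method rather than an implementation of the real <code>Fn</code> trait, the lifetime parameter is spelled out, and the <code>Context</code> definition (a string-to-<code>u32</code> map as input, a vector of <code>u32</code> as output) is an assumption made so the example stands alone.</p>

```rust
use std::collections::HashMap;

// Assumed shape of the `Context` type; defined here only so the sketch compiles.
struct Context {
    input: HashMap<String, u32>,
    output: Vec<u32>,
}

// Hand-written stand-in for the anonymous struct the closure turns into.
// It holds a shared reference to the *whole* captured variable.
struct MyThunk<'a> {
    this: &'a Context,
}

impl<'a> MyThunk<'a> {
    // Plays the role of `Fn::call`: the closure body, with `self`
    // replaced by `self.this`.
    fn call(&self, v: &String) -> u32 {
        self.this.input.get(v).cloned().unwrap_or(0)
    }
}

fn main() {
    let mut input = HashMap::new();
    input.insert("a".to_string(), 22);
    let cx = Context { input, output: Vec::new() };

    // Creating the thunk runs none of the body; it only stores the reference.
    let thunk = MyThunk { this: &cx };

    // The body runs each time the thunk is called.
    println!("{}", thunk.call(&"a".to_string()));
    println!("{}", thunk.call(&"b".to_string()));
    println!("output still has {} entries", cx.output.len());
}
```

<p>Note that the thunk borrows all of <code>cx</code>, even though its body only ever reads <code>input</code>.</p>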
<h3 id="so-what-goes-wrong">So what goes wrong?</h3>
<p>So let&rsquo;s go back to the example now and see if we can see why we are
getting an error. I will replace the closure itself with the <code>MyThunk</code>
creation that it desugars to:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">values</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="nb">String</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">output</span><span class="p">.</span><span class="n">extend</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">values</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="n">MyThunk</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">this</span>: <span class="kp">&amp;</span><span class="nc">self</span><span class="w"> </span><span class="p">}),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//   ^^^^^^^^^^^^^^^^^^^^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//   really `|v| self.input.get(v).cloned().unwrap_or(0)`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Maybe now we can see the problem more clearly; the closure wants to
hold onto a shared reference to the <strong>entire <code>self</code> variable</strong>, but
then we also want to invoke <code>self.output.extend(..)</code>, which requires a
mutable reference to <code>self.output</code>. This is a conflict! Since the
closure has shared access to the entirety of <code>self</code>, it might (in its
body) access <code>self.output</code>, but we need to be mutating that.</p>
<p>The root problem here is that the closure is capturing <code>self</code> but it
is only <strong>using</strong> <code>self.input</code>; this is because closures always
capture entire local variables. As discussed in the <a href="https://smallcultfollowing.com/babysteps/blog/2018/04/16/rust-pattern-rooting-an-rc-handle/">previous post in
this series</a>, the compiler only sees one function at a time,
and in particular it does not consider the closure body while checking
the closure creator.</p>
<p>To fix this, we want to refine the closure so that instead of
capturing <code>self</code> it only captures <code>self.input</code> &ndash; but how can we do that,
given that closures only capture entire local variables? The way to do that
is to introduce a local variable, <code>input</code>, and initialize it with
<code>&amp;self.input</code>. Then the closure can capture <code>input</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">values</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="nb">String</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">input</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">input</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- I added this
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">output</span><span class="p">.</span><span class="n">extend</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">values</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">v</span><span class="o">|</span><span class="w"> </span><span class="n">input</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">v</span><span class="p">).</span><span class="n">cloned</span><span class="p">().</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">)),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//       ----- and removed the `self.` here
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As you can <a href="https://play.rust-lang.org/?gist=149ccc90dd732496467f43d2a44532b8&amp;version=stable">verify for yourself</a>, this code compiles.</p>
<p>To see why it works, consider again the desugared output. In the new
version, the desugared closure will capture <code>input</code>, not <code>self</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">MyThunk</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">input</span>: <span class="kp">&amp;</span><span class="nc">input</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The borrow checker, meanwhile, sees two overlapping borrows in the function:</p>
<ul>
<li><code>let input = &amp;self.input</code> &ndash; shared borrow of <code>self.input</code></li>
<li><code>self.output.extend(..)</code> &ndash; mutable borrow of <code>self.output</code></li>
</ul>
<p>No error is reported because these two borrows affect different fields
of <code>self</code>.</p>
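<p>The two disjoint borrows can be spelled out with a hand-written stand-in for the closure. This is a sketch under assumptions: <code>InputThunk</code> and its <code>call</code> method are hypothetical names, the <code>Context</code> fields are assumed, and the thunk stores <code>&amp;self.input</code> directly rather than a reference to the local <code>input</code> variable, which simplifies what the compiler really generates.</p>

```rust
use std::collections::HashMap;

// Assumed shape of the `Context` type; defined here only so the sketch compiles.
struct Context {
    input: HashMap<String, u32>,
    output: Vec<u32>,
}

// Hypothetical stand-in for the closure that captures only `input`.
struct InputThunk<'a> {
    input: &'a HashMap<String, u32>,
}

impl<'a> InputThunk<'a> {
    fn call(&self, v: &String) -> u32 {
        self.input.get(v).cloned().unwrap_or(0)
    }
}

impl Context {
    fn process(&mut self, values: &[String]) {
        // Shared borrow of `self.input` only.
        let thunk = InputThunk { input: &self.input };
        // Mutable borrow of `self.output`: a different field, so no conflict.
        self.output.extend(values.iter().map(|v| thunk.call(v)));
    }
}

fn main() {
    let mut input = HashMap::new();
    input.insert("a".to_string(), 22);
    let mut cx = Context { input, output: Vec::new() };
    cx.process(&["a".to_string(), "b".to_string()]);
    println!("{:?}", cx.output);
}
```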
<h3 id="a-more-general-pattern">A more general pattern</h3>
<p>Sometimes, when I want to be very precise, I will write closures in a
stylized way that makes it crystal clear what they are capturing.
Instead of writing <code>|v| ...</code>, I first introduce a block that creates a
lot of local variables, with the final thing in the block being a
<code>move</code> closure (<code>move</code> closures take ownership of the things they use,
instead of borrowing them from the creator). This gives complete
control over what is borrowed and how. In this case, the closure might look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">input</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">input</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">move</span><span class="w"> </span><span class="o">|</span><span class="n">v</span><span class="o">|</span><span class="w"> </span><span class="n">input</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">v</span><span class="p">).</span><span class="n">cloned</span><span class="p">().</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Or, <a href="https://play.rust-lang.org/?gist=8ea9d6acddfc11706fda29bde8550f3c&amp;version=stable">in context</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Context</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">values</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="nb">String</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">output</span><span class="p">.</span><span class="n">extend</span><span class="p">(</span><span class="n">values</span><span class="p">.</span><span class="n">iter</span><span class="p">().</span><span class="n">map</span><span class="p">({</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="kd">let</span><span class="w"> </span><span class="n">input</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">input</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">move</span><span class="w"> </span><span class="o">|</span><span class="n">v</span><span class="o">|</span><span class="w"> </span><span class="n">input</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">v</span><span class="p">).</span><span class="n">cloned</span><span class="p">().</span><span class="n">unwrap_or</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In effect, these <code>let</code> statements become like the <a href="https://msdn.microsoft.com/en-us/library/dd293608.aspx">&ldquo;capture clauses&rdquo;</a>
in C++, declaring how precisely variables from the environment are
captured. But they give added flexibility by also allowing us to
capture the results of small expressions, like <code>self.input</code>, instead
of local variables.</p>
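<p>As a further illustration of the &ldquo;capture clause&rdquo; style, here is a small self-contained sketch in which one binding borrows a field and another captures the result of an expression by value. The <code>Report</code> type, its fields, and the <code>render</code> method are all hypothetical names invented for this example.</p>

```rust
// Hypothetical type, invented for illustration.
struct Report {
    title: String,
    rows: Vec<u32>,
    log: Vec<String>,
}

impl Report {
    fn render(&mut self) {
        self.log.extend(self.rows.iter().map({
            // The "capture clause": each `let` states exactly what is captured.
            let title = &self.title;     // borrow just one field
            let count = self.rows.len(); // capture the value of an expression
            move |row| format!("{} [{} rows]: {}", title, count, row)
        }));
    }
}

fn main() {
    let mut report = Report {
        title: "totals".to_string(),
        rows: vec![1, 2],
        log: Vec::new(),
    };
    report.render();
    for line in &report.log {
        println!("{}", line);
    }
}
```

<p>Because the closure only holds <code>&amp;self.title</code> and a <code>usize</code>, it does not conflict with the mutable borrow of <code>self.log</code>.</p>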
<p>This pattern is also useful when you want to capture a <em>clone</em>
of some data rather than the data itself:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">move</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="n">do_something</span><span class="p">(</span><span class="o">&amp;</span><span class="n">data</span><span class="p">)</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="how-we-could-accept-this-code-in-the-future">How we could accept this code in the future</h3>
<p>There is actually a pending RFC, <a href="https://github.com/rust-lang/rfcs/pull/2229">RFC #2229</a>, that aims to modify
closures so that they capture entire paths rather than local
variables. There are various corner cases though that we have to be
careful of, particularly with moving closures, as we don&rsquo;t want to
change the times that destructors run and hence change the semantics
of existing code. Nonetheless, it would solve this particular case by
changing the desugaring.</p>
<p>Alternatively, if we had some way for functions to capture a reference
to a &ldquo;view&rdquo; of a struct rather than the entire thing, then closures
might be able to capture a reference to a &ldquo;view&rdquo; of <code>self</code> rather than
capturing a reference to the field <code>input</code> directly. There is some
discussion of the view idea in <a href="https://internals.rust-lang.org/t/having-mutability-in-several-views-of-a-struct/6882/2">this internals
thread</a>;
I&rsquo;ve also tinkered with the idea of merging views and traits, as
<a href="https://internals.rust-lang.org/t/fields-in-traits/6933/12">described in this internals
post</a>. I
think that once we tackle NLL and a few other pending challenges,
finding some way to express &ldquo;views&rdquo; seems like a clear way to help
make Rust more ergonomic.</p>
<h3 id="discussion">Discussion</h3>
<p>I&rsquo;ve opened <a href="https://users.rust-lang.org/t/blog-post-series-rust-patterns/20080">a users
thread</a>
to discuss this blog post (along with other Rust pattern blog posts).</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/rustpattern" term="rustpattern" label="RustPattern"/></entry><entry><title type="html">Rust pattern: Rooting an Rc handle</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/04/16/rust-pattern-rooting-an-rc-handle/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/04/16/rust-pattern-rooting-an-rc-handle/</id><published>2018-04-16T00:00:00+00:00</published><updated>2018-04-16T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve decided to do a little series of posts about Rust compiler
errors. Each one will talk about a particular error that I got
recently and try to explain (a) why I am getting it and (b) how I
fixed it. The purpose of this series of posts is partly to explain
Rust, but partly just to gain data for myself. I may also write posts
about errors I&rsquo;m not getting &ndash; basically places where I anticipated
an error, and used a pattern to avoid it. I hope that after writing
enough of these posts, I or others will be able to synthesize some of
these facts to make intermediate Rust material, or perhaps to improve
the language itself.</p>
<h3 id="the-error-rc-rooting">The error: Rc-rooting</h3>
<p>The inaugural post concerns Rc-rooting. I am currently in the midst of
editing some code. In this code, I have a big vector of data:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Data</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">vector</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Datum</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Datum</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Many different consumers are sharing this data, but in a read-only
fashion, so the data is stored in an <code>Rc&lt;Data&gt;</code>, and each consumer has
their own handle. Here is one such consumer:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Consumer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">data</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="n">Data</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In that consumer, I am trying to iterate over the data and process it,
one datum at a time:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Consumer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">process_data</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">datum</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">vector</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="bp">self</span><span class="p">.</span><span class="n">process_datum</span><span class="p">(</span><span class="n">datum</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">process_datum</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">datum</span>: <span class="kp">&amp;</span><span class="nc">Datum</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="cm">/* ... */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This seems reasonable enough, but <a href="https://play.rust-lang.org/?gist=e69482ca8f539f0353b0a7da5aa08b5f&amp;version=stable">when I try to compile
this</a>,
I find that I get a borrow check error:</p>
<pre tabindex="0"><code>error[E0502]: cannot borrow `*self` as mutable because `self.data` is also borrowed as immutable
  --&gt; src/main.rs:19:7
   |
18 |     for datum in &amp;self.data.vector {
   |                   ---------      - immutable borrow ends here
   |                   |
   |                   immutable borrow occurs here
19 |       self.process_datum(datum);
   |       ^^^^ mutable borrow occurs here
</code></pre><p>Why is that? Well, the borrow checker is pointing out a legitimate
concern here (though the span for &ldquo;immutable borrow ends here&rdquo; is odd,
I <a href="https://github.com/rust-lang/rust/issues/49756">filed a
bug</a>). Basically, when
I invoke <code>process_datum</code>, I am giving it both <code>&amp;mut self</code> <em>and</em> a
reference to a <code>Datum</code>; but that datum is owned by <code>self</code> &ndash; or, more
precisely, it&rsquo;s owned by a <code>Data</code>, which is in an <code>Rc</code>, and that <code>Rc</code>
is owned by <code>self</code>. This means it would be possible for
<code>process_datum</code> to cause that to get freed, e.g. by writing to <code>self.data</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process_datum</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">datum</span>: <span class="kp">&amp;</span><span class="nc">Datum</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Overwriting `data` field will lower the ref-count
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// on the `Rc&lt;Data&gt;`; if this is the last handle, then
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// that would cause the `Data` to be freed, in turn invalidating
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// `datum` in the caller we looked at:
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Rc</span>::<span class="n">new</span><span class="p">(</span><span class="n">Data</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">vector</span>: <span class="nc">vec</span><span class="o">!</span><span class="p">[]</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, of course you and I know that <code>process_datum</code> is not going to
overwrite <code>data</code>, because that data is supposed to be an immutable
input. But then again &ndash; can we say with total confidence that all
other people editing this code now and in the future know and
understand that invariant? Maybe there will be a need to swap in new
data in the future.</p>
<p>To fix this borrow checker error, we need to ensure that mutating <code>self</code>
cannot cause <code>datum</code> to get freed. Since the data is in an <code>Rc</code>, one
easy way to do this is to get a second handle to that <code>Rc</code>, and store
it on the stack:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process_data</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">clone</span><span class="p">();</span><span class="w"> </span><span class="c1">// this is new
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">for</span><span class="w"> </span><span class="n">datum</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="n">data</span><span class="p">.</span><span class="n">vector</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">process_datum</span><span class="p">(</span><span class="n">datum</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If you try this, <a href="https://play.rust-lang.org/?gist=30919dcc7f2618050a1389e2c2961341&amp;version=stable">you will find the code
compiles</a>,
and with good reason: even if <code>process_datum</code> were to modify
<code>self.data</code> now, we have a second handle onto the original data, and
it will not be deallocated until the loop in <code>process_data</code> completes.</p>
<p>(Note that invoking <code>clone</code> on an <code>Rc</code>, as we do here, merely
increases the reference count; it doesn&rsquo;t do a deep clone of the
data.)</p>
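<p>Here is a self-contained sketch of the rooting pattern that deliberately does the &ldquo;bad&rdquo; thing inside <code>process_datum</code>, to show that the rooted handle keeps the original data alive. The <code>value</code> and <code>total</code> fields are assumptions added so there is something to observe; they are not part of the code above.</p>

```rust
use std::rc::Rc;

struct Datum {
    value: u32, // assumed field, added for illustration
}

struct Data {
    vector: Vec<Datum>,
}

struct Consumer {
    data: Rc<Data>,
    total: u32, // assumed field, added for illustration
}

impl Consumer {
    fn process_data(&mut self) {
        // Root the data: a second handle that lives on the stack.
        let data = self.data.clone();
        for datum in &data.vector {
            self.process_datum(datum);
        }
    }

    fn process_datum(&mut self, datum: &Datum) {
        // Overwrite `self.data` on purpose. `datum` stays valid because
        // the caller still holds its own handle to the original `Data`.
        self.data = Rc::new(Data { vector: vec![] });
        self.total += datum.value;
    }
}

fn main() {
    let data = Rc::new(Data {
        vector: vec![Datum { value: 1 }, Datum { value: 2 }, Datum { value: 3 }],
    });
    let mut consumer = Consumer { data: data.clone(), total: 0 };
    println!("handles before: {}", Rc::strong_count(&data));
    consumer.process_data();
    println!("total: {}", consumer.total);
    println!("handles after: {}", Rc::strong_count(&data));
}
```

<p>The count drops after the call because <code>consumer</code> gave up its handle on the first iteration and the rooted handle was released when <code>process_data</code> returned.</p>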
<h3 id="how-the-compiler-thinks-about-this">How the compiler thinks about this</h3>
<p>OK, now that we understand intuitively what&rsquo;s going on, let&rsquo;s dive
a bit into how the compiler&rsquo;s check works, so we can see why the code
is being rejected, and why the fixed code is accepted.</p>
<p>The first thing to remember is that the compiler checks <strong>one method
at a time</strong>, and it makes <strong>no assumptions</strong> about what other methods
may or may not do beyond what is specified in the types of their
arguments or their return type. This is a key property &ndash; it ensures
that, for example, you are free to modify the body of a function and
it won&rsquo;t cause your callers to stop compiling<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. It also ensures
that the analysis is scalable to large programs, since adding
functions doesn&rsquo;t make checking any individual function harder (so
total time scales linearly with the number of functions<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>).</p>
<p>Next, we have to apply the borrow checker&rsquo;s basic rule: <strong>&ldquo;While some
path is shared, it cannot be mutated.&rdquo;</strong> In this case, the shared
borrow occurs in the <code>for</code> loop:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">datum</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">vector</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//           ^^^^^^^^^^^^^^^^^ shared borrow
</span></span></span></code></pre></div><p>Here, the <strong>path</strong> being borrowed is <code>self.data.vector</code>. The
compiler&rsquo;s job here is to ensure that, so long as the reference
<code>datum</code> is in use, that path <code>self.data.vector</code> is not mutated
(because mutating it could cause <code>datum</code> to be freed).</p>
<p>So, for example, it would be an error to write <code>*self = ...</code>, because
that would overwrite <code>self</code> with a new value, which might cause the
old value of <code>data</code> to be freed, which in turn would free the vector
within, which would invalidate <code>datum</code>. Similarly, writing <code>self.data = ...</code> could cause the vector to be freed as well (as we saw earlier).</p>
<p>In the actual example, we are not directly mutating <code>self</code>, but we are
invoking <code>process_datum</code>, which takes an <code>&amp;mut self</code> argument:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">  </span><span class="k">for</span><span class="w"> </span><span class="n">datum</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">vector</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// ----------------- shared borrow
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">process_datum</span><span class="p">(</span><span class="n">datum</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//   ^^^^^^^^^^^^^ point of error
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Since <code>process_datum</code> is declared as <code>&amp;mut self</code>, invoking
<code>self.process_datum(..)</code> is treated as a potential write to <code>*self</code>
(and <code>self.data</code>), and hence an error is reported.</p>
<p>Now compare what happens after the fix. Remember that we cloned
<code>self.data</code> into a local variable and borrowed <em>that</em>:</p>
<pre tabindex="0"><code>  let data = self.data.clone();
  for datum in &amp;data.vector {
            // ^^^^^^^^^^^^ shared borrow
    self.process_datum(datum);
  }
</code></pre><p>Now the path being borrowed is <code>data.vector</code>, and so when we invoke
<code>self.process_datum(..)</code>, the compiler does not see any potential
writes to <code>data</code> (only <code>self</code>).  Therefore, no errors are
reported. Note that the compiler <em>still</em> assumes the worst about
<code>process_datum</code>: <code>process_datum</code> may mutate <code>*self</code> or
<code>self.data</code>. But even if it does so, that won&rsquo;t cause <code>datum</code> to be
freed, because it is borrowed from <code>data</code>, which is an independent
handle to the vector.</p>
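<p>For reference, here is a minimal, self-contained sketch of the fixed pattern. The names <code>Data</code>, <code>Processor</code>, <code>total</code>, and <code>run_example</code> are hypothetical stand-ins chosen for illustration, not the types from the original example. To make the point concrete, <code>process_datum</code> deliberately overwrites <code>self.data</code> in the middle of the loop:</p>

```rust
use std::rc::Rc;

// Hypothetical stand-ins for the types discussed above.
struct Data {
    vector: Vec<i32>,
}

struct Processor {
    data: Rc<Data>,
    total: i32,
}

impl Processor {
    // Takes `&mut self`, so the compiler must assume it may overwrite
    // any part of `*self`, including `self.data` (and here it does).
    fn process_datum(&mut self, datum: &i32) {
        self.total += *datum;
        if *datum == 2 {
            self.data = Rc::new(Data { vector: Vec::new() });
        }
    }

    fn process_data(&mut self) {
        // Cloning the `Rc` only bumps the reference count (O(1)).
        let data = self.data.clone();
        // The shared borrow is of the local `data.vector`, not `self.data.vector`.
        for datum in &data.vector {
            self.process_datum(datum);
        }
    }
}

// Runs the loop and reports the accumulated total plus the length of
// whatever vector `self.data` points at afterwards.
fn run_example() -> (i32, usize) {
    let mut p = Processor {
        data: Rc::new(Data { vector: vec![1, 2, 3] }),
        total: 0,
    };
    p.process_data();
    (p.total, p.data.vector.len())
}
```

<p>Even though <code>self.data</code> is replaced partway through, the loop still visits all three elements, because the local <code>data</code> keeps the original vector alive until the loop finishes.</p>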
<h3 id="synopsis">Synopsis</h3>
<p>Sometimes it is useful to clone the data you are iterating over into a
local variable, so that the compiler knows it will not be freed. If
the data is immutable, storing that data in an <code>Rc</code> or <code>Arc</code> makes
that clone cheap (i.e., O(1)). (Another way to make that clone cheap
is to use a <a href="https://smallcultfollowing.com/babysteps/blog/2018/02/01/in-rust-ordinary-vectors-are-values/">persistent collection type</a> &ndash; such as those provided by
the <a href="https://crates.io/crates/im">im</a> crate.)</p>
<p>If the data <em>is</em> mutable, there are various other patterns that you
could deploy, which I&rsquo;ll try to cover in follow-up articles &ndash; but
often it&rsquo;s best if you can get such data into a local variable,
instead of a field, so you can track it with more precision.</p>
<h3 id="how-we-could-accept-this-code-in-the-future">How we could accept this code in the future</h3>
<p>There would be various ways for the compiler to accept this code: for
example, we&rsquo;ve thought about extensions to let you declare the sets of
fields accessed by a function (and perhaps the ways in which they are
accessed), which might let you declare that <code>process_datum</code> will never
modify the <code>data</code> field.</p>
<p>I&rsquo;ve also kicked around the idea of &ldquo;immutable&rdquo; fields from time to
time, which would basically let you declare that <em>nobody</em> will
overwrite that field, but that gets complicated in the face of
generics. For example, one can mutate the field <code>data</code> not just by
doing <code>self.data = ...</code> but by doing <code>*self = ...</code>; and the latter
might be in generic code that works for any <code>&amp;mut T</code>: this implies
we&rsquo;d have to start categorizing the types <code>T</code> into &ldquo;assignable or
not&rdquo;<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. I suspect we would not go in this direction.</p>
<h3 id="discussion">Discussion</h3>
<p>I&rsquo;ve opened <a href="https://users.rust-lang.org/t/blog-post-series-rust-patterns/20080">a users
thread</a>
to discuss this blog post (along with other Rust pattern blog posts).</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Or crash, as would happen without the compiler&rsquo;s checks.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Total time for the safety check, that is. Optimizations and other things are sometimes inter-procedural.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Interestingly, C++ does this when you have <code>const</code> fields.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/rustpattern" term="rustpattern" label="RustPattern"/></entry><entry><title type="html">Maximally minimal specialization: always applicable impls</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/02/09/maximally-minimal-specialization-always-applicable-impls/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/02/09/maximally-minimal-specialization-always-applicable-impls/</id><published>2018-02-09T00:00:00+00:00</published><updated>2018-02-09T00:00:00+00:00</updated><content type="html"><![CDATA[<p>So
<a href="http://aturon.github.io/2018/02/09/amazing-week/">aturon wrote this beautiful post about what a good week it has been</a>.
In there, they wrote:</p>
<blockquote>
<p><strong>Breakthrough #2</strong>: @nikomatsakis had a eureka moment and figured out a
path to make specialization sound, while still supporting its most
important use cases (blog post forthcoming!). Again, this suddenly
puts specialization on the map for Rust Epoch 2018.</p>
</blockquote>
<p>Sheesh I wish they hadn&rsquo;t written that! Now the pressure is on. Well,
here goes nothing =).</p>
<p><em>Anyway</em>, I&rsquo;ve been thinking about the upcoming Rust Epoch. We&rsquo;ve been
iterating over the final list of features to be included and I think
it seems pretty exciting.  But there is one &ldquo;fancy type system&rdquo;
feature that&rsquo;s been languishing for some time:
<strong>specialization</strong>. Accepted to much fanfare as <a href="https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md">RFC 1210</a>, we&rsquo;ve
been kind of stuck since then trying to figure out how to solve an
underlying soundness challenge.</p>
<p>As aturon wrote, I <strong>think</strong> (and emphasis on think!) I may have a
solution. I call it the <strong>always applicable</strong> rule, but you might also
call it <strong>maximally minimal specialization</strong><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>.</p>
<p>Let&rsquo;s be clear: <strong>this proposal does not support all the
specialization use cases originally envisioned</strong>. As the phrase
<em>maximally minimal</em> suggests, it works by focusing on a core set of
impls and accepting those. But that&rsquo;s better than most of its
competitors! =) Better still, it leaves a route for future expansion.</p>
<h3 id="the-soundness-problem">The soundness problem</h3>
<p>I&rsquo;ll just cover the soundness problem very briefly; Aaron wrote an
<a href="https://aturon.github.io/blog/2017/07/08/lifetime-dispatch/">excellent blog post</a> that covers the details. The crux of
the problem is that code generation wants to erase regions, but the
type checker doesn&rsquo;t. This means that we can write specialization
impls that depend on details of lifetimes, but we have no way to test
at code generation time if those more specialized impls apply. A very
simple example would be something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Trait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Trait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">&#39;static</span><span class="w"> </span><span class="kt">str</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>At code generation time, all we know is that we have a <code>&amp;str</code> &ndash; for
<strong>some lifetime</strong>. We don&rsquo;t know if it&rsquo;s a static lifetime or not. The
type checker is supposed to have assured us that <strong>we don&rsquo;t have to
know</strong> &ndash; that this lifetime is &ldquo;big enough&rdquo; to cover all the uses of
the string.</p>
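<p>As a small illustration of the status quo (this is stable Rust with no specialization involved, and the trait <code>Describe</code> and the helper functions are hypothetical names made up for this sketch), note that impl selection today cannot observe lifetimes at all. Both calls below instantiate <code>T</code> with a <code>&amp;str</code>, and the same blanket impl serves both:</p>

```rust
// Hypothetical trait and impl, used only to illustrate the point above.
trait Describe {
    fn describe(&self) -> &'static str;
}

// A blanket impl: it applies to every sized type, whatever lifetimes it mentions.
impl<T> Describe for T {
    fn describe(&self) -> &'static str {
        "blanket impl"
    }
}

// Generic helper: `T` is instantiated with a `&str` type in both calls below.
fn which_impl<T: Describe>(value: T) -> &'static str {
    value.describe()
}

fn lifetimes_do_not_matter() -> (&'static str, &'static str) {
    let literal: &'static str = "hello";
    let owned = String::from("hello");
    let borrowed: &str = &owned; // a much shorter lifetime than `'static`
    (which_impl(literal), which_impl(borrowed))
}
```

<p>If a specializing impl for <code>&amp;'static str</code> were permitted, these two calls would have to pick different impls, and that is exactly the distinction code generation is unable to make.</p>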
<p>My proposal would reject the specializing impl above. I basically aim
to solve this problem by guaranteeing that, just as today, code
generation <strong>doesn&rsquo;t have to care</strong> about specific lifetimes, because
it knows that &ndash; whatever they are &ndash; if there is a potentially
specializing impl, it will be applicable.</p>
<h3 id="the-always-applicable-test">The &ldquo;always applicable&rdquo; test</h3>
<p>The core idea is to change the rule for when overlap is allowed. In
<a href="https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md">RFC 1210</a> the rule is something like this:</p>
<ul>
<li>Distinct impls A and B are allowed to overlap if one of them
<em>specializes</em> the other.</li>
</ul>
<p>We have long intended to extend this via the idea of <a href="http://smallcultfollowing.com/babysteps/blog/2016/09/24/intersection-impls/">intersection impls</a>,
giving rise to a rule like:</p>
<ul>
<li>Two distinct impls A and B are allowed to overlap if, for all
types in their intersection:
<ul>
<li>there exists an applicable impl C and C <em>specializes</em> both A and B.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></li>
</ul>
</li>
</ul>
<p>My proposal is to extend that intersection rule with the <em>always
applicable</em> test. I&rsquo;m actually going to start with a simple version,
and then I&rsquo;ll discuss an important extension that makes it much more
expressive.</p>
<ul>
<li>Two distinct impls A and B are allowed to overlap if, for all
types in their intersection:
<ul>
<li>there exists an applicable impl C and C <em>specializes</em> both A and B,</li>
<li><strong>and</strong> that impl C is <em>always applicable</em>.</li>
</ul>
</li>
</ul>
<p>(We will see, by the way, that the precise definition of the
<em>specializes</em> predicate doesn&rsquo;t matter much for the purposes of my
proposal here &ndash; any partial order will do.)</p>
<h3 id="when-is-an-impl-always-applicable">When is an impl <em>always applicable</em>?</h3>
<p>Intuitively, an impl is <em>always applicable</em> if it does not impose any
additional conditions on its input types beyond that they be
well-formed &ndash; and in particular it doesn&rsquo;t impose any equality
constraints between parts of its input types. It also has to be fully
generic with respect to the lifetimes involved.</p>
<p>Actually, I think the best way to explain it is in terms of the
<strong>implied bounds</strong> proposal<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> (<a href="https://github.com/rust-lang/rfcs/blob/master/text/2089-implied-bounds.md">RFC</a>, <a href="https://smallcultfollowing.com/babysteps/blog/2014/07/06/implied-bounds/">blog post</a>). The
idea is roughly this: an impl is <em>always applicable</em> if it meets three
conditions:</p>
<ul>
<li>it relies <strong>only</strong> on implied bounds,</li>
<li>it is fully generic with respect to lifetimes,</li>
<li>it doesn&rsquo;t repeat generic type parameters.</li>
</ul>
<p>Let&rsquo;s look at those three conditions.</p>
<h4 id="condition-1-relies-only-on-implied-bounds">Condition 1: Relies only on implied bounds.</h4>
<p>Here is an example of an <em>always applicable</em> impl (which could
therefore be used to specialize another impl):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Foo</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Foo</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// code in here can assume that `T: Clone` because of implied bounds
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here the impl works fine, because it adds no additional bounds beyond
the <code>T: Clone</code> that is implied by the struct declaration.</p>
<p>If the <code>impl</code> adds new bounds that are not part of the struct,
however, then it is <strong>not always applicable</strong>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Foo</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Foo</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// ^^^^^^^ new bound not declared on `Foo`,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//         hence *not* always applicable
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h4 id="condition-2-fully-generic-with-respect-to-lifetimes">Condition 2: Fully generic with respect to lifetimes.</h4>
<p>Each lifetime used in the impl header must be a lifetime parameter,
and each lifetime parameter can only be used once. So an impl like
this is <strong>always applicable</strong>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="na">&#39;b</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;b</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// implied bounds let us assume that `&#39;b: &#39;a`, as well
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But the following impls are <strong>not</strong> always applicable:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                   </span><span class="c1">//  ^^^^^^^ same lifetime used twice
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">&#39;static</span><span class="w"> </span><span class="kt">str</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">//  ^^^^^^^ not a lifetime parameter
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h4 id="condition-3-each-type-parameter-can-only-be-used-once">Condition 3: Each type parameter can only be used once.</h4>
<p>Using a type parameter more than once imposes &ldquo;hidden&rdquo; equality constraints
between parts of the input types which in turn can lead to equality constraints
between lifetimes. Therefore, an <em>always applicable</em> impl must use each
type parameter only once, like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">U</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Repeating, as here, means the impl cannot be used to specialize:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//                   ^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// `T` used twice: not always applicable
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h4 id="how-can-we-think-about-this-formally">How can we think about this formally?</h4>
<p>For each impl, we can create a Chalk goal that is provable if it is
always applicable. I&rsquo;ll define this here &ldquo;by example&rdquo;. Let&rsquo;s consider
a variant of the first example we saw:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Foo</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Foo</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As we saw before, this impl is <em>always applicable</em>, because the <code>T: Clone</code> where clause on the impl follows from the implied bounds of
<code>Foo&lt;T&gt;</code>.</p>
<p>The recipe to transform this into a predicate is that we want to
replace each <em>use</em> of a type/region parameter in the input types with
a universally quantified type/region (note that the two uses of the
same type parameter would be replaced with two distinct types). This
yields a &ldquo;skolemized&rdquo; set of input types T. We then check whether the impl
could be applied to T.</p>
<p>In the case of our example, that means we would be trying to prove
something like this:</p>
<pre tabindex="0"><code>// For each *use* of a type parameter or region in
// the input types, we add a &#39;forall&#39; variable here.
// In this example, the only spot is `Foo&lt;_&gt;`, so we
// have one:
forall&lt;A&gt; {
  // We can assume that each of the input types (using those
  // forall variables) are well-formed:
  if (WellFormed(Foo&lt;A&gt;)) {
    // Now we have to see if the impl matches. To start,
    // we create existential variables for each of the
    // impl&#39;s generic parameters:
    exists&lt;T&gt; {
      // The types in the impl header must be equal...
      Foo&lt;T&gt; = Foo&lt;A&gt;,
      // ...and the where clauses on the impl must be provable.
      T: Clone,
    }
  }
} 
</code></pre><p>Clearly, this is provable: we infer that <code>T = A</code>, and then we can
prove that <code>A: Clone</code> because it follows from
<code>WellFormed(Foo&lt;A&gt;)</code>. Now if we look at the second example, which
added <code>T: Copy</code> to the impl, we can see why we get an error. Here was
the example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Foo</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Foo</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// ^^^^^^^ new bound not declared on `Foo`,
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//         hence *not* always applicable
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>That example results in a query like:</p>
<pre tabindex="0"><code>forall&lt;A&gt; {
  if (WellFormed(Foo&lt;A&gt;)) {
    exists&lt;T&gt; {
      Foo&lt;T&gt; = Foo&lt;A&gt;,
      T: Copy, // &lt;-- Not provable! 
    }
  }
} 
</code></pre><p>In this case, we fail to prove <code>T: Copy</code>, because it does not follow
from <code>WellFormed(Foo&lt;A&gt;)</code>.</p>
<p>As one last example, let&rsquo;s look at the impl that repeats a type parameter:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Not always applicable
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The query that will result follows; what is interesting here is that
the type <code>(T, T)</code> results in <em>two</em> forall variables, because it has
two distinct <em>uses</em> of a type parameter (it just happens to be one
parameter used twice):</p>
<pre tabindex="0"><code>forall&lt;A, B&gt; {
  if (WellFormed((A, B))) {
    exists&lt;T&gt; {
      (T, T) = (A, B) // &lt;-- cannot be proven
    }
  }
} 
</code></pre><h3 id="what-is-accepted">What is accepted?</h3>
<p>What this rule primarily does is allow you to specialize blanket impls
with concrete types. For example, we currently have a <code>From</code> impl
that says any type <code>T</code> can be converted to itself:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">From</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>It would be nice to be able to define an impl that allows a value of
the never type <code>!</code> to be converted into <em>any</em> type (since such a value
cannot exist in practice):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">From</span><span class="o">&lt;!&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, this impl overlaps with the reflexive impl. Therefore, we&rsquo;d
like to be able to provide an intersection impl defining what happens
when you convert <code>!</code> to <code>!</code> specifically:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">From</span><span class="o">&lt;!&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>All of these impls would be legal in this proposal.</p>
<h3 id="extension-refining-always-applicable-impls-to-consider-the-base-impl">Extension: Refining <em>always applicable</em> impls to consider the base impl</h3>
<p>While it accepts some things, the <em>always applicable</em> rule can also be
quite restrictive. For example, consider this pair of impls:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Base impl:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nb">&#39;static</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Specializing impl:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">SomeTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">&#39;static</span><span class="w"> </span><span class="kt">str</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, the second impl wants to specialize the first, but it is not
<em>always applicable</em>, because it specifies the <code>'static</code> lifetime. <em>And
yet,</em> it feels like this should be ok, since the base impl only
applies to <code>'static</code> things.</p>
<p>We can make this notion more formal by expanding the property to say
that the specializing impl C must be <em>always applicable</em> <strong>with
respect to the base impls</strong>. In this extended version of the
predicate, the impl C is allowed to rely not only on the <em>implied
bounds</em>, but on the <em>bounds that appear in the base impl(s)</em>.</p>
<p>So, the impls above might result in a Chalk predicate like:</p>
<pre tabindex="0"><code>// One use of a lifetime in the specializing impl (`&#39;static`),
// so we introduce one &#39;forall&#39; lifetime:
forall&lt;&#39;a&gt; {
  // Assuming the base impl applies:
  if (exists&lt;T&gt; { T = &amp;&#39;a str, T: &#39;static }) {
    // We have to prove that the
    // specializing impl&#39;s types can unify:
    &amp;&#39;a str = &amp;&#39;static str
  }
}
</code></pre><p>As it happens, the compiler today has logic that would let us deduce
<code>'a = 'static</code> from the fact that <code>&amp;'a str: 'static</code>, and hence we could solve this clause successfully.</p>
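<p>A small stable-Rust sketch of that deduction (the function name <code>promote</code> is made up for illustration and is not part of the proposal): given only the where clause <code>&amp;'a str: 'static</code>, the compiler accepts returning the reference at the <code>'static</code> lifetime, which suggests it has worked out the relationship between <code>'a</code> and <code>'static</code> on its own.</p>

```rust
// Hypothetical helper, used only to illustrate the deduction.
// The sole assumption we state is `&'a str: 'static`; we never write
// `'a: 'static` ourselves.
fn promote<'a>(s: &'a str) -> &'static str
where
    &'a str: 'static,
{
    // Accepted because the where clause is elaborated into `'a: 'static`.
    s
}
```

<p>Calling <code>promote("hello")</code> works because a string literal already has the <code>'static</code> lifetime; passing a reference to a local <code>String</code> would be rejected at the call site.</p>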
<p>This rule also allows us to accept some cases where type parameters
are repeated, though we&rsquo;d have to upgrade chalk&rsquo;s capability to let it
prove those predicates fully. Consider this pair of impls from
<a href="https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md">RFC 1210</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Base impl:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">E</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Extend</span><span class="o">&lt;</span><span class="n">E</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">E</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nb">IntoIterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="n">E</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Specializing impl:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Extend</span><span class="o">&lt;</span><span class="n">E</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="p">[</span><span class="n">E</span><span class="p">]</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">E</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">               </span><span class="c1">//  ^       ^           ^ E repeated three times!
</span></span></span></code></pre></div><p>Here the specializing impl repeats the type parameter <code>E</code> three times!
However, looking at the base impl, we can see that all of those
repeats follow from the conditions on the base impl. The resulting
chalk predicate would be:</p>
<pre tabindex="0"><code>// The fully general form of specializing impl is
// &gt; impl&lt;A,&#39;b,C,D&gt; Extend&lt;A, &amp;&#39;b [C]&gt; for Vec&lt;D&gt;
forall&lt;A, &#39;b, C, D&gt; {
  // Assuming the base impl applies:
  if (exists&lt;E, T&gt; { E = A, T = &amp;&#39;b [C], Vec&lt;D&gt; = Vec&lt;E&gt;, T: IntoIterator&lt;Item=E&gt; }) {
    // Can we prove the specializing impl unifications?
    exists&lt;&#39;a, E&gt; {
      E = A,
      &amp;&#39;a [E] = &amp;&#39;b [C],
      Vec&lt;E&gt; = Vec&lt;D&gt;,
    }
  }
} 
</code></pre><p>This predicate should be provable &ndash; but there is a definite catch.
At the moment, these kinds of predicates fall outside the &ldquo;Hereditary
Harrop&rdquo; (HH) predicates that Chalk can handle. HH predicates do not
permit existential quantification and equality predicates as
hypotheses (i.e., in an <code>if (C) { ... }</code>). I can however imagine some
quick-n-dirty extensions that would cover these particular cases, and
of course there are more powerful proving techniques out there that we
could tinker with (though I might prefer to avoid that).</p>
<h3 id="extension-reverse-implied-bounds-rules">Extension: Reverse implied bounds rules</h3>
<p>While the previous examples ought to be provable, there are some other
cases that won&rsquo;t work out without some further extension to Rust.
Consider this pair of impls:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nb">Clone</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nb">Clone</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Can we consider this second impl to be always applicable relative to
the first? Effectively this boils down to asking whether knowing
<code>Vec&lt;T&gt;: Clone</code> allows us to deduce that <code>T: Clone</code> &ndash; and right now, we can&rsquo;t
know that. The problem is that the impls we have only go one way.
That is, given the following impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nb">Clone</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>we get a program clause like</p>
<pre tabindex="0"><code>forall&lt;T&gt; {
  (Vec&lt;T&gt;: Clone) :- (T: Clone)
}
</code></pre><p>but we <em>need</em> the reverse:</p>
<pre tabindex="0"><code>forall&lt;T&gt; {
  (T: Clone) :- (Vec&lt;T&gt;: Clone)
}
</code></pre><p>This is basically an extension of implied bounds; but we&rsquo;d have to be careful.
If we just create those reverse rules for every impl, then it would mean that
removing a bound from an impl is a breaking change, and that&rsquo;d be a shame.</p>
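<p>To see the one-way nature of today&rsquo;s rule on stable Rust, here is a rough sketch (both function names are invented for this illustration). Assuming only <code>Vec&lt;T&gt;: Clone</code>, we can clone the vector, but the compiler will not let us clone an element, because nothing lets it conclude <code>T: Clone</code>:</p>

```rust
// Hypothetical function: relies only on the bound `Vec<T>: Clone`.
fn clone_vec<T>(v: &Vec<T>) -> Vec<T>
where
    Vec<T>: Clone,
{
    v.clone()
}

// This variant is rejected today, because `T: Clone` cannot be proven
// from `Vec<T>: Clone` (there is no "reverse" rule):
//
// fn clone_first<T>(v: &Vec<T>) -> T
// where
//     Vec<T>: Clone,
// {
//     T::clone(&v[0])
// }
```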
<p>We could address this in a few ways. The most obvious is that we might
permit people to annotate impls indicating that they represent minimal
conditions (i.e., that removing a bound is a breaking
change).</p>
<p>Alternatively, I feel like there is some sort of feature &ldquo;waiting&rdquo; out
there that lets us make richer promises about what sorts of trait
impls we might write in the future: this would be helpful also to
coherence, since knowing what impls will <em>not</em> be written lets us
permit more things in downstream crates.  (For example, it&rsquo;d be useful
to know that <code>Vec&lt;T&gt;</code> will <em>never</em> be <code>Copy</code>.)</p>
<h3 id="extension-designating-traits-as-specialization-predicates">Extension: Designating traits as &ldquo;specialization predicates&rdquo;</h3>
<p>However, even when we consider the base impl, and even if we have some
solution to reverse rules, we <em>still</em> can&rsquo;t cover the use case of
having &ldquo;overlapping blanket impls&rdquo;, like these two:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Skip</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Read</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Skip</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Read</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Seek</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here we have a trait <code>Skip</code> that (presumably) lets us skip forward in
a file.  We can supply one default implementation that works for any
reader, but it&rsquo;s inefficient: it would just read and discard N
bytes. It&rsquo;d be nice if we could provide a more efficient version for
those readers that implement <code>Seek</code>. Unfortunately, this second impl
is not <em>always applicable with respect to</em> the first impl &ndash; it adds a
new requirement, <code>T: Seek</code>, that does not follow from the bounds on
the first impl nor the implied bounds.</p>
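<p>For concreteness, here is a sketch of the two strategies written as free functions on stable Rust. The post does not spell out the signature of <code>Skip</code>, so the function names and parameters below are assumptions; only <code>Read</code>, <code>Seek</code>, and the <code>std::io</code> helpers are real.</p>

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Hypothetical stand-in for the base impl: works for any reader by
// reading up to `n` bytes and throwing them away.
// Returns the number of bytes actually discarded.
fn skip_by_reading<R: Read>(reader: &mut R, n: u64) -> io::Result<u64> {
    io::copy(&mut reader.by_ref().take(n), &mut io::sink())
}

// Hypothetical stand-in for the specializing impl: moves the cursor
// without touching the bytes in between.
// Returns the new position from the start of the stream.
fn skip_by_seeking<S: Seek>(stream: &mut S, n: i64) -> io::Result<u64> {
    stream.seek(SeekFrom::Current(n))
}
```

<p>With specialization, the goal would be for a single <code>skip</code> call to pick the second strategy automatically whenever the type also implements <code>Seek</code>.</p>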
<p>You might wonder why this is problematic in the first place. The danger is
that some other crate might have an impl for <code>Seek</code> that places lifetime constraints,
such as:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Seek</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">&#39;static</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now at code generation time, we won&rsquo;t be able to tell if that impl
applies, since we&rsquo;ll have erased the precise region.</p>
<p>However, what we <em>could</em> do is allow the <code>Seek</code> trait to be designated
as a <strong>specialization predicate</strong> (perhaps with an attribute like
<code>#[specialization_predicate]</code>). Traits marked as specialization
predicates would be limited so that every one of their impls must be
<em>always applicable</em> (our original predicate). This basically means
that, e.g., a &ldquo;reader&rdquo; cannot <em>conditionally</em> implement <code>Seek</code> &ndash; it
has to be always seekable, or never. When determining whether an impl
is <em>always applicable</em>, we can ignore where clauses that pertain to
<code>#[specialization_predicate]</code> traits.</p>
<p>Adding a <code>#[specialization_predicate]</code> attribute to an existing trait
would be a breaking change; removing it would be one too. However, it
would be possible to take existing traits and add &ldquo;specialization
predicate&rdquo; subtraits. For example, if the <code>Seek</code> trait already existed,
we might do this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Skip</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Read</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Skip</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Read</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">UnconditionalSeek</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[specialization_predicate]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">UnconditionalSeek</span>: <span class="nc">Seek</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">seek_predicate</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">n</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">seek</span><span class="p">(</span><span class="n">n</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now streams that implement seek unconditionally (probably all of them)
can add <code>impl UnconditionalSeek for MyStream { }</code> and get the
optimization.  Not as automatic as we might like, but could be worse.</p>
<h3 id="default-impls-need-not-be-always-applicable">Default impls need not be <em>always applicable</em></h3>
<p>This last example illustrates an interesting point. RFC 1210 described not
only specialization but also a more flexible form of defaults that go beyond
default methods in trait definitions. The idea was that you can define lots of defaults
using a <code>default impl</code>. So the <code>UnconditionalSeek</code> trait at the end of the last section
might also have been expressed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[specialization_predicate]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">UnconditionalSeek</span>: <span class="nc">Seek</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">default</span><span class="w"> </span><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Seek</span><span class="o">&gt;</span><span class="w"> </span><span class="n">UnconditionalSeek</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">seek_predicate</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">n</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">seek</span><span class="p">(</span><span class="n">n</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The interesting thing about default impls is that they are not (yet) a
full impl.  They only represent default methods that <em>real</em> impls can
draw upon, but users still have to write a real impl somewhere. This
means that they can be exempt from the rules about being <em>always
applicable</em> &ndash; those rules will be enforced at the real impl point.
Note for example that the default impl above is not <em>always applicable</em>,
as it depends on <code>Seek</code>, which is not an implied bound anywhere.</p>
<h3 id="conclusion">Conclusion</h3>
<p>I&rsquo;ve presented a refinement of specialization in which we impose one
extra condition on the specializing impl: not only must it be a subset
of the base impl(s) that it specializes, it must be <em>always
applicable</em>, which means basically that if we are given a set of types T where we know:</p>
<ul>
<li>the base impl was proven by the type checker to apply to T</li>
<li>the types T were proven by the type checker to be well-formed</li>
<li>and the specialized impl unifies with the lifetime-erased versions of T</li>
</ul>
<p>then we know that the specialized impl applies.</p>
<p>The beauty of this approach compared with past approaches is that it
preserves the existing role of the type checker and the code
generator. As today in Rust, the type checker always knows the full
region details, but the code generator can just ignore them, and still
be assured that all region data will be valid when it is accessed.</p>
<p>This implies for example that we don&rsquo;t need to impose the restrictions
that <a href="https://aturon.github.io/blog/2017/07/08/lifetime-dispatch/">aturon discussed in their blog post</a>: we can allow specialized
associated types to be resolved in full by the type checker as long as they are not marked
default, because there is no danger that the type checker and trans will come to different
conclusions.</p>
<h3 id="thoughts">Thoughts?</h3>
<p>I&rsquo;ve opened
<a href="https://internals.rust-lang.org/t/blog-post-maximally-minimal-specialization-always-applicable-impls/6739">an internals thread on this post</a>. I&rsquo;d
love to hear whether you see a problem with this approach. I&rsquo;d also
like to hear about use cases that you have for specialization that you
think may not fit into this approach.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>We don&rsquo;t say it so much anymore, but in the olden days of Rust, the phrase &ldquo;max min&rdquo; was very &ldquo;en vogue&rdquo;; I think we picked it up from some ES6 proposals about the class syntax.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Note: an impl is said to <em>specialize</em> itself.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Let me give a shout out here to scalexm, who recently <a href="https://github.com/rust-lang-nursery/chalk/pull/82">emerged with an elegant solution for how to model implied bounds in Chalk</a>.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/specialization" term="specialization" label="Specialization"/></entry><entry><title type="html">In Rust, ordinary vectors are values</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/02/01/in-rust-ordinary-vectors-are-values/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/02/01/in-rust-ordinary-vectors-are-values/</id><published>2018-02-01T00:00:00+00:00</published><updated>2018-02-01T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve been thinking a lot about persistent collections lately and
in particular how they relate to Rust, and I wanted to write up some
of my observations.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<h3 id="what-is-a-persistent-collection">What is a persistent collection?</h3>
<p>Traditionally, persistent collections are seen as this &ldquo;wildly
different&rdquo; way to set up your collection. Instead of having
methods like <code>push</code>, which grow a vector <strong>in place</strong>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">element</span><span class="p">);</span><span class="w"> </span><span class="c1">// add element to `vec`
</span></span></span></code></pre></div><p>you have a method like <code>add</code>, which leaves the original vector alone
but returns a <strong>new vector</strong> that has been modified:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">vec2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">vec</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">element</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>The key property here is that <code>vec</code> does not change. This makes
persistent collections a good fit for functional languages (as well
as, potentially, for parallelism).</p>
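<p>As a deliberately naive sketch of that interface (the free function below is an illustration, not the API of any particular library), <code>add</code> can be written for plain slices by copying everything. Real persistent collections avoid the full copy, but the observable behavior is the same: the original is left untouched.</p>

```rust
// Naive O(n) illustration of the persistent interface: `add` leaves its
// input alone and returns a modified copy.
fn add<T: Clone>(vec: &[T], element: T) -> Vec<T> {
    let mut new_vec = vec.to_vec();
    new_vec.push(element);
    new_vec
}
```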
<h3 id="how-do-persistent-collections-work">How do persistent collections work?</h3>
<p>I won&rsquo;t go into the details of any particular design, but most of them
are based around some kind of tree. For example, if you have a vector
like <code>[1, 2, 3, 4, 5, 6]</code>, you can imagine that instead of storing
those values as one big block, you store them in some kind of tree,
with the values at the leaves. In our diagram, the values are split into
two leaf nodes, and then there is a parent node with pointers to
those:</p>
<pre tabindex="0"><code> [*        *] // &lt;-- this parent node is the vector
  |        |
-----    -----
1 2 3    4 5 6
</code></pre><p>Now imagine that we want to mutate one of those values in the
vector. Say, we want to change the <code>6</code> to a <code>10</code>. This means we have
to change the right node, but we can keep using the left one. Then we also
have to re-create the parent node so that it can reference the new
right node.</p>
<pre tabindex="0"><code> [*        *]   // &lt;-- original vector
  |        |    //     (still exists, unchanged)
-----    -----
1 2 3    4 5 6
-----
  |      4 5 10 // &lt;-- new copy of the right node
  |      ------
  |        |
 [*        *]   // &lt;-- the new vector
</code></pre><p>Typically speaking, in a balanced sort of tree, this means that an
insert operation in a persistent vector tends to be O(log n) &ndash; we have
to clone some leaf and mutate it, and then we have to clone and mutate
all the parent nodes on the way up the tree. <strong>This is quite a bit
more expensive than mutating a traditional vector, which is just a
couple of CPU instructions.</strong></p>
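<p>The diagram can be sketched in a few lines of Rust. This is only a two-level toy (one parent node pointing at fixed-size leaves), and the type <code>PVec</code> and its methods are hypothetical names chosen for this example; a real design would use a deeper, balanced tree.</p>

```rust
use std::rc::Rc;

// Toy two-level persistent vector: the parent node is a list of
// reference-counted leaves. `leaf_len` must be non-zero.
struct PVec {
    leaves: Vec<Rc<Vec<i32>>>,
    leaf_len: usize,
}

impl PVec {
    fn from_slice(values: &[i32], leaf_len: usize) -> PVec {
        let leaves = values
            .chunks(leaf_len)
            .map(|chunk| Rc::new(chunk.to_vec()))
            .collect();
        PVec { leaves, leaf_len }
    }

    fn get(&self, index: usize) -> i32 {
        self.leaves[index / self.leaf_len][index % self.leaf_len]
    }

    // Returns a new vector; `self` is left unchanged.
    fn set(&self, index: usize, value: i32) -> PVec {
        // Re-create the parent node: this copies pointers, not leaf data.
        let mut leaves = self.leaves.clone();
        // Copy only the leaf that holds `index`, then modify the copy.
        let mut leaf = (*leaves[index / self.leaf_len]).clone();
        leaf[index % self.leaf_len] = value;
        leaves[index / self.leaf_len] = Rc::new(leaf);
        PVec { leaves, leaf_len: self.leaf_len }
    }
}
```

<p>Building <code>[1, 2, 3, 4, 5, 6]</code> with leaves of length 3 and then calling <code>set(5, 10)</code> gives a second vector that shares the left leaf with the first and owns a fresh copy of the right leaf.</p>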
<p>A couple of observations:</p>
<ul>
<li>If the vector is not <em>actually</em> aliased, and you <em>know</em> that it&rsquo;s
not aliased, you can often avoid these clones and just mutate the
tree in place. A bit later, I&rsquo;ll talk about an experimental,
Rust-based persistent collection library called <a href="https://docs.rs/dogged/0.2.0/dogged/struct.DVec.html"><code>DVec</code></a> which does
that. But this is hard in a typical GC-based language, since you
never know when you are aliased or not.</li>
<li>There are tons of other designs for persistent collections, some of
which are biased towards particular usage patterns. For example,
<a href="https://www.lri.fr/~filliatr/ftp/publis/puf-wml07.pdf">this paper</a> has a design oriented specifically towards
Prolog-like applications; this design uses mutation under the hood
to achieve O(1) insertion, but hides that from the user via the
interface. Of course, these cheap inserts come at a cost: older
copies of the data structure are expensive to use.</li>
</ul>
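<p>The first observation has a direct analogue in the standard library. The sketch below uses <code>Rc::make_mut</code>, which mutates in place when the reference count is one and clones otherwise. The helper <code>set_last</code> is a made-up name, and this is not a claim about how <code>DVec</code> is implemented.</p>

```rust
use std::rc::Rc;

// Hypothetical helper: overwrite the last element of a shared vector.
fn set_last(vec: &mut Rc<Vec<i32>>, value: i32) {
    // If `vec` is the only owner, this hands back the existing buffer.
    // If it is aliased, the contents are cloned first.
    let inner = Rc::make_mut(vec);
    if let Some(last) = inner.last_mut() {
        *last = value;
    }
}
```

<p>This works in Rust because the reference count tells us precisely whether the value is aliased, which is the information a typical GC-based language does not expose.</p>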
<h3 id="persistent-collections-makes-collections-into-values">Persistent collections make collections into values</h3>
<p>In some cases, persistent collections make your code easier to
understand.  The reason is that they act more like &ldquo;ordinary values&rdquo;,
without their own &ldquo;identity&rdquo;. Consider this JS code, which works with
integers:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-js" data-lang="js"><span class="line"><span class="cl"><span class="kd">function</span> <span class="nx">foo</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">let</span> <span class="nx">x</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kd">let</span> <span class="nx">y</span> <span class="o">=</span> <span class="nx">x</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="nx">y</span> <span class="o">+=</span> <span class="mi">1</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nx">y</span> <span class="o">-</span> <span class="nx">x</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>Here, when we modify <code>y</code>, we don&rsquo;t expect <code>x</code> to change. This is
because <code>x</code> is just a simple value. However, if we change to use an
array:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-js" data-lang="js"><span class="line"><span class="cl"><span class="kd">function</span> <span class="nx">foo</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">let</span> <span class="nx">x</span> <span class="o">=</span> <span class="p">[];</span>
</span></span><span class="line"><span class="cl">    <span class="kd">let</span> <span class="nx">y</span> <span class="o">=</span> <span class="nx">x</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="nx">y</span><span class="p">.</span><span class="nx">push</span><span class="p">(</span><span class="mi">22</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">    <span class="nx">use</span><span class="p">(</span><span class="nx">x</span><span class="p">,</span> <span class="nx">y</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>Now when I modify <code>y</code>, <code>x</code> changes too. This might be what I want, but
it might not be. And of course things can get even more confusing
when the vectors are hidden behind objects:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-js" data-lang="js"><span class="line"><span class="cl"><span class="kd">function</span> <span class="nx">foo</span><span class="p">()</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kd">let</span> <span class="nx">object</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">field</span><span class="o">:</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl">    <span class="p">};</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="kd">let</span> <span class="nx">object2</span> <span class="o">=</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">field</span><span class="o">:</span> <span class="nx">object</span><span class="p">.</span><span class="nx">field</span>
</span></span><span class="line"><span class="cl">    <span class="p">};</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl">    <span class="c1">// Now `object.field` and `object2.field` are
</span></span></span><span class="line"><span class="cl">    <span class="c1">// secretly linked behind the scenes.
</span></span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></div><p>Now, don&rsquo;t get me wrong, sometimes it&rsquo;s super handy that
<code>object.field</code> and <code>object2.field</code> are precisely the same vector, and
that changes to one will be reflected in the other. But other times,
it&rsquo;s not what you want; I&rsquo;ve often found that changing to use
persistent data structures can make my code cleaner and easier to
understand.</p>
<h3 id="rust-is-different">Rust is different</h3>
<p>If you&rsquo;ve ever seen one of my talks on Rust<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, you&rsquo;ll know that
they tend to hammer on a key theme of Rust&rsquo;s design:</p>
<blockquote>
<p>Sharing and mutation: good on their own, TERRIBLE together.</p>
</blockquote>
<p>Basically, the idea is that when you have two different ways to reach
the same memory (in our last example, <code>object.field</code> and
<code>object2.field</code>), then mutation becomes a very dangerous
prospect. This is particularly true when &ndash; as in Rust &ndash; you are
trying to forgo the use of a garbage collector, because suddenly it&rsquo;s
not clear who should be managing that memory. <strong>But it&rsquo;s true even
with a GC,</strong> because changes like <code>object.field.push(...)</code> may affect
more objects than you expected, leading to bugs (particularly, but not
exclusively, when working with parallel threads).</p>
<p>So what happens in Rust if we try to have two accesses to the same
vector, anyway? Let&rsquo;s go back to those JavaScript examples we just
saw, but this time in Rust. The first one, with integers, works just
the same as in JS:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">return</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>But the second example, with vectors, won&rsquo;t even compile:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">y</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">use</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR: use of moved value `x`
</span></span></span></code></pre></div><p>The problem is that once we do <code>y = x</code>, we have <strong>taken ownership</strong> of
<code>x</code>, and hence it can&rsquo;t be used anymore.</p>
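<p>If we do want to keep using both names, we have to ask for a copy explicitly. Here is a minimal sketch of that (the function name <code>clone_then_push</code> is made up for illustration):</p>

```rust
/// Returns the original vector together with an independently modified copy.
fn clone_then_push() -> (Vec<i32>, Vec<i32>) {
    let x: Vec<i32> = vec![];
    // `clone` makes an independent copy, so `x` is not moved...
    let mut y = x.clone();
    // ...and pushing onto `y` leaves `x` untouched.
    y.push(22);
    (x, y)
}
```

<p>Unlike the JavaScript version, the two vectors here never share storage, so the push is only visible through <code>y</code>.</p>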
<h3 id="in-rust-ordinary-vectors-are-values">In Rust, ordinary vectors are values</h3>
<p>This leads us to a conclusion. In Rust, the &ldquo;ordinary collections&rdquo;
that we use every day <strong>already act like values</strong>: in fact, so does
any Rust type that doesn&rsquo;t use a <code>Cell</code> or a <code>RefCell</code>. Put another
way, presuming your code compiles, you know that your vector isn&rsquo;t
being mutated from multiple paths: you could replace it with an
integer and it would behave the same. This is kind of neat.</p>
<p><strong>This implies to me that persistent collections in Rust don&rsquo;t
necessarily want to have a &ldquo;different interface&rdquo; than ordinary ones.</strong>
For example, as an experimental side project, I created a persistent
vector library called <a href="https://crates.io/crates/dogged">dogged</a><sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>. Dogged offers a vector type
called <a href="https://docs.rs/dogged/0.2.0/dogged/struct.DVec.html"><code>DVec</code></a>, which is based on the
<a href="http://hypirion.com/musings/understanding-persistent-vector-pt-1">persistent vectors offered by Clojure</a>. But if you look at
the methods that <a href="https://docs.rs/dogged/0.2.0/dogged/struct.DVec.html"><code>DVec</code></a> offers, you&rsquo;ll see they&rsquo;re kind of the
standard set (<code>push</code>, etc).</p>
<p>For example, this would be a valid use of a <code>DVec</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">DVec</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">something</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">something_else</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">for</span><span class="w"> </span><span class="n">element</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="o">&amp;</span><span class="n">x</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Nonetheless, a <code>DVec</code> <em>is</em> a persistent data structure. Under the
hood, a <code>DVec</code> is implemented as a <a href="https://en.wikipedia.org/wiki/Trie">trie</a>.  It contains an <a href="https://doc.rust-lang.org/std/sync/struct.Arc.html"><code>Arc</code></a>
(ref-counted value) that refers to its internal data. When you call
<code>push</code>, we will update that <code>Arc</code> to refer to the new vector, leaving
the old data in place.</p>
<p>(As an aside, <a href="https://doc.rust-lang.org/std/sync/struct.Arc.html#method.make_mut"><code>Arc::make_mut</code></a> is a <strong>really cool</strong> method. It
basically tests the reference count of your <code>Arc</code> and &ndash; if it is 1 &ndash;
gives you unique (mutable) access to the contents. If the reference
count is <strong>not</strong> 1, then it will clone the <code>Arc</code> (and its contents) in
place, and give you a mutable reference to that clone. If you
recall how persistent data structures tend to work, this is <em>perfect</em>
for updating a tree as you walk. It lets you avoid cloning in the case
where your collection is not yet aliased.)</p>
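<p>To make that concrete, here is a small sketch of both cases using only the standard library. The helper names <code>push_cow</code> and <code>demo</code> are hypothetical, and this is not code from dogged itself:</p>

```rust
use std::sync::Arc;

/// Pushes onto a shared vector, copying it first only if it is aliased.
fn push_cow(data: &mut Arc<Vec<i32>>, value: i32) {
    // `make_mut` hands back `&mut Vec<i32>`; if other `Arc`s point at the
    // same allocation, it first clones the vector into a fresh `Arc`.
    Arc::make_mut(data).push(value);
}

/// Returns (contents of `a`, contents of `b`, whether the first push
/// reused the original allocation).
fn demo() -> (Vec<i32>, Vec<i32>, bool) {
    let mut a = Arc::new(vec![1, 2]);
    let before = Arc::as_ptr(&a);
    push_cow(&mut a, 3); // reference count is 1: mutated in place
    let in_place = before == Arc::as_ptr(&a);

    let b = Arc::clone(&a); // reference count is now 2
    push_cow(&mut a, 4); // aliased: `a` gets its own copy, `b` is unchanged
    ((*a).clone(), (*b).clone(), in_place)
}
```

<p>The second push is the interesting one: <code>b</code> keeps seeing the old contents, which is exactly the &ldquo;leave the old data in place&rdquo; behavior a persistent structure wants.</p>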
<h3 id="but-persistent-collections-are-different">But persistent collections <em>are</em> different</h3>
<p>The main difference then between a <code>Vec</code> and a <code>DVec</code> lies not in the
operations they offer, but in <strong>how much they cost</strong>. That is, when you
<code>push</code> on a standard <code>Vec</code>, it is an amortized O(1) operation. But when you
clone, that is O(n). For a <code>DVec</code>, those costs are sort of inverted:
pushing is O(log n), but cloning is O(1).</p>
<p><strong>In particular, with a <code>DVec</code>, the <code>clone</code> operation just increments
a reference count on the internal <code>Arc</code>, whereas with an ordinary
vector, <code>clone</code> must clone all of the data.</strong> But, of course, when you do
a <code>push</code> on a <code>DVec</code>, it will clone some portion of the data as it
rebuilds the affected parts of the tree (whereas a <code>Vec</code> typically can
just write into the end of the array).</p>
<p>But this &ldquo;big O&rdquo; notation, as everyone knows, only talks about
asymptotic behavior. One problem I&rsquo;ve seen with <code>DVec</code> is that it&rsquo;s
pretty tough to compete with the standard <code>Vec</code> in terms of raw
performance. It&rsquo;s often just faster to copy a whole bunch of data than
to deal with updating trees and allocating memory. I&rsquo;ve found you have
to go to pretty extreme lengths to justify using a <code>DVec</code> &ndash; e.g.,
making tons of clones and things, and having a lot of data.</p>
<p>And, of course, it&rsquo;s not all about performance. If you are doing a
lot of clones, then a <code>DVec</code> ought to use less memory as well, since
they can share a lot of representation.</p>
<h3 id="conclusion">Conclusion</h3>
<p>I&rsquo;ve tried to illustrate here how Rust&rsquo;s ownership system offers an
intriguing blend of functional and imperative styles, through the lens
of persistent collections. <strong>That is, Rust&rsquo;s standard collections,
while implemented in the typical imperative way, actually act as if
they are &ldquo;values&rdquo;</strong>: when you assign a vector from one place to
another, if you want to keep using the original, you must <code>clone</code> it,
and that makes the new copy independent from the old one.</p>
<p>This is not a new observation. For example, in 1990, Phil Wadler wrote
a paper entitled <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.55.5439&amp;rep=rep1&amp;type=pdf">&ldquo;Linear Types Can Change The World!&rdquo;</a> in
which he makes basically the exact same point, though from the
inverted perspective. Here he is saying that you can still offer a
persistent interface (e.g., a method <code>vec.add(element)</code> that returns a
new vector), but if you use linear types, you can secretly implement
it in terms of an imperative data structure (e.g.,
<code>vec.push(element)</code>) and nobody has to know.</p>
<p>In playing with <code>DVec</code>, I&rsquo;ve already found it very useful to have a
persistent vector that offers the same interface as a regular one. For
example, I was able to very easily modify the
<a href="https://crates.io/crates/ena">ena unification library</a> (which is based on a vector under the
hood) to act in either <a href="https://docs.rs/ena/0.8.0/src/ena/unify/mod.rs.html#188">persistent mode</a> (using <code>DVec</code>) or
<a href="https://docs.rs/ena/0.8.0/src/ena/unify/mod.rs.html#185">imperative mode</a> (using <code>Vec</code>). Basically the idea is to be generic
over the exact vector type, which is easy since they both offer the
same interface.</p>
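<p>A rough sketch of what &ldquo;generic over the exact vector type&rdquo; can look like follows. To be clear, this is not ena&rsquo;s actual code: the trait <code>VecLike</code>, the type <code>CowVec</code>, and the function <code>snapshot_then_push</code> are all names I made up, and <code>CowVec</code> is a simple copy-on-write stand-in rather than a trie like <code>DVec</code>:</p>

```rust
use std::sync::Arc;

/// Hypothetical trait capturing the small interface both vectors share.
trait VecLike<T>: Clone {
    fn new() -> Self;
    fn push(&mut self, value: T);
    fn len(&self) -> usize;
}

impl<T: Clone> VecLike<T> for Vec<T> {
    fn new() -> Self { Vec::new() }
    fn push(&mut self, value: T) { Vec::push(self, value) }
    fn len(&self) -> usize { Vec::len(self) }
}

/// Stand-in for a persistent vector: O(1) clone, copy-on-write push.
/// (A real `DVec` uses a trie, so a push copies only O(log n) data.)
#[derive(Clone)]
struct CowVec<T>(Arc<Vec<T>>);

impl<T: Clone> VecLike<T> for CowVec<T> {
    fn new() -> Self { CowVec(Arc::new(Vec::new())) }
    fn push(&mut self, value: T) { Arc::make_mut(&mut self.0).push(value) }
    fn len(&self) -> usize { self.0.len() }
}

/// Code written once against the trait works in either "mode".
/// Returns the lengths of the snapshot and of the current vector.
fn snapshot_then_push<V: VecLike<u32>>() -> (usize, usize) {
    let mut current = V::new();
    current.push(1);
    let snapshot = current.clone(); // cheap for `CowVec`, O(n) for `Vec`
    current.push(2);
    (snapshot.len(), current.len())
}
```

<p>Calling <code>snapshot_then_push::&lt;Vec&lt;u32&gt;&gt;()</code> or <code>snapshot_then_push::&lt;CowVec&lt;u32&gt;&gt;()</code> gives the same answer; only the cost of the <code>clone</code> and the following <code>push</code> differs.</p>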
<p>(As an aside, I&rsquo;d love to see some more experimentation here. For
example, I think it could be really useful to have a vector that
starts out as an ordinary vector, but changes to a persistent one
after a certain length.)</p>
<p>That said, I think there is another reason that some have taken
interest in persistent collections for Rust <em>specifically</em>. That is,
while simultaneous sharing and mutation can be a risky pattern, it is
sometimes a necessary and <em>dang useful</em> one, and Rust currently makes
it kind of unergonomic. <strong>I do think we should do things to improve
this situation, and I have some specific thoughts</strong><sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>, but I
think that persistent vs imperative collections are kind of a
non sequitur here. Put another way, Rust already <em>has</em> persistent
collections, they just have a particularly inefficient <code>clone</code>
operation.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>As it happens, the SLG solver that I wrote about before seems like it would really like to use persistent collections.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>If you haven&rsquo;t, I thought <a href="https://www.sics.se/nicholas-matsakis">this one</a> went pretty well.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>In English, if you are &ldquo;dogged&rdquo; in pursuing your goals, you are persistent.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Specific thoughts that will have to wait until the next blog post. Time to get my daughter up and ready for school!&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">An on-demand SLG solver for chalk</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/01/31/an-on-demand-slg-solver-for-chalk/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/01/31/an-on-demand-slg-solver-for-chalk/</id><published>2018-01-31T00:00:00+00:00</published><updated>2018-01-31T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In my last Chalk post, I talked about an experimental, SLG-based
solver that I wrote for Chalk. That particular design was based very
closely on the excellent paper
<a href="http://www.sciencedirect.com/science/article/pii/0743106694000285">&ldquo;Efficient top-down computation of queries under the well-founded semantics&rdquo;, by W. Chen, T. Swift, and D. Warren</a>. It
followed a traditional Prolog execution model: this has a lot of
strengths, but it probably wasn&rsquo;t really suitable for use in rustc.
The single biggest reason for this was that it didn&rsquo;t really know when
to stop: given a query like <code>exists&lt;T&gt; { T: Sized }</code>, it would happily
try to enumerate all sized types in the system. It was also pretty
non-obvious to me how to extend that system with things like
co-inductive predicates (needed for auto traits) and a few other
peculiarities of Rust.</p>
<p>In the last few days, I&rsquo;ve implemented a second SLG-based solver for
Chalk. This one follows a rather different design. It&rsquo;s kind of a
hybrid of Chalk&rsquo;s traditional &ldquo;recursive&rdquo; solver and the SLG-based
one, with a lot of influence from <a href="http://minikanren.org/">MiniKanren</a>. I think it&rsquo;s getting
a lot closer to the sort of solver we could use in rustc.</p>
<p>One key aspect of its design is that it is &ldquo;on-demand&rdquo; &ndash; that is, it
tries to only do as much work as it needs to produce the next
answer, and then stops. This means that we can generally stop it from
doing silly things like iterating over every type in the
system<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>.</p>
<p>It also works in a &ldquo;breadth-first fashion&rdquo;. This means that, for
example, it would rather produce a series of answers like
<code>[Vec&lt;?T&gt;, Rc&lt;?T&gt;, Box&lt;?T&gt;, ...]</code> than to go deep and give answers
like <code>[Vec&lt;?T&gt;, Vec&lt;Vec&lt;?T&gt;&gt;, Vec&lt;Vec&lt;Vec&lt;?T&gt;&gt;&gt;, ...]</code>. This is
particularly useful when combined with on-demand solving, since it
helps us to quickly see ambiguity and stop enumerating answers.</p>
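<p>One way to picture why on-demand solving helps here: if answers arrive as a lazy stream, a caller can pull at most two and then stop. The sketch below assumes answers are exposed as an ordinary Rust iterator, which is my simplification; <code>Outcome</code> and <code>classify</code> are hypothetical names, not part of Chalk:</p>

```rust
/// What a caller learns after pulling at most two answers.
#[derive(Debug, PartialEq)]
enum Outcome<T> {
    NoSolution,
    Unique(T),
    Ambiguous,
}

/// Pulls answers lazily and stops as soon as ambiguity is evident.
fn classify<T>(mut answers: impl Iterator<Item = T>) -> Outcome<T> {
    let first = match answers.next() {
        Some(answer) => answer,
        None => return Outcome::NoSolution,
    };
    // A second answer is enough to know the query is ambiguous; the
    // remaining (possibly infinite) answers are never computed.
    match answers.next() {
        Some(_) => Outcome::Ambiguous,
        None => Outcome::Unique(first),
    }
}
```

<p>Even an unbounded stream such as <code>0u32..</code> is classified after two calls to <code>next</code>.</p>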
<h3 id="details-of-how-it-works">Details of how it works</h3>
<p>As part of the <a href="https://github.com/rust-lang-nursery/chalk/pull/76">PR</a>, I wrote up a <a href="https://github.com/nikomatsakis/chalk-ndm/blob/64964db637c1ea63ecb0234326f9b57b3a9e55cb/src/solve/slg/on_demand/README.md">README</a> that tries to walk
through how query solving works in the new solver. I thought I&rsquo;d paste
that here into this blog post.</p>
<p>The basis of the solver is the <code>Forest</code>
type. A <em>forest</em> stores a collection of <em>tables</em> as well as a
<em>stack</em>. Each <em>table</em> represents the stored results of a particular
query that is being performed, as well as the various <em>strands</em>, which
are basically suspended computations that may be used to find more
answers. Tables are interdependent: solving one query may require
solving others.</p>
<p>Perhaps the easiest way to explain how the solver works is to walk
through an example. Let&rsquo;s imagine that we have the following program:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="kt">u32</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Debug</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Debug</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now imagine that we want to find answers for the query <code>exists&lt;T&gt; { Rc&lt;T&gt;: Debug }</code>. The first step would be to u-canonicalize this query; this
is the act of giving canonical names to all the unbound inference variables based on the
order of their left-most appearance, as well as canonicalizing the universes of any
universally bound names (e.g., the <code>T</code> in <code>forall&lt;T&gt; { ... }</code>). In this case, there are no
universally bound names, but the canonical form Q of the query might look something like:</p>
<pre><code>Rc&lt;?0&gt;: Debug
</code></pre>
<p>where <code>?0</code> is a variable in the root universe U0. We would then go and
look for a table with this as the key: since the forest is empty, this
lookup will fail, and we will create a new table T0, corresponding to
the u-canonical goal Q.</p>
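<p>The variable-renaming half of that step can be sketched in a few lines. This is a toy model, not Chalk&rsquo;s representation: the <code>Ty</code> enum and the <code>canonicalize</code> function are invented for illustration, and universes are ignored entirely:</p>

```rust
use std::collections::HashMap;

/// A toy type: an inference variable, a canonical variable, or a
/// type constructor applied to arguments.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Infer(String),          // e.g. `?T`
    Canon(usize),           // e.g. `?0`
    Apply(String, Vec<Ty>), // e.g. `Rc<...>` or `u32`
}

/// Renames inference variables to `?0`, `?1`, ... in order of their
/// left-most appearance; `seen` remembers the numbers handed out so far.
fn canonicalize(ty: &Ty, seen: &mut HashMap<String, usize>) -> Ty {
    match ty {
        Ty::Infer(name) => {
            let next = seen.len();
            Ty::Canon(*seen.entry(name.clone()).or_insert(next))
        }
        Ty::Canon(index) => Ty::Canon(*index),
        Ty::Apply(name, args) => Ty::Apply(
            name.clone(),
            // Arguments are visited left to right, which fixes the numbering.
            args.iter().map(|arg| canonicalize(arg, seen)).collect(),
        ),
    }
}
```

<p>Because the numbering depends only on where variables appear, two goals that differ just in the names of their inference variables end up with the same canonical form, and hence the same table key.</p>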
<p><strong>Creating a table.</strong> When we first create a table, we also initialize
it with a set of <em>initial strands</em>. A &ldquo;strand&rdquo; is kind of like a
&ldquo;thread&rdquo; for the solver: it contains a particular way to produce an
answer. The initial set of strands for a goal like <code>Rc&lt;?0&gt;: Debug</code>
(i.e., a &ldquo;domain goal&rdquo;) is determined by looking for <em>clauses</em> in the
environment. In Rust, these clauses derive from impls, but also from
where-clauses that are in scope. In the case of our example, there
would be three clauses, each coming from the program. Using a
Prolog-like notation, these look like:</p>
<pre tabindex="0"><code>(u32: Debug).
(Rc&lt;T&gt;: Debug) :- (T: Debug).
(Vec&lt;T&gt;: Debug) :- (T: Debug).
</code></pre><p>To create our initial strands, then, we will try to apply each of
these clauses to our goal of <code>Rc&lt;?0&gt;: Debug</code>. The first and third
clauses are inapplicable because <code>u32</code> and <code>Vec&lt;?0&gt;</code> cannot be unified
with <code>Rc&lt;?0&gt;</code>. The second clause, however, will work.</p>
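<p>Here is a rough sketch of that filtering step. It is deliberately cruder than real unification: <code>may_unify</code> only checks that the shapes are compatible (it does not track bindings or do an occurs check), and <code>Ty</code>, <code>may_unify</code>, and <code>applicable_heads</code> are all hypothetical names:</p>

```rust
/// A toy type: a variable or a type constructor applied to arguments.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Var(&'static str),
    Apply(&'static str, Vec<Ty>),
}

/// Shape check: could `a` and `b` be made equal by assigning variables?
/// This over-approximates unification because it ignores variable bindings.
fn may_unify(a: &Ty, b: &Ty) -> bool {
    match (a, b) {
        (Ty::Var(_), _) | (_, Ty::Var(_)) => true,
        (Ty::Apply(n1, args1), Ty::Apply(n2, args2)) => {
            n1 == n2
                && args1.len() == args2.len()
                && args1.iter().zip(args2).all(|(x, y)| may_unify(x, y))
        }
    }
}

/// Keeps only the clause heads that might apply to the goal's self type.
fn applicable_heads<'a>(goal: &Ty, heads: &'a [Ty]) -> Vec<&'a Ty> {
    heads.iter().filter(|head| may_unify(goal, head)).collect()
}
```

<p>With heads for <code>u32</code>, <code>Rc&lt;T&gt;</code>, and <code>Vec&lt;T&gt;</code>, a goal of <code>Rc&lt;?0&gt;</code> keeps only the <code>Rc</code> head, while a bare variable goal keeps all three.</p>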
<p><strong>What is a strand?</strong> Let&rsquo;s talk a bit more about what a strand <em>is</em>. In the code, a strand
is the combination of an inference table, an X-clause, and (possibly)
a selected subgoal from that X-clause. But what is an X-clause
(<code>ExClause</code>, in the code)? An X-clause pulls together a few things:</p>
<ul>
<li>The current state of the goal we are trying to prove;</li>
<li>A set of subgoals that have yet to be proven;</li>
<li>A set of delayed literals that we will have to revisit later;
<ul>
<li>(I&rsquo;ll ignore these for now; they are only needed to handle loops between negative goals.)</li>
</ul>
</li>
<li>A set of region constraints accumulated thus far.
<ul>
<li>(I&rsquo;ll ignore these too for now; we&rsquo;ll cover regions later on.)</li>
</ul>
</li>
</ul>
<p>The general form of an X-clause is written much like a Prolog clause,
but with somewhat different semantics:</p>
<pre><code>G :- D | L
</code></pre>
<p>where G is a goal, D is a set of delayed literals, and L is the set of
literals that must be proven (in the general case, these can be both a
goal like G but also a negated goal like <code>not { G }</code>). The idea is
that &ndash; if we are able to prove L and D &ndash; then the goal G can be
considered true.</p>
<p>In the case of our example, we would wind up creating one strand, with
an X-clause like so:</p>
<pre><code>(Rc&lt;?T&gt;: Debug) :- (?T: Debug)
</code></pre>
<p>Here, the <code>?T</code> refers to one of the inference variables created in the
inference table that accompanies the strand. (I&rsquo;ll use named variables
to refer to inference variables, and numbered variables like <code>?0</code> to
refer to variables in a canonicalized goal; in the code, however, they
are both represented with an index.)</p>
<p>For each strand, we also optionally store a <em>selected subgoal</em>. This
is the literal after the turnstile (<code>:-</code>) that we are currently trying
to prove in this strand. Initially, when a strand is first created,
there is no selected subgoal.</p>
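<p>Putting the pieces of a strand together, a simplified data model might look like the following. Only the name <code>ExClause</code> comes from the code; the field names, the use of plain strings for goals, and the helper <code>new_strand</code> are assumptions made for this sketch, and the inference table is left out:</p>

```rust
/// A literal: a goal to be proven, or a negated goal (`not { G }`).
#[derive(Clone, Debug, PartialEq)]
enum Literal {
    Positive(String),
    Negative(String),
}

/// Simplified X-clause `G :- D | L`; goals are plain strings in this sketch.
#[derive(Clone, Debug, PartialEq)]
struct ExClause {
    goal: String,                    // G: current state of the goal
    delayed_literals: Vec<Literal>,  // D: to be revisited later
    subgoals: Vec<Literal>,          // L: literals still to be proven
    region_constraints: Vec<String>, // constraints accumulated so far
}

/// A strand pairs an X-clause with an optional selected subgoal.
/// (The real strand also owns an inference table, omitted here.)
#[derive(Clone, Debug, PartialEq)]
struct Strand {
    ex_clause: ExClause,
    /// Index into `ex_clause.subgoals`, plus the answer number awaited.
    selected_subgoal: Option<(usize, usize)>,
}

/// Builds a freshly created strand: nothing delayed, nothing selected yet.
fn new_strand(goal: &str, subgoals: &[&str]) -> Strand {
    Strand {
        ex_clause: ExClause {
            goal: goal.to_string(),
            delayed_literals: Vec::new(),
            subgoals: subgoals
                .iter()
                .map(|s| Literal::Positive(s.to_string()))
                .collect(),
            region_constraints: Vec::new(),
        },
        selected_subgoal: None,
    }
}
```

<p>The strand from our example would be <code>new_strand("Rc&lt;?T&gt;: Debug", &amp;["?T: Debug"])</code>.</p>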
<p><strong>Activating a strand.</strong> Now that we have created the table T0 and
initialized it with strands, we have to actually try and produce an
answer. We do this by invoking the <code>ensure_answer</code> operation on the
table: specifically, we say <code>ensure_answer(T0, A0)</code>, meaning &ldquo;ensure
that there is a 0th answer&rdquo;.</p>
<p>Remember that tables store not only strands, but also a vector of
cached answers. The first thing that <code>ensure_answer</code> does is to check
whether answer 0 is in this vector. If so, we can just return
immediately.  In this case, the vector will be empty, and hence that
does not apply (this becomes important for cyclic checks later on).</p>
<p>When there is no cached answer, <code>ensure_answer</code> will try to produce
one.  It does this by selecting a strand from the set of active
strands &ndash; the strands are stored in a <code>VecDeque</code> and hence processed
in a round-robin fashion. Right now, we have only one strand, storing
the following X-clause with no selected subgoal:</p>
<pre><code>(Rc&lt;?T&gt;: Debug) :- (?T: Debug)
</code></pre>
<p>When we activate the strand, we see that we have no selected subgoal,
and so we first pick one of the subgoals to process. Here, there is only
one (<code>?T: Debug</code>), so that becomes the selected subgoal, changing
the state of the strand to:</p>
<pre><code>(Rc&lt;?T&gt;: Debug) :- selected(?T: Debug, A0)
</code></pre>
<p>Here, we write <code>selected(L, An)</code> to indicate that (a) the literal <code>L</code>
is the selected subgoal and (b) which answer <code>An</code> we are looking for. We
start out looking for <code>A0</code>.</p>
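<p>The first part of <code>ensure_answer</code> (check the cache, otherwise pick a strand and give it a selected subgoal) can be sketched roughly as below. This is not the real implementation: <code>Step</code> and <code>ensure_answer_step</code> are hypothetical names, goals are plain strings, and always selecting the first subgoal is an assumption of the sketch:</p>

```rust
use std::collections::VecDeque;

struct Strand {
    goal: String,
    subgoals: Vec<String>,
    /// (index into `subgoals`, answer number we are waiting for)
    selected: Option<(usize, usize)>,
}

struct Table {
    answers: Vec<String>,
    strands: VecDeque<Strand>,
}

enum Step {
    Cached(String),    // the requested answer was already in the table
    Activated(Strand), // a strand was taken from the front of the queue
    NoMoreStrands,     // nothing left that could produce an answer
}

/// One step of a simplified `ensure_answer(table, answer)`.
fn ensure_answer_step(table: &mut Table, answer: usize) -> Step {
    // Cached answers are returned immediately.
    if let Some(cached) = table.answers.get(answer) {
        return Step::Cached(cached.clone());
    }
    // Otherwise take the next strand; a caller that re-queues it with
    // `push_back` gets round-robin processing.
    match table.strands.pop_front() {
        None => Step::NoMoreStrands,
        Some(mut strand) => {
            if strand.selected.is_none() && !strand.subgoals.is_empty() {
                // Select the first subgoal and start by asking for answer A0.
                strand.selected = Some((0, 0));
            }
            Step::Activated(strand)
        }
    }
}
```

<p>For table T0 this yields the strand with <code>selected(?T: Debug, A0)</code>, matching the state shown above.</p>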
<p><strong>Processing the selected subgoal.</strong> Next, we have to try and find an
answer to this selected goal. To do that, we will u-canonicalize it
and try to find an associated table. In this case, the u-canonical
form of the subgoal is <code>?0: Debug</code>: we don&rsquo;t have a table yet for
that, so we can create a new one, T1. As before, we&rsquo;ll initialize T1
with strands. In this case, there will be three strands, because all
the program clauses are potentially applicable. Those three strands
will be:</p>
<ul>
<li><code>(u32: Debug) :-</code>, derived from the program clause <code>(u32: Debug).</code>.
<ul>
<li>Note: This strand has no subgoals.</li>
</ul>
</li>
<li><code>(Vec&lt;?U&gt;: Debug) :- (?U: Debug)</code>, derived from the <code>Vec</code> impl.</li>
<li><code>(Rc&lt;?V&gt;: Debug) :- (?V: Debug)</code>, derived from the <code>Rc</code> impl.</li>
</ul>
<p>We can thus summarize the state of the whole forest at this point as
follows:</p>
<pre tabindex="0"><code>Table T0 [Rc&lt;?0&gt;: Debug]
  Strands:
    (Rc&lt;?T&gt;: Debug) :- selected(?T: Debug, A0)
  
Table T1 [?0: Debug]
  Strands:
    (u32: Debug) :-
    (Vec&lt;?U&gt;: Debug) :- (?U: Debug)
    (Rc&lt;?V&gt;: Debug) :- (?V: Debug)
</code></pre><p><strong>Delegation between tables.</strong> Now that the active strand from T0 has
created the table T1, it can try to extract an answer. It does this
via that same <code>ensure_answer</code> operation we saw before. In this case,
the strand would invoke <code>ensure_answer(T1, A0)</code>, since we will start
with the first answer. This will cause T1 to activate its first
strand, <code>u32: Debug :-</code>.</p>
<p>This strand is somewhat special: it has no subgoals at all. This means
that the goal is proven. We can therefore add <code>u32: Debug</code> to the set
of <em>answers</em> for our table, calling it answer A0 (it is the first
answer). The strand is then removed from the list of strands.</p>
<p>The state of table T1 is therefore:</p>
<pre tabindex="0"><code>Table T1 [?0: Debug]
  Answers:
    A0 = [?0 = u32]
  Strands:
    (Vec&lt;?U&gt;: Debug) :- (?U: Debug)
    (Rc&lt;?V&gt;: Debug) :- (?V: Debug)
</code></pre><p>Note that I am writing out the answer A0 as a substitution that can be
applied to the table goal; actually, in the code, the goals for each
X-clause are also represented as substitutions, but in this exposition
I&rsquo;ve chosen to write them as full goals, following the NFTD paper (&ldquo;A New Formulation of Tabled Resolution with Delay&rdquo;).</p>
<p>Since we now have an answer, <code>ensure_answer(T1, A0)</code> will return <code>Ok</code>
to the table T0, indicating that answer A0 is available. T0 now has
the job of incorporating that result into its active strand. It does
this in two ways. First, it creates a new strand that is looking for
the next possible answer of T1. Next, it incorporates the answer from
A0 and removes the subgoal. The resulting state of table T0 is:</p>
<pre tabindex="0"><code>Table T0 [Rc&lt;?0&gt;: Debug]
  Strands:
    (Rc&lt;?T&gt;: Debug) :- selected(?T: Debug, A1)
    (Rc&lt;u32&gt;: Debug) :-
</code></pre><p>We then immediately activate the strand that incorporated the answer
(the <code>Rc&lt;u32&gt;: Debug</code> one). In this case, that strand has no further
subgoals, so it becomes an answer to the table T0. This answer can
then be returned up to our caller, and the whole forest goes quiescent
at this point (remember, we only do enough work to generate <em>one</em>
answer). The ending state of the forest at this point will be:</p>
<pre tabindex="0"><code>Table T0 [Rc&lt;?0&gt;: Debug]
  Answers:
    A0 = [?0 = u32]
  Strands:
    (Rc&lt;?T&gt;: Debug) :- selected(?T: Debug, A1)

Table T1 [?0: Debug]
  Answers:
    A0 = [?0 = u32]
  Strands:
    (Vec&lt;?U&gt;: Debug) :- (?U: Debug)
    (Rc&lt;?V&gt;: Debug) :- (?V: Debug)
</code></pre><p>Here you can see how the forest captures both the answers we have
created thus far <em>and</em> the strands that will let us try to produce
more answers later on.</p>
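<p>The step where T0 incorporates an answer from T1 is worth a small sketch, since it is what turns one strand into two. Everything here is a simplification I&rsquo;m assuming for illustration: goals are strings, applying the answer is done by textual substitution rather than through an inference table, and <code>Strand</code> and <code>incorporate_answer</code> are hypothetical names:</p>

```rust
#[derive(Clone, Debug, PartialEq)]
struct Strand {
    goal: String,
    /// The inference variable bound by the selected subgoal, e.g. "?T".
    variable: String,
    /// (selected subgoal, answer number being awaited)
    selected: Option<(String, usize)>,
}

/// Splits a strand when answer `value` arrives for its selected subgoal:
/// one copy keeps waiting for the next answer, and the other has the
/// answer applied and the subgoal removed.
fn incorporate_answer(strand: &Strand, value: &str) -> (Strand, Strand) {
    let (subgoal, n) = strand
        .selected
        .clone()
        .expect("strand must have a selected subgoal");
    let waiting = Strand {
        selected: Some((subgoal, n + 1)),
        ..strand.clone()
    };
    let resolved = Strand {
        goal: strand.goal.replace(strand.variable.as_str(), value),
        variable: strand.variable.clone(),
        selected: None,
    };
    (waiting, resolved)
}
```

<p>Applied to <code>(Rc&lt;?T&gt;: Debug) :- selected(?T: Debug, A0)</code> with the answer <code>u32</code>, this gives the two strands shown for T0 above: one waiting on A1, and <code>(Rc&lt;u32&gt;: Debug) :-</code>.</p>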
<h3 id="conclusions">Conclusions</h3>
<p>Well, the README stops the story a bit short &ndash; it doesn&rsquo;t explain,
for example, what happens when there are cycles in the graph and so
forth. Maybe you can piece it together, though.</p>
<p>The biggest question is: is this a suitable architecture for use in
rustc? About this, I&rsquo;m not sure yet. I feel like this route is quite
promising, however, and it&rsquo;s been an interesting journey for me in any
case thus far.</p>
<p>One of the tricky things that I don&rsquo;t yet know how to resolve: under
the current setup, if our root query is generating a diverse set of
answers, we can quite easily stop asking for more (e.g., to handle
<code>exists&lt;T&gt; { T: Sized }</code>). I think this is by far the more common
scenario in Rust. However, it&rsquo;s also possible to have a query which
<em>internally</em> has to go through quite a few answers in order to produce
any results at the root level. I&rsquo;m imagining something like this:</p>
<pre tabindex="0"><code>impl&lt;T&gt; Foo for T
   where T: Bar, T: Baz,
</code></pre><p>Under the setup described here, one of these queries &ndash; let&rsquo;s say <code>T: Bar</code> &ndash; gets chosen somewhat arbitrarily to begin producing answers
first. It might produce a very large number of answers, which will
then get &ldquo;fed&rdquo; to the <code>Baz</code> trait, which will effectively filter them
out. But maybe <code>T: Baz</code> is only implemented for a very few types, so
if we had chosen the other order things would have been far more
efficient. I can imagine some heuristics helping here &ndash; for example,
we might identify traits like <code>Sized</code> or <code>Debug</code>, or others that have very
open-ended impls, and prefer not to select them first. I <em>suspect</em> a few
simple heuristics would get us quite far.</p>
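<p>As a rough illustration of what such a heuristic could look like, here is a small Rust sketch that orders subgoals so that the most selective trait is solved first. Everything in it is an assumption made for the example: the <code>Selectivity</code> estimate, the <code>Subgoal</code> type, and <code>order_subgoals</code> are hypothetical names, and nothing like this exists in the solver described here.</p>

```rust
/// Hypothetical estimate of how many types implement a trait.
/// The derived ordering places `Few(n)` before `OpenEnded`, and
/// compares two `Few` values by their counts.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Selectivity {
    /// Only a small, known number of impls exist.
    Few(usize),
    /// Blanket impls or very widely implemented traits.
    OpenEnded,
}

/// A subgoal in a where-clause, tagged with its estimate.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Subgoal {
    trait_name: &'static str,
    selectivity: Selectivity,
}

/// Put the most selective subgoals first, so that open-ended traits
/// only filter the few candidates produced by the earlier subgoals.
fn order_subgoals(mut subgoals: Vec<Subgoal>) -> Vec<Subgoal> {
    // `sort_by_key` is stable, so ties keep their source order.
    subgoals.sort_by_key(|goal| goal.selectivity);
    subgoals
}
```

<p>With <code>T: Bar</code> marked open-ended and <code>T: Baz</code> marked as having only a few impls, this ordering would start from <code>Baz</code>. How to obtain a good estimate in practice is left open here.</p>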
<p>Currently, my biggest concern with this design is the &ldquo;runaway
internal query&rdquo; aspect I just described. But I&rsquo;m curious if there are
other things I&rsquo;m overlooking! As ever,
<a href="https://internals.rust-lang.org/t/blog-post-an-on-demand-slg-solver-for-chalk/6676">I&rsquo;ve created an internals thread</a>, please leave comments
there if you have thoughts (also suggestions for things I should go
and read).</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>The existing implementation could do better here, but the ingredients are there.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/><category scheme="https://smallcultfollowing.com/babysteps/categories/chalk" term="chalk" label="Chalk"/><category scheme="https://smallcultfollowing.com/babysteps/categories/pl" term="pl" label="PL"/></entry><entry><title type="html">#Rust2018</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/01/09/rust2018/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/01/09/rust2018/</id><published>2018-01-09T00:00:00+00:00</published><updated>2018-01-09T00:00:00+00:00</updated><content type="html"><![CDATA[<p>As part of #Rust2018, I thought I would try to writeup my own
(current) perspective. I&rsquo;ll try to keep things brief.</p>
<p>First and foremost, I think that this year we have to <strong>finish what we
started and get the &ldquo;Rust 2018&rdquo; release out the door</strong>. We did good
work in 2017: now we have to make sure the world knows it and can use
it. This primarily means we have to do stabilization work, both for
the recent features added in 2017 as well as some, ahem,
longer-running topics, like SIMD. It also means keeping up our focus
on tooling, like IDE support, rustfmt, and debugger integration.</p>
<p>Looking beyond the Rust 2018 release, <strong>we need to continue to improve
Rust&rsquo;s learning curve</strong>. This means language changes, yes, but also
improvements in tooling, error messages, documentation, and teaching
techniques. One simple but very important step: more documentation
targeting intermediate-level Rust users.</p>
<p>I think we should focus on <strong>butter-smooth (and performant!)
integration of Rust with other languages</strong>. Enabling incremental
adoption is key.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> This means projects like <a href="http://usehelix.com/">Helix</a> but also
working on <a href="https://github.com/rust-lang-nursery/rust-bindgen">bindgen</a> and improving our core FFI capabilities.</p>
<p>Caution is warranted, but I think there is room for us to pursue a
select set of advanced language features. I am thinking primarily of
<strong>const generics, procedural macros, and generic associated
types</strong>. Each of these can be a massive enabler. They also are fairly
obvious generalizations of things that the compiler currently
supports, so they don&rsquo;t come at a huge complexity cost to the
language.</p>
<p>It&rsquo;s worth emphasizing also that <strong>we are not done when it comes to
improving compiler performance</strong>. The incremental infrastructure is
working and en route to a stable compiler near you, but <strong>we need to
shoot for instantaneous build times after a small change</strong> (e.g.,
adding a <code>println!</code> to a function).</p>
<p>(To help with this, I think we should start a benchmarking group
within the compiler team (and/or the infrastructure team). This group
would be focused on establishing and analyzing important benchmarks
for both compilation time and the performance of generated code. Among
other things, this group would maintain and extend
<a href="http://perf.rust-lang.org/">the <code>perf.rust-lang.org</code> site</a>. I envision people in this group
both helping to identify bottlenecks and, when it makes sense, working
to fix them.)</p>
<p>I feel like we need to do more <strong>production user outreach</strong>. I would
really like to get to the point where we have companies other than
Mozilla paying people to work full-time on the Rust compiler and
standard library, similar to how Buoyant has done such great work for
tokio. I would also really like to be getting more regular feedback
from production users on their needs and experiences.</p>
<p>I think we should try to gather some kind of limited <strong>telemetry</strong>,
much like what <a href="http://www.jonathanturner.org/2018/01/rust2018-and-data.html">Jonathan Turner discussed</a>. I think it would
be invaluable if we had input on typical compile times that people are
experiencing or &ndash; even better &ndash; some insight into what errors they
are getting, and maybe the edits made in response to those
errors. This would obviously require opt-in and careful attention to
privacy!</p>
<p>Finally, I think there are ways we can <strong>offer a clearer path for
contributors and in turn help grow our subteams</strong>. In general, I would
like to see the subteams do a better job of defining the initiatives
that they are working on &ndash; and, for each initiative, forming a
working group dedicated to getting it done. These &ldquo;active initiatives&rdquo;
would be readily visible, offering a clear way to simultaneously find
out what&rsquo;s going on in Rust land and how you can get involved. But
really this is a bigger topic than I can summarize in a paragraph, so
I will try to revisit it in a future blog post.</p>
<h3 id="a-specific-call-out">A specific call out</h3>
<p>If you are someone who would consider using Rust in production, or
advocating for your workplace to use Rust in production, I&rsquo;d like to
know how we could help. Are there specific features or workflows you
need? Are there materials that would help you to sell Rust to your
colleagues?</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>If you&rsquo;ve had altogether too cheerful of a day, go and check out <a href="https://www.youtube.com/watch?v=CuD7SCqHB7k">Joe Duffy&rsquo;s RustConf talk on Midori</a>. That ought to sober you right up. But the takeaway here is clear: enabling incremental adoption is crucial.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Lessons from the impl period</title><link href="https://smallcultfollowing.com/babysteps/blog/2018/01/05/lessons-from-the-impl-period/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2018/01/05/lessons-from-the-impl-period/</id><published>2018-01-05T00:00:00+00:00</published><updated>2018-01-05T00:00:00+00:00</updated><content type="html"><![CDATA[<p>So, as you likely know, we tried something new at the end of 2017. For
roughly the final quarter of the year, we essentially stopped doing
design work, and instead decided to focus on implementation &ndash; what we
called the <a href="https://blog.rust-lang.org/2017/09/18/impl-future-for-rust.html">&ldquo;impl period&rdquo;</a>. We had two goals for the impl period:
(a) get a lot of high-value implementation work done and (b) to do
that by expanding the size of our community, and making it easy for
new people to get involved. To that end, we spun up <strong>about 40 working
groups</strong>, which is really a tremendous figure when you think about it,
each of which was devoted to a particular task.</p>
<p>For me personally, this was a very exciting three months. I really
enjoyed the enthusiasm and excitement that was in the air. I also
enjoyed the opportunity to work in a group of people collectively
trying to get our goals done &ndash; one thing I&rsquo;ve found working on an
open-source project is that it is often a much more &ldquo;isolated&rdquo;
experience than working in a more traditional company. The impl period
really changed that feeling.</p>
<p>I wanted to write a brief post kind of laying out my experience and
trying to dive a bit into what <strong>I</strong> felt worked well and what did
not. <strong>I&rsquo;d very much like to hear back from others who participated
(or didn&rsquo;t). I&rsquo;ve opened up a
<a href="https://internals.rust-lang.org/t/lessons-from-the-impl-period/6485">dedicated thread on internals for discussion</a>,
please leave comments there!</strong></p>
<h3 id="tldr">TL;DR</h3>
<p>If you don&rsquo;t want to read the details, here are the major points:</p>
<ul>
<li>Overall, the impl period worked great. <strong>Having structure to the
year felt liberating</strong> and I think we should do more of it.</li>
<li>We need to <strong>grow and restructure the compiler team around the idea
of mentoring and inclusion</strong>. I think having more focused working
groups will be a key part of that.</li>
<li>We have work to do on making the compiler code base accessible,
beginning with <strong>top-down documentation</strong> but also <strong>rustdoc</strong>.</li>
<li>We need to develop <strong>skills and strategies</strong> for how to split tasks
up.</li>
<li>IRC isn&rsquo;t great, but Gitter wasn&rsquo;t either. The search for a better
chat solution continues. =)</li>
</ul>
<h3 id="worked-well-establishing-focus-and-structure-to-the-year">Worked well: establishing focus and structure to the year</h3>
<p>Working on Rust often has this kind of firehose quality: so much is going on at once.
At any one time, we are:</p>
<ul>
<li>fixing bugs in existing code,</li>
<li>developing code for new features that have been designed,</li>
<li>discussing the minutiae and experience of some existing feature we may consider stabilizing,</li>
<li>designing new features and APIs via RFCs.</li>
</ul>
<p>It can get pretty exhausting to keep all that in your head at once. I
really enjoyed having a quarter to just focus on one thing &ndash;
implementing. I would like us to introduce more structure into future
years, so that we can have a time when we are just focused on design,
and so forth.</p>
<p>I also appreciated that the impl period imposed a kind of &ldquo;soft
deadline&rdquo;.  I found that helpful for defining our scope. I felt like
it ensured that difficult discussions did reach an end point.</p>
<p>That said, I don&rsquo;t think we managed this deadline especially well this
year. The final discussions were pretty frantic and it was hard &ndash; no,
impossible &ndash; to keep up with all of them (I know I certainly
couldn&rsquo;t, and I work on Rust full time). Clearly in the future we
need to manage the schedule better, and make sure that design work is
happening at a more measured pace. I think that having more structure
to the year can help with that, by ensuring that we do the design work
at the time it needs to get done.</p>
<h3 id="worked-well-newcomers-developing-key-important-features">Worked well: newcomers developing key, important features</h3>
<p>Earlier, I said that the goals of the impl period were to (a) get a lot of
high-value implementation work done and (b) to do that by expanding
the size of our community. There is a bit of a tension there: if you
have some high-value new feature, there is a tendency to think that we
should have an established developer do it. After all, they know the
codebase, and they will get it done the fastest. That is (often) true,
but it is not the complete story.</p>
<p>What we wanted to do in the impl period was to focus on bringing new
people into the project. Hopefully, many of those people will stick
around, working on new projects, and eventually becoming experienced
Rust compiler developers themselves. This increases our overall
bandwidth and grows our community, making us stronger.</p>
<p>And even when people don&rsquo;t have time to keep hacking on the Rust
compiler, there are still advantages to developing through
mentoring. The fact is that coding takes a lot of time. A single
experienced developer can only really effectively code up a single
feature at a time, but they can be mentoring many people at once.</p>
<p>Still, it must be said, there are plenty of people who just enjoy
coding and who don&rsquo;t particularly want to do mentoring. So obviously
we should ensure we always have a place for experienced devs who just
want to code.</p>
<h3 id="worked-mostly-well-smaller-working-groups">Worked mostly well: smaller working groups</h3>
<p>First and foremost, a key part of our plan was breaking up tasks into
<strong>working groups</strong>. A working group was meant to be a small set of
people focused on a common goal. The hope was that having smaller
groups would make it easier for people to get involved and would also
encourage more collaboration.</p>
<p>I felt the working groups worked best when they had relatively clear
focus and an active leader: the NLL group is a good example. It was
great to see the people in the chatrooms working together and starting
to help one another out when more experienced devs weren&rsquo;t available.</p>
<p>Other working group divisions worked less well. For example, there
were a few groups in the compiler that were not specific to particular
tasks, but rather parts of the compiler pipeline: WG-compiler-front,
WG-compiler-middle, etc. Lots of people participated in those groups,
and a lot got done, but the division into groups felt a bit more
arbitrary to me. It wasn&rsquo;t always clear where to put the tasks.</p>
<p>Going forward, I continue to think there is a role for working groups,
but I think we should try to keep them focused on <strong>goals</strong>, not on
the parts of the project that they touch.</p>
<h3 id="worked-well-clear-mentoring-instructions">Worked well: clear mentoring instructions</h3>
<p>I&rsquo;ve noticed something: if you tag a bug on Rust&rsquo;s issue tracker
as <code>E-Easy</code> and leave a comment like &ldquo;ping me on IRC&rdquo;, it can easily
sit there for years and years. But if you write some <strong>mentoring
instructions</strong> &ndash; that is, lay out the steps to take &ndash; it will be
closed, often within hours.</p>
<p>This makes total sense. You want to make sure that all the tools
people need to hack on Rust are ready and immediately available. This
way, when somebody says &ldquo;I have a few hours, let me see if I can fix a
bug in rustc&rdquo;, they can seize the moment. If you say &ldquo;ping me on IRC&rdquo;,
then it may well be that you are not available at that time. Or that
may be intimidating. In general, every roadblock gives them a chance
to get distracted.</p>
<p>Of course, ideally mentoring doesn&rsquo;t stop at mentoring instructions.
Especially for more complex projects, I often find myself scheduling
times with people so that we can have an hour or two to discuss
directly what is going on, often with screen sharing or a voice
call. That doesn&rsquo;t always work &ndash; timezones being what they are &ndash; but
when it does, it can be a big win.</p>
<h3 id="clear-problem-lack-of-leadership-bandwidth">Clear problem: lack of leadership bandwidth</h3>
<p>One problem we encountered is that there just weren&rsquo;t enough
experienced rustc developers who were willing and able to lead up
working groups. Writing mentoring instructions is hard work. Breaking
up a big task into smaller parts is hard work. This is a problem
outside of the impl period too. It&rsquo;s hard to balance all the
maintenance, bug fixing, performance monitoring, and new feature
development work that needs to get done.</p>
<p>I don&rsquo;t see a real solution here other than growing the set of people
who hack on rustc. I think this should be a top priority for us. I
think we should try to incorporate the idea of &ldquo;contributor
accessibility&rdquo; into our workflow wherever possible. In other words,
<strong>we should have clear paths for (a) how to get started hacking on
rustc and then (b) once you&rsquo;ve gotten a few PRs under your belt, how
to keep growing</strong>. The impl period focused on (a) and it&rsquo;s clear we do
pretty well there, but have room for improvement. Part (b) is harder,
and I think we need to work on it.</p>
<h3 id="clear-problem-rustc-documentation">Clear problem: rustc documentation</h3>
<p>One problem that makes writing mentoring instructions very difficult
is that the compiler is woefully underdocumented. At the start of the
impl period, many of the basic idioms and concepts (e.g., what is &ldquo;the
HIR&rdquo; or &ldquo;the MIR&rdquo;?  what is this <code>'tcx</code> I see everywhere?) were not
written up at all. It&rsquo;s somewhat better now, but not great.</p>
<p>We also lack documentation on common workflows. How do I build the
compiler?  How do I debug things and get debug logs? How do I run an
individual test? Some of this exists, but not always in an
easy-to-find place.</p>
<p>I think we really need to work on this. I&rsquo;d like to form a working
group and focus on it early this year &ndash; but more on that later. (If you&rsquo;re
interested in the idea of helping to document the compiler, though, please contact me,
or stay tuned!)</p>
<h3 id="clear-problem-some-tasks-are-hard-to-subdivide">Clear problem: some tasks are hard to subdivide</h3>
<p>One thing we also found is that some tasks are just plain hard to
subdivide. I think a good example of this was incremental compilation:
it seems like, in principle, there ought to be a lot of things that
can be done in parallel there. And we had some success with newcomers,
for example, picking off tasks relating to testing and doing other
refactorings. I think we need to work on better strategies
here. Knowing how to structure tasks for massive participation is a
skillset &ndash; not unrelated to coding, but clearly distinct from it.  I
don&rsquo;t have answers yet, but I suspect we can gain experience with this
as a community and find best practices.</p>
<p>In the case of NLL, the model that seemed to work best was to have one
more experienced developer pushing on the &ldquo;main trunk&rdquo; of development
(myself), but actively seeking places to spin out isolated tasks into
issues that could be mentored. To avoid review and bors latency from
slowing us down, we used a dedicated feature branch on my repo
(<code>nll-master</code>) and I would periodically open up pull requests
containing a variety of commits. This seemed to work out pretty well
&ndash; oh, and by the way, the job is not done. If you&rsquo;re still hoping to
get involved, we&rsquo;ve <a href="https://github.com/rust-lang/rust/milestone/43">still got plenty of work to do</a>. =) (Though
most of those issues do not yet have mentoring instructions.)</p>
<h3 id="mixed-bag-gitter-and-dedicated-chat-rooms">Mixed bag: gitter and dedicated chat rooms</h3>
<p>One key part of our experiment was moving from a small number of chat
rooms on IRC (e.g., <code>#rustc</code>) to dedicated rooms on Gitter, one per
working group. I had mixed feelings about this.</p>
<p>Let me start with the pros of Gitter itself:</p>
<ul>
<li><strong>Gitter means everybody has a persistent connection.</strong> It is great
to be able to send someone a message when they may or may not be
online, and get an answer sometime later.</li>
<li><strong>Gitter means everything can be easily linked from the web.</strong> I
love being able to make a link to some conversation with one click
and copy it into a GitHub issue.  I love being able to link to a
Gitter chat room very easily.</li>
<li><strong>Gitter means single sign on and only one name to remember.</strong> I
love that I can just use people&rsquo;s GitHub names, which makes it
easier for me to then correlate their pull requests, or check out
their fork of Rust, etc.</li>
</ul>
<p>But there are some pretty big cons, mostly having to do with Gitter
being buggy. The Android client doesn&rsquo;t deliver notifications (and
maybe others as well). The IRC bridge seems to mostly work, but
sometimes people get funny names (e.g., I think the Discord bridge has
only one user?) or we hit other arbitrary limits.</p>
<p>Similarly, I felt like having dedicated rooms had pros and cons. On
the one hand, it was really helpful to me personally. I find it hard
to keep up with <code>#rustc</code> on IRC.  I liked that I could be sure to read
every message in WG-compiler-nll, but I could just skim over groups
like WG-compiler-const that I was not directly involved in.</p>
<p>On the other hand, a bigger room offers more opportunity for &ldquo;cross
talk&rdquo;.  People have told me that they like having the chance to hear
something interesting.  And others found it was hard to follow all the
rooms they were interested in.</p>
<p>Finally, I found that I personally still wound up doing a lot of
mentoring over private messages. This is not ideal, because it doesn&rsquo;t
offer visibility to the rest of the group, and you can wind up
repeating things, but &ndash; particularly when you&rsquo;re discussing
asynchronously &ndash; it&rsquo;s often the most natural way to set things up.</p>
<p>I don&rsquo;t know what the ideal solution is here, but I do think there&rsquo;s
going to be a role for smaller chat rooms (though probably not based
on Gitter).</p>
<h3 id="conclusion">Conclusion</h3>
<p>The impl period was awesome. We got a lot of things done. And I do
mean we: the vast majority of that work was done by newcomers to the
community, many of whom had never worked on a compiler before. I loved
the overall enthusiasm that was in the air. To me, it felt like what
open source is supposed to be like.</p>
<p>Of course, though, there are things we can do better. I hope to drill
into these more in later posts (or perhaps forum discussion), but I
think the most important thing is that we need to think carefully
about how to enable mentoring and inclusion throughout our team
structure. I think we do quite well, but we can do better &ndash; and in
particular we should think more about how to help people who have
already done a few PRs take the next step.</p>
<h3 id="advertisement">Advertisement</h3>
<p>As you may have heard, we&rsquo;re trying something new this
year. <a href="https://blog.rust-lang.org/2018/01/03/new-years-rust-a-call-for-community-blogposts.html">We&rsquo;re encouraging people to write blog posts about what they think Rust ought to focus on for 2018</a>
&ndash; if you do it, you can either tweet about it with the hashtag
#Rust2018, or else e-mail <code>community@rust-lang.org</code>. I&rsquo;m pretty
excited about this; I&rsquo;ve been enjoying reading the posts that have
arrived thus far, and I plan to write a few of my own!</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Chalk meets SLG</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/10/21/chalk-meets-slg/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/10/21/chalk-meets-slg/</id><published>2017-10-21T00:00:00+00:00</published><updated>2017-10-21T00:00:00+00:00</updated><content type="html"><![CDATA[<p>For the last month or so, I&rsquo;ve gotten kind of obsessed with exploring
a new evaluation model for Chalk. Specifically, I&rsquo;ve been looking at
adapting the <a href="https://link.springer.com/chapter/10.1007/3-540-48159-1_12">SLG algorithm</a>, which is used in the
<a href="http://xsb.sourceforge.net/">XSB Prolog engine</a>. I recently
<a href="https://github.com/rust-lang-nursery/chalk/pull/59">opened a PR that adds this SLG-based solver as an alternative</a>,
and this blog post is an effort to describe how that PR works, and
explore some of the advantages and disadvantages I see in this
approach relative to
<a href="http://smallcultfollowing.com/babysteps/blog/2017/09/12/tabling-handling-cyclic-queries-in-chalk/">the current solver that I described in my previous post</a>.</p>
<h3 id="tldr">TL;DR</h3>
<p>For those who don&rsquo;t want to read all the details, let me highlight the
things that excite me most about the new solver:</p>
<ul>
<li>There is a very strong caching story based on tabling.</li>
<li>It handles negative reasoning very well, which is important for coherence.</li>
<li>It guarantees termination without relying on overflow, but rather a
notion of maximum size.</li>
<li>There is a lot of work on how to execute SLG-based designs very
efficiently (including virtual machine designs).</li>
</ul>
<p>However, I also have some concerns. For one thing, we have to figure
out how to include coinductive reasoning for auto traits and a few
other extensions. Secondly, the solver as designed always enumerates
all possible answers up to a maximum size, and I am concerned that in
practice this will be very wasteful. I suspect both of these problems
can be solved with some tweaks.</p>
<h3 id="what-is-this-slg-algorithm-anyway">What is this SLG algorithm anyway?</h3>
<p>There is a lot of excellent work exploring the SLG algorithm and
extensions to it. In this blog post I will just focus on the
particular variant that I implemented for Chalk, which was heavily
based on this paper
<a href="https://ac.els-cdn.com/0743106694000285/1-s2.0-0743106694000285-main.pdf?_tid=f8beb358-b642-11e7-b052-00000aacb35f&amp;acdnat=1508578621_12290e1834d94c48d36219f58be6e87f">&ldquo;Efficient Top-Down Computation of Queries Under the Well-Founded Semantics&rdquo; by Chen, Swift, and Warren (JLP &rsquo;95)</a>,
though with some extensions from other work (and some of my own).</p>
<p>Like a traditional Prolog solver, this new solver explores
<a href="#all-possibilities-depth-first-tuple-at-a-time"><strong>all possibilities in a depth-first, tuple-at-a-time fashion</strong></a>,
though with some extensions to
<a href="#guaranteed-termination"><strong>guarantee termination</strong></a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>. Unlike a
traditional Prolog solver, however, it natively incorporates
<a href="#tabling"><strong>tabling</strong></a> and has a strong story for
<a href="#negative-reasoning-and-the-well-founded-semantics"><strong>negative reasoning</strong></a>. In the rest of the post, I will go into
each of those bolded terms in more detail (or you can click on one of
them to jump directly to the corresponding section).</p>
<h3 id="all-possibilities-depth-first-tuple-at-a-time">All possibilities, depth-first, tuple-at-a-time</h3>
<p>One important property of the new SLG-based solver is that it, like
traditional Prolog solvers, is <strong>complete</strong>, meaning that it
will find <strong>all possible answers</strong> to any query<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. Moreover, like
Prolog solvers, it searches for those answers in a so-called
<strong>depth-first, tuple-at-a-time</strong> fashion. What this means is that,
when we have two subgoals to solve, we will fully explore the
implications of one answer through multiple subgoals before we turn to
the next answer. This stands in contrast to our current solver, which
rather breaks down goals into subgoals and processes each of them
entirely before turning to the next. As I&rsquo;ll show you now, our current
solver can sometimes fail to find solutions as a result (but, as I&rsquo;ll
also discuss, our current solver&rsquo;s approach has advantages too).</p>
<p>Let me give you an example to make it more concrete. Imagine this
program:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// sour-sweet.chalk
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Sour</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Sweet</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Vinegar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Lemon</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Sugar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Sour</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Vinegar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Sour</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Lemon</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Sweet</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Lemon</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Sweet</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Sugar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now imagine that we had a query like:</p>
<pre tabindex="0"><code>exists&lt;T&gt; { T: Sweet, T: Sour }
</code></pre><p>That is, find me some type <code>T</code> that is both sweet and
sour. If we plug this into Chalk&rsquo;s current solver, it gives back an
&ldquo;ambiguous&rdquo; result (this is running on <a href="https://github.com/rust-lang-nursery/chalk/pull/59">my PR</a>):</p>
<pre tabindex="0"><code>&gt; cargo run -- --program=sour-sweet.chalk
?- exists&lt;T&gt; { T: Sour, T: Sweet }
Ambiguous; no inference guidance
</code></pre><p>This is because of the way that our solver handles such compound
queries; specifically, the way it breaks them down into individual
queries and performs each one recursively, always looking for a
<strong>unique</strong> result. In this case, it would first ask &ldquo;is there a unique
type <code>T</code> that is <code>Sour</code>?&rdquo;  Of course, the answer is no &ndash; there are
two such types. Then it asks about <code>Sweet</code>, and gets the same
answer. This leaves it with nowhere to go, so the final result is
&ldquo;ambiguous&rdquo;.</p>
<p>The SLG solver, in contrast, tries to <strong>enumerate</strong> individual answers
and see them all the way through. If we ask it the same query, we see
that it indeed <strong>finds</strong> the unique answer <code>Lemon</code> (note the use of <code>--slg</code>
in our <code>cargo run</code> command to enable the SLG-based solver):</p>
<pre tabindex="0"><code>&gt; cargo run -- --program=sour-sweet.chalk --slg
?- exists&lt;T&gt; { T: Sour, T: Sweet }     
1 answer(s) found:
- ?0 := Lemon
</code></pre><p>This result is saying that the value for the 0th (i.e., first)
existential variable in the query (i.e., <code>T</code>) is <code>Lemon</code>.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>In general, the SLG solver proceeds much like a nested loop. To solve
a query like <code>exists&lt;T&gt; { T: Sour, T: Sweet }</code>, it does
something like this:</p>
<pre tabindex="0"><code>for T where (T: Sour) {
  if (T: Sweet) {
    report_answer(T);
  }
}
</code></pre><p>(The actual structure is a bit more complex because of the possibility of
cycles; this is where <strong>tabling</strong>, the subject of a later section,
comes in, but this will do for now.)</p>
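<p>To make the contrast concrete, here is a minimal Rust sketch of the two strategies on the sour-sweet program. It is an illustration only, not chalk's code: traits are modeled as plain lists of implementing types, and the <code>SOUR</code> and <code>SWEET</code> tables, the <code>Outcome</code> enum, and both <code>solve_*</code> functions are names invented for this example.</p>

```rust
// Hypothetical model of the sour-sweet program: each trait is just the
// list of types that implement it.
const SOUR: &[&str] = &["Vinegar", "Lemon"];
const SWEET: &[&str] = &["Lemon", "Sugar"];

#[derive(Debug, PartialEq)]
enum Outcome {
    Unique(&'static str),
    Ambiguous,
    NoSolution,
}

// Summarize a set of candidates the way a "unique result" solver would.
fn summarize(candidates: &[&'static str]) -> Outcome {
    match candidates {
        [] => Outcome::NoSolution,
        [only] => Outcome::Unique(*only),
        _ => Outcome::Ambiguous,
    }
}

// Goal-at-a-time: look at each subgoal on its own and demand a unique
// result before making progress.
fn solve_goal_at_a_time() -> Outcome {
    for goal in [SOUR, SWEET] {
        match summarize(goal) {
            Outcome::Unique(t) => {
                // One subgoal pinned down `T`; check it against both goals.
                return if SOUR.contains(&t) && SWEET.contains(&t) {
                    Outcome::Unique(t)
                } else {
                    Outcome::NoSolution
                };
            }
            Outcome::NoSolution => return Outcome::NoSolution,
            Outcome::Ambiguous => {} // no progress; try the next subgoal
        }
    }
    Outcome::Ambiguous
}

// Tuple-at-a-time: enumerate candidates for the first subgoal and see
// each one all the way through the remaining subgoals.
fn solve_tuple_at_a_time() -> Vec<&'static str> {
    let mut answers = Vec::new();
    for &t in SOUR {
        // for T where (T: Sour)
        if SWEET.contains(&t) {
            // if (T: Sweet)
            answers.push(t); // report_answer(T)
        }
    }
    answers
}
```

<p>With these tables, <code>solve_goal_at_a_time()</code> gives <code>Outcome::Ambiguous</code>, while <code>solve_tuple_at_a_time()</code> returns the single answer <code>Lemon</code>.</p>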
<p>As we have seen, a tuple-at-a-time strategy finds answers that our
current strategy, at least, does not. If we adopted this strategy
wholesale, this could have a very concrete impact on what the Rust
compiler is able to figure out. Consider these two functions, for
example (assuming that the traits and structs we declared earlier are
still in scope):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">vec</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//           ^
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//           |
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// NB: We left the element type of this vector
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// unspecified, so the compiler must infer it.
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">bar</span><span class="p">(</span><span class="n">vec</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//   ^
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// This effectively generates the two constraints
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//     ?T: Sweet
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//     ?T: Sour
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// where `?T` is the element type of our vector.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Sweet</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Sour</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, we wind up creating the very sort of constraint I was talking
about earlier. rustc today, which follows a chalk-like strategy, will
<a href="https://play.rust-lang.org/?gist=66b525d27e973a07a9a8219e8fec9e6c&amp;version=stable">fail compilation</a>,
demanding a type annotation:</p>
<pre tabindex="0"><code>error[E0282]: type annotations needed
  --&gt; src/main.rs:15:21
     |
  15 |   let vec: Vec&lt;_&gt; = vec![];
     |       ---           ^^^^^^ cannot infer type for `T`
     |       |
     |       consider giving `vec` a type
</code></pre><p>An SLG-based solver of course could find a unique answer here. (Also,
rustc could give a more precise error message here regarding <em>which</em>
type you ought to consider giving.)</p>
<p>Now, you might ask, is this a <strong>realistic</strong> example? In other words,
here there happens to be a single type that is both <code>Sour</code> and
<code>Sweet</code>, but how often does that happen in practice? Indeed, I expect
the answer is &ldquo;quite rarely&rdquo;, and thus the extra expressiveness of the
tuple-at-a-time approach is probably not that useful in practice. (In
particular, the type-checker does not want to &ldquo;guess&rdquo; types on your
behalf, so unless we can find a single, unique answer, we don&rsquo;t
typically care about the details of the result.) Still, I could
imagine that in some narrow circumstances, especially in crates like
<a href="http://diesel.rs/">Diesel</a> that use traits as a complex form of
meta-programming, this extra expressiveness may be of use. (And of
course having the trait solver fail to find answers that exist kind of
sticks in your craw a bit.)</p>
<p>There are some other potential downsides to the tuple-at-a-time
approach. For example, there may be an awfully large number of types
that implement <code>Sweet</code>, and we are going to effectively enumerate them
all while solving. In fact, there might even be an <strong>infinite</strong> set of
types! That brings me to my next point.</p>
<h3 id="guaranteed-termination">Guaranteed termination</h3>
<p>Imagine we extended our previous program with something like a type
<code>HotSauce&lt;T&gt;</code>. Naturally, if you add hot sauce to something sour, it
remains sour, so we can also include a trait impl to that effect:</p>
<pre tabindex="0"><code>struct HotSauce&lt;T&gt; { }
impl&lt;T&gt; Sour for HotSauce&lt;T&gt; where T: Sour { }
</code></pre><p>Now if we have the query <code>exists&lt;T&gt; { T: Sour }</code>, there is actually
an infinite set of answers. Of course we can have <code>T = Vinegar</code> and <code>T = Lemon</code>. And we can have <code>T = HotSauce&lt;Vinegar&gt;</code> and <code>T = HotSauce&lt;Lemon&gt;</code>. But we can also have <code>T = HotSauce&lt;HotSauce&lt;Lemon&gt;&gt;</code>.
Or, for the real hot-sauce enthusiast<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>, we might have:</p>
<pre><code>T = HotSauce&lt;HotSauce&lt;HotSauce&lt;HotSauce&lt;Lemon&gt;&gt;&gt;&gt;
</code></pre>
<p>In fact, we can wrap either <code>Lemon</code> or <code>Vinegar</code> in any number of
<code>HotSauce</code> layers.</p>
<p>This poses a challenge to the SLG solver. After all, it tries to
enumerate <strong>all</strong> answers, but in this case there are an infinite
number! The way that we handle this is basically by imposing a
<strong>maximum size</strong> on our answers. You could measure size in various ways. A common choice is to use depth,
but the total size of a type can still grow exponentially relative to
the depth, so I am instead limiting the maximum size of the tree as a whole.
So, for example,
our really long answer had a size of 5:</p>
<pre><code>T = HotSauce&lt;HotSauce&lt;HotSauce&lt;HotSauce&lt;Lemon&gt;&gt;&gt;&gt;
</code></pre>
<p>The idea then is that once an answer exceeds that size, we start to
<strong>approximate</strong> the answer by introducing variables.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> In this
case, if we imposed a maximum size of 3, we might transform that
answer into:</p>
<pre><code>exists&lt;U&gt; { T = HotSauce&lt;HotSauce&lt;U&gt;&gt; }
</code></pre>
<p>The original answer is an <em>instance</em> of this &ndash; that is, we can
substitute <code>U = HotSauce&lt;HotSauce&lt;Lemon&gt;&gt;</code> to recover it.</p>
<p>Now, when we introduce variables into answers like this, we lose some
precision. We can now only say that <code>exists&lt;U&gt; { T = HotSauce&lt;HotSauce&lt;U&gt;&gt; }</code> <strong>might</strong> be an answer; we can&rsquo;t say for
sure. It&rsquo;s a kind of &ldquo;ambiguous&rdquo; answer<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>.</p>
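<p>As a rough sketch of the truncation step, assuming size is counted as the number of nodes in the type tree: the <code>Ty</code> enum and the <code>size</code>, <code>truncate</code>, and <code>hot</code> functions below are hypothetical names for this example, not the solver's own. The sketch only handles single-argument types like <code>HotSauce&lt;T&gt;</code>, assumes a maximum size of at least 1, and always uses variable 0 for the cut, since a chain is cut in exactly one place.</p>

```rust
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Lemon,
    Vinegar,
    HotSauce(Box<Ty>),
    // A fresh existential variable introduced by truncation.
    Var(usize),
}

// Convenience constructor for `HotSauce<inner>`.
fn hot(inner: Ty) -> Ty {
    Ty::HotSauce(Box::new(inner))
}

// Size of a type, measured as the number of nodes in its tree.
fn size(ty: &Ty) -> usize {
    match ty {
        Ty::HotSauce(inner) => 1 + size(inner),
        Ty::Lemon | Ty::Vinegar | Ty::Var(_) => 1,
    }
}

// Keep nodes while more than one slot of budget remains, then spend the
// last slot on a variable that stands in for the rest of the type.
fn cut(ty: &Ty, budget: usize) -> Ty {
    match ty {
        Ty::HotSauce(inner) if budget > 1 => hot(cut(inner, budget - 1)),
        _ => Ty::Var(0),
    }
}

// Returns the (possibly approximated) answer, plus a flag saying whether
// precision was lost, i.e. whether the answer is now only "ambiguous".
fn truncate(ty: &Ty, max_size: usize) -> (Ty, bool) {
    if size(ty) <= max_size {
        (ty.clone(), false)
    } else {
        (cut(ty, max_size), true)
    }
}
```

<p>Under these assumptions, truncating <code>HotSauce&lt;HotSauce&lt;HotSauce&lt;HotSauce&lt;Lemon&gt;&gt;&gt;&gt;</code> (size 5) to a maximum size of 3 gives <code>HotSauce&lt;HotSauce&lt;U&gt;&gt;</code> with the flag set, whereas <code>HotSauce&lt;Lemon&gt;</code> is returned unchanged.</p>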
<p>So let&rsquo;s see it in action. If I invoke the SLG solver using a maximum
size of 3, I get the following:<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></p>
<pre tabindex="0"><code>&gt; cargo run -- --program=sour-sweet.chalk --slg --overflow-depth=3
7 answer(s) found:
- ?0 := Vinegar
- ?0 := Lemon
- ?0 := HotSauce&lt;Vinegar&gt;
- ?0 := HotSauce&lt;Lemon&gt;
- exists&lt;U0&gt; { ?0 := HotSauce&lt;HotSauce&lt;?0&gt;&gt; } [ambiguous]
- ?0 := HotSauce&lt;HotSauce&lt;Vinegar&gt;&gt;
- ?0 := HotSauce&lt;HotSauce&lt;Lemon&gt;&gt;
</code></pre><p>Notice that middle answer:</p>
<pre tabindex="0"><code>- exists&lt;U0&gt; { ?0 := HotSauce&lt;HotSauce&lt;?0&gt;&gt; } [ambiguous]
</code></pre><p>This is precisely the point where the abstraction mechanism kicked in,
introducing a variable. Note that the two instances of <code>?0</code> here refer
to different variables &ndash; the first one, in the &ldquo;key&rdquo;, refers to the
0th variable in our original query (what I&rsquo;ve been calling <code>T</code>). The
second <code>?0</code>, in the &ldquo;value&rdquo;, refers to the variable introduced by the
<code>exists&lt;&gt;</code> quantifier (the <code>U0</code> is the &ldquo;universe&rdquo; of that variable,
which has to do with higher-ranked things and I won&rsquo;t get into here).
Finally, you can see that we flagged this result as <code>[ambiguous]</code>,
because we had to truncate it to make it fit the maximum size.</p>
<p>Truncating answers isn&rsquo;t on its own enough to guarantee termination.
It&rsquo;s also possible to set up an ever-growing number of <strong>queries</strong>.
For example, one could write something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">HotSauce</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nc">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we try to solve (say) <code>Lemon: Foo</code>, we will then have to solve
<code>HotSauce&lt;Lemon&gt;: Foo</code>, and <code>HotSauce&lt;HotSauce&lt;Lemon&gt;&gt;: Foo</code>, and so forth ad
infinitum. We address this by the same kind of tweak. After a point,
if a query grows too large, we can just truncate it into a shorter
one<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup>. So e.g. trying to solve</p>
<pre><code>exists&lt;T&gt; HotSauce&lt;HotSauce&lt;HotSauce&lt;HotSauce&lt;T&gt;&gt;&gt;&gt;: Foo
</code></pre>
<p>with a maximum size of 3 would wind up &ldquo;redirecting&rdquo; to the query</p>
<pre><code>exists&lt;T&gt; HotSauce&lt;HotSauce&lt;HotSauce&lt;T&gt;&gt;&gt;: Foo
</code></pre>
<p>Interestingly, unlike the &ldquo;answer approximation&rdquo; we did earlier,
redirecting queries like this doesn&rsquo;t produce imprecision (at least
not on its own). The new query is a generalization of the old query,
and since we generate <strong>all</strong> answers to any given query, we will find
the original answers we were looking for (and then some more). Indeed,
if we try to perform this query with the SLG solver, it correctly
reports that there exists no answer (because this recursion will never
terminate):</p>
<pre tabindex="0"><code>&gt; cargo run -- --program=sour-sweet.chalk --slg --overflow-depth=3
?- Lemon: Foo
No answers found.
</code></pre><p>(The original solver panics with an overflow error.)</p>
<h3 id="tabling">Tabling</h3>
<p>The key idea of <strong>tabling</strong> is to keep, for each query that we are
trying to solve, a table of answers that we build up over
time. Tabling came up in my <a href="http://smallcultfollowing.com/babysteps/blog/2017/09/12/tabling-handling-cyclic-queries-in-chalk/">previous post</a>, too, where I
discussed how we used it to handle cyclic queries in the current
solver. But the integration into SLG is much deeper.</p>
<p>In SLG, we wind up keeping a table for <strong>every</strong> subgoal that we
encounter. Thus, any time that you have to solve the same subgoal
twice in the course of a query, you automatically get to take
advantage of the cached answers from the previous attempt. Moreover,
to account for cyclic dependencies, tables can be linked together, so
that as new answers are found, the suspended queries are re-awoken.</p>
<p>Tables can be in one of two states:</p>
<ul>
<li><strong>Completed:</strong> we have already found all the answers for this query.</li>
<li><strong>Incomplete:</strong> we have not yet found all the answers, but we may have found some of them.</li>
</ul>
<p>By the time the SLG processing is done, all tables will be in a
completed state, and thus they serve purely as caches. These tables
can also be remembered for use in future queries. I think integrating
this kind of caching into rustc could be a tremendous performance
enhancement.</p>
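<p>The caching side of this can be sketched in a few lines of Rust. This is a deliberately simplified picture: it ignores cycles and suspended queries entirely, so a table goes straight from incomplete to completed. The <code>Tables</code>, <code>Table</code>, and <code>TableState</code> types are hypothetical, and <code>enumerate_answers</code> is a made-up stand-in for the real solving step.</p>

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum TableState {
    // Some answers may be present, but more could still arrive.
    Incomplete,
    // Every answer has been found; the table is now a pure cache.
    Completed,
}

#[derive(Debug)]
struct Table {
    answers: Vec<String>,
    state: TableState,
}

#[derive(Default)]
struct Tables {
    by_goal: HashMap<String, Table>,
    // Counts how often we actually had to solve, to show the caching.
    solver_runs: usize,
}

// Hypothetical stand-in for real solving: a fixed answer list per goal.
fn enumerate_answers(goal: &str) -> Vec<String> {
    let names: &[&str] = match goal {
        "?0: Sour" => &["Vinegar", "Lemon"],
        "?0: Sweet" => &["Lemon", "Sugar"],
        _ => &[],
    };
    names.iter().map(|name| name.to_string()).collect()
}

impl Tables {
    // Return the answers for `goal`, solving only on the first request.
    fn answers(&mut self, goal: &str) -> &[String] {
        if !self.by_goal.contains_key(goal) {
            self.solver_runs += 1;
            let mut table = Table {
                answers: Vec::new(),
                state: TableState::Incomplete,
            };
            for answer in enumerate_answers(goal) {
                // A table never stores the same answer twice.
                if !table.answers.contains(&answer) {
                    table.answers.push(answer);
                }
            }
            table.state = TableState::Completed;
            self.by_goal.insert(goal.to_string(), table);
        }
        &self.by_goal[goal].answers
    }
}
```

<p>Asking for <code>?0: Sour</code> twice runs the stand-in solver once; the second request is served from the completed table.</p>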
<h4 id="variant--versus-subsumption-based-tabling">Variant- versus subsumption-based tabling</h4>
<p>I implemented &ldquo;variant-based tabling&rdquo; &ndash; in practical terms, this
means that whenever we have some subgoal <code>G</code> that we want to solve, we
first convert it into a canonical form. So imagine that we are in some
inference context and <code>?T</code> is a variable in that context, and we want
to solve <code>HotSauce&lt;?T&gt;: Sour</code>. We would replace that variable <code>?T</code> with <code>?0</code>,
since it is the first variable we encountered as we traversed the type,
thus giving us a canonical query like:</p>
<pre><code>HotSauce&lt;?0&gt;: Sour
</code></pre>
<p>This is then the key that we use to look up whether a table already
exists. If we do find such a table, it will have a bunch of answers; these
answers are in the form of substitutions, like</p>
<ul>
<li><code>?0 := Lemon</code></li>
<li><code>?0 := Vinegar</code></li>
</ul>
<p>and so forth. At this point, this should start looking familiar: you
may recall that earlier in the post I was showing you the output from
the chalk repl, which consisted of stuff like this:</p>
<pre tabindex="0"><code>&gt; cargo run -- --program=sour-sweet.chalk --slg
?- exists&lt;T&gt; { T: Sour, T: Sweet }     
1 answer(s) found:
- ?0 := Lemon
</code></pre><p>This printout is exactly dumping the contents of the table that we
constructed for our <code>exists&lt;T&gt; { T: Sour, T: Sweet }</code> query. That
query would be canonicalized to <code>?0: Sour, ?0: Sweet</code>, and hence we
have results in terms of this canonical variable <code>?0</code>.</p>
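<p>Canonicalization itself is easy to sketch: walk the type and number each inference variable in order of first appearance. The <code>Ty</code> representation and the <code>canonicalize</code> function below are assumptions made for this example and do not mirror chalk's actual data structures.</p>

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Ty {
    // A named type with its arguments, e.g. `HotSauce<...>` or `Lemon`.
    Named(&'static str, Vec<Ty>),
    // An inference variable from the surrounding context, e.g. `?T`.
    Infer(&'static str),
    // A canonical variable: `?0`, `?1`, ...
    Canonical(usize),
}

// Replace each inference variable with a canonical one, numbered by the
// order in which the variables are first encountered during traversal.
fn canonicalize(ty: &Ty, seen: &mut HashMap<&'static str, usize>) -> Ty {
    match ty {
        Ty::Named(name, args) => {
            let args = args.iter().map(|arg| canonicalize(arg, seen)).collect();
            Ty::Named(*name, args)
        }
        Ty::Infer(var) => {
            // A variable seen before keeps its number; a new one gets the next.
            let next = seen.len();
            Ty::Canonical(*seen.entry(*var).or_insert(next))
        }
        Ty::Canonical(index) => Ty::Canonical(*index),
    }
}
```

<p>With this, <code>HotSauce&lt;?T&gt;</code> and <code>HotSauce&lt;?U&gt;</code> both canonicalize to <code>HotSauce&lt;?0&gt;</code>, so they would share one table key.</p>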
<p>However, this form of tabling that I just described has its
limitations. For example, imagine that we have the table for
<code>exists&lt;T&gt; { T: Sour, T: Sweet }</code> all set up, but then I do a query
like <code>Lemon: Sour, Lemon: Sweet</code>. In the solver as I wrote it today,
this will create a brand new table and begin computation again.  This
is somewhat unfortunate, particularly for a setting like rustc, where
we often solve queries first in the generic form (during
type-checking) and then later, during trans, we solve them again for
specific instantiations.</p>
<p>The <a href="https://link.springer.com/chapter/10.1007/3-540-48159-1_12">paper about SLG that I pointed you at earlier</a> describes an
alternative approach called &ldquo;subsumption-based tabling&rdquo;, in which you
can reuse a table&rsquo;s results even if it is not an exact match for the
query you are doing. This extension is not <em>too</em> difficult, and we
could consider doing something similar, though we&rsquo;d have to do some
more experiments to decide if it pays off.</p>
<p>(In rustc, for example, subsumption-based tabling might not help us
that much; the queries that we perform at trans time are often not the
same as the ones we perform during type-checking. At trans time, we
are required to &ldquo;reveal&rdquo; specialized types and take advantage of other
details that type-checking does not do, so the query results are
somewhat different.)</p>
<h3 id="negative-reasoning-and-the-well-founded-semantics">Negative reasoning and the well-founded semantics</h3>
<p>One last thing that the SLG solver handles quite well is negative
reasoning. In coherence &ndash; and maybe elsewhere in Rust &ndash; we want to
be able to support <strong>negative</strong> queries, such as:</p>
<pre><code>not { exists&lt;T&gt; { Vec&lt;T&gt;: Foo } }
</code></pre>
<p>This would assert that there is <strong>no type</strong> <code>T</code> for which <code>Vec&lt;T&gt;: Foo</code> is implemented. In the SLG solver, this is handled by creating a
table for the positive query (<code>Vec&lt;?0&gt;: Foo</code>) and letting that
execute. Once it completes, we can check whether the table has any
answers or not.</p>
<p>There are some edge cases to be careful of though. If you start to
allow negative reasoning to be used more broadly, there are logical
pitfalls that start to arise. Consider the following Rust impls, in a
system where we supported negative goals:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Bar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="o">!</span><span class="n">Bar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Bar</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="o">!</span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now consider the question of whether some type <code>T</code> implements <code>Foo</code>
and <code>Bar</code>. The trouble with these two impls is that the answers to
these two queries (<code>T: Foo</code>, <code>T: Bar</code>) are no longer independent from
one another. We could say that <code>T: Foo</code> holds, but then <code>T: Bar</code> does
not (because <code>T: !Foo</code> is false). Alternatively, we could say that <code>T: Bar</code> holds, but then <code>T: Foo</code> does not (because <code>T: !Bar</code> is
false). How is the compiler to choose?</p>
<p>The SLG solver chooses not to choose. It is based on the
<strong>well-founded semantics</strong>, which ultimately assigns one of three
results to every query: true, false, or unknown. In the case of
negative cycles like the one above, the answer is &ldquo;unknown&rdquo;.</p>
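<p>To see how a three-valued result can fall out of such a program, here is a small sketch that computes the well-founded model of a propositional program using the alternating fixpoint construction. Note that this is <em>not</em> how the SLG solver works (SLG evaluates top-down); it is just a compact bottom-up way to check the expected truth values. The <code>Rule</code> struct, the <code>gamma</code> and <code>well_founded</code> functions, and the extra <code>Baz</code>, <code>Qux</code>, and <code>Quux</code> atoms are all invented for this example.</p>

```rust
use std::collections::BTreeSet;

// A ground rule: `head` holds if every atom in `pos` holds and no atom
// in `neg` holds.
struct Rule {
    head: &'static str,
    pos: Vec<&'static str>,
    neg: Vec<&'static str>,
}

#[derive(Debug, PartialEq)]
enum Truth {
    True,
    False,
    Unknown,
}

type Atoms = BTreeSet<&'static str>;

// Least set of atoms derivable when a negated atom counts as satisfied
// exactly if it is absent from `assumed`.
fn gamma(rules: &[Rule], assumed: &Atoms) -> Atoms {
    let mut derived = Atoms::new();
    loop {
        let before = derived.len();
        for rule in rules {
            let neg_ok = rule.neg.iter().all(|atom| !assumed.contains(atom));
            let pos_ok = rule.pos.iter().all(|atom| derived.contains(atom));
            if neg_ok && pos_ok {
                derived.insert(rule.head);
            }
        }
        if derived.len() == before {
            return derived;
        }
    }
}

// Alternate between an under-estimate (`surely_true`) and an
// over-estimate (`possibly_true`) until the under-estimate stops growing.
fn well_founded(rules: &[Rule], atom: &'static str) -> Truth {
    let mut surely_true = Atoms::new();
    loop {
        let possibly_true = gamma(rules, &surely_true);
        let next = gamma(rules, &possibly_true);
        if next == surely_true {
            return if surely_true.contains(&atom) {
                Truth::True
            } else if possibly_true.contains(&atom) {
                Truth::Unknown
            } else {
                Truth::False
            };
        }
        surely_true = next;
    }
}

// The two impls from the text, plus three extra atoms for contrast.
fn foo_bar_program() -> Vec<Rule> {
    vec![
        Rule { head: "T: Foo", pos: vec![], neg: vec!["T: Bar"] },
        Rule { head: "T: Bar", pos: vec![], neg: vec!["T: Foo"] },
        Rule { head: "T: Baz", pos: vec![], neg: vec!["T: Qux"] },
        Rule { head: "T: Quux", pos: vec!["T: Baz"], neg: vec![] },
    ]
}
```

<p>On this program, <code>T: Foo</code> and <code>T: Bar</code> both come out as <code>Unknown</code>. By contrast, <code>T: Qux</code> has no rules and is <code>False</code>, which makes <code>T: Baz</code> (and so <code>T: Quux</code>) <code>True</code>.</p>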
<p>(In contrast, our current solver will answer that both <code>T: Foo</code> and
<code>T: Bar</code> are false, which is clearly wrong. I imagine we could fix
this &ndash; it was an interaction we did not account for in our naive
tabling implementation.)</p>
<h3 id="extensions-and-future-work">Extensions and future work</h3>
<p>The SLG papers themselves describe a fairly basic set of logic
programs. These do not include a number of features that we need to
model Rust. My current solver already extends the SLG work to cover
first-order hereditary Harrop clauses (meaning the ability to have
queries like <code>forall&lt;T&gt; { if (T: Clone) { ... } }</code>) &ndash; this was
relatively straightforward. But I did not yet cover some of the other
things that the current solver handles:</p>
<ul>
<li>Coinductive predicates: To handle auto traits, we need to support coinductive
predicates like <code>Send</code>. I am not sure yet how to extend SLG to handle this.</li>
<li>Fallback clauses: If you normalize something like <code>&lt;Vec&lt;u32&gt; as IntoIterator&gt;::Item</code>,
the correct result is <code>u32</code>. The SLG solver gives back two answers, however: <code>u32</code>
or the unnormalized form <code>&lt;Vec&lt;u32&gt; as IntoIterator&gt;::Item</code>. This is not <em>wrong</em>,
but the current solver understands that one answer is &ldquo;better&rdquo; than the other.</li>
<li>Suggested advice: in cases of ambiguity, the current solver knows to privilege where
clauses and can give &ldquo;suggestions&rdquo; for how to unify variables based on those.</li>
</ul>
<p>I think the final two points can be done in a fairly trivial fashion,
though the full implications of fallback clauses may require some
careful thought. Coinductive predicates seem a bit harder and may require some
deeper tinkering.</p>
<h3 id="conclusions">Conclusions</h3>
<p>I&rsquo;m pretty excited about this new SLG-based solver. I think it is a
big improvement over the existing solver, though we still have to work
out the story for auto traits. The things that excited me the most:</p>
<ul>
<li>The deeply integrated use of tabling offers a very strong caching story.</li>
<li>There is a lot of existing work on efficiently executing the SLG solving algorithm.
The work I did is only the tip of the iceberg: there are existing virtual machine
designs and other things that we could adapt if we wanted to.</li>
</ul>
<p>I am also quite keen on the story around guaranteed termination. I
like that it does not involve a concept of <strong>overflow</strong> &ndash; that is, a
hard limit on the depth of the query stack &ndash; but rather simply a
<strong>maximum size imposed on types</strong>. The problem with overflow is that
it means that the results of queries wind up dependent on where they
were executed, complicating caching and other things. In other words,
a query that may well succeed can wind up failing just because it was
executed as part of something else. This does not happen with the
SLG-based solver &ndash; queries always succeed or fail in the same way.</p>
<p>However, I am also worried &ndash; most notably about the fact that the
current solver is designed to <strong>always</strong> enumerate all the answers to
a query, even when that is unhelpful. I worry that this may waste a
ton of memory in rustc processes, as we are often asked to solve silly
queries like <code>?T: Sized</code> during type-checking, which would basically
wind up enumerating nearly all types in the system up to the maximum
size. Still, I am confident that we can find ways to address this
shortcoming in time, possibly without deep changes to the algorithm.</p>
<h3 id="credit-where-credit-is-due">Credit where credit is due</h3>
<p>I also want to make sure I thank all the authors of the many papers on
SLG whose work I gleefully <del>stole</del> built upon. This is a list of the
papers that described techniques that went into the new
solver, in no particular order; I&rsquo;ve tried to be exhaustive, but if I
forgot something, I&rsquo;m sorry about that.</p>
<ul>
<li><a href="https://ac.els-cdn.com/0743106694000285/1-s2.0-0743106694000285-main.pdf?_tid=f8beb358-b642-11e7-b052-00000aacb35f&amp;acdnat=1508578621_12290e1834d94c48d36219f58be6e87f">Efficient Top-Down Computation of Queries Under the Well-Founded Semantics</a>
<ul>
<li>Chen, Swift, and Warren; JLP &lsquo;95.</li>
<li>The specific solution strategy for SLG that I used.</li>
</ul>
</li>
<li><a href="https://link.springer.com/chapter/10.1007/3-540-48159-1_12">A New Formulation of Tabled Resolution with Delay</a>
<ul>
<li>Swift; EPIA &lsquo;99</li>
<li>Describes SLG in the abstract.</li>
</ul>
</li>
<li><a href="http://www3.cs.stonybrook.edu/~tswift/webpapers/tocl-14.pdf">Terminating Evaluation of Logic Programs with Finite Three-Valued Models</a>
<ul>
<li>Riguzzi and Swift; ACM Transactions on Computational Logic 2013</li>
<li>Describes approximating subgoals.</li>
</ul>
</li>
<li><a href="https://www.researchgate.net/publication/220986525_OLD_Resolution_with_Tabulation">OLD Resolution with Tabulation</a>
<ul>
<li>Tamaki and Sato; &lsquo;86</li>
<li>Describes approximating subgoals.</li>
</ul>
</li>
<li><a href="http://www3.cs.stonybrook.edu/~tswift/webpapers/aaai-13.pdf">Radial Restraint</a>
<ul>
<li>Grosof and Swift; 2013</li>
<li>Describes approximating answers.</li>
</ul>
</li>
<li><a href="http://www.sciencedirect.com/science/article/pii/074310669500037K">Scoping constructs in logic programming: Implementation problems and their solution</a>
<ul>
<li>Nadathur, Jayaraman, Kwon; JLP &lsquo;95.</li>
<li>Describes how to integrate first-order hereditary Harrop clauses into logic programming.</li>
</ul>
</li>
</ul>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>True confessions: I have never (personally) managed to make a non-trivial Prolog program terminate. I understand it can be done. Just not by me.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Assuming termination. More on that later.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Some might say that lemons are not, in fact, sweet. Well fooey. I&rsquo;m not rewriting this blog post now, dang it.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Try <a href="https://store.davesgourmet.com/ProductDetails.asp?ProductCode=DAIN">this stuff</a>, it&rsquo;s for real.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>This technique is called <a href="http://www3.cs.stonybrook.edu/~tswift/webpapers/aaai-13.pdf">&ldquo;radial restraint&rdquo;</a> by its authors.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>In terms of the well-founded semantics that we&rsquo;ll discuss later, its truth value is considered &ldquo;unknown&rdquo;.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Actually, in the course of writing this blog post, I found I sometimes only see 5 answers, so YMMV. Some kind of bug I suppose. (Update: fixed it.)&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>This technique is called <a href="http://www3.cs.stonybrook.edu/~tswift/webpapers/tocl-14.pdf">&ldquo;subgoal abstraction&rdquo;</a> by its authors.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/><category scheme="https://smallcultfollowing.com/babysteps/categories/chalk" term="chalk" label="Chalk"/><category scheme="https://smallcultfollowing.com/babysteps/categories/pl" term="pl" label="PL"/></entry><entry><title type="html">Cyclic queries in chalk</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/09/12/tabling-handling-cyclic-queries-in-chalk/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/09/12/tabling-handling-cyclic-queries-in-chalk/</id><published>2017-09-12T00:00:00+00:00</published><updated>2017-09-12T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In my <a href="https://smallcultfollowing.com/babysteps/
/blog/2017/05/25/query-structure-in-chalk/">last post about chalk queries</a>, I discussed the query
model in chalk. Since that writing, there have been some updates, and
I thought it&rsquo;d be nice to do a new post covering the current model.
This post will also cover the tabling technique that <a href="https://github.com/scalexm/">scalexm</a>
implemented for handling cyclic relations and show how that enables us
to implement implied bounds and other long-desired features in an
elegant way. (Nice work, scalexm!)</p>
<h3 id="what-is-a-chalk-query">What is a chalk query?</h3>
<p>A <strong>query</strong> is simply a question that you can ask chalk. For example,
we could ask whether <code>Vec&lt;u32&gt;</code> implements <code>Clone</code> like so (this is a
transcript of a <code>cargo run</code> session in chalk):</p>
<pre tabindex="0"><code>?- load libstd.chalk
?- Vec&lt;u32&gt;: Clone
Unique; substitution [], lifetime constraints []
</code></pre><p>As we&rsquo;ll see in a second, the answer &ldquo;Unique&rdquo; here is basically
chalk&rsquo;s way of saying &ldquo;yes, it does&rdquo;. Sometimes chalk queries can
contain <strong>existential variables</strong>. For example, we might say
<code>exists&lt;T&gt; { Vec&lt;T&gt;: Clone }</code> &ndash; in this case, chalk actually attempts
to not only tell us <em>if</em> there exists a type <code>T</code> such that <code>Vec&lt;T&gt;: Clone</code>, it also wants to tell us what <code>T</code> must be:</p>
<pre tabindex="0"><code>?- exists&lt;T&gt; { Vec&lt;T&gt;: Clone }
Ambiguous; no inference guidance
</code></pre><p>The result &ldquo;ambiguous&rdquo; is chalk&rsquo;s way of saying &ldquo;probably it does, but
I can&rsquo;t say for sure until you tell me what <code>T</code> is&rdquo;.</p>
<p>So you can think of a chalk query as a kind of subroutine
like <code>Prove(Goal) = R</code> that evaluates some <em>goal</em> (the query) and returns
a result R which has one of the following forms:</p>
<ul>
<li><strong>Unique:</strong> indicates that the query is provable and there is a unique
value for all the existential variables.
<ul>
<li>In this case, we give back a <strong>substitution</strong> saying what each existential
variable had to be.</li>
<li>Example: <code>exists&lt;T&gt; { usize: PartialOrd&lt;T&gt; }</code> would yield unique
and return a substitution that <code>T = usize</code>, at least today (since
there is only one impl that could apply, and we haven&rsquo;t
implemented the open world modality that
<a href="http://aturon.github.io/blog/2017/04/24/negative-chalk/">aturon talked about</a> yet).</li>
</ul>
</li>
<li><strong>Ambiguous:</strong> the query <em>may</em> hold but we could not be sure. Typically,
this means that there are multiple possible values for the
existential variables.
<ul>
<li>Example: <code>exists&lt;T&gt; { Vec&lt;T&gt;: Clone }</code> would yield ambiguous,
since there are many <code>T</code> that could fit the bill.</li>
<li>In this case, we sometimes give back <strong>guidance</strong>, which are suggested
values for the existential variables. This is not important to this blog post
so I&rsquo;ll not go into the details.</li>
</ul>
</li>
<li><strong>Error:</strong> the query is provably false.</li>
</ul>
<p>(The form of these answers has changed somewhat since my previous blog
post, because we incorporated some of
<a href="http://aturon.github.io/blog/2017/04/24/negative-chalk/">aturon&rsquo;s ideas around negative reasoning</a>.)</p>
<h3 id="so-what-is-a-cycle">So what is a cycle?</h3>
<p>As I outlined long ago in my first post on
<a href="https://smallcultfollowing.com/babysteps/
/blog/2017/01/26/lowering-rust-traits-to-logic/">lowering Rust traits to logic</a>, the way that the <code>Prove(Goal)</code>
subroutine works is basically just to iterate over all the possible
ways to prove the given goal and try them one at a time. This often
requires proving subgoals: for example, when we were evaluating <code>?- Vec&lt;u32&gt;: Clone</code>, internally, this would also wind up evaluating <code>u32: Clone</code>, because the impl for <code>Vec&lt;T&gt;</code> has a where-clause that <code>T</code> must
be <code>Clone</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">where</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">T</span>: <span class="nb">Clone</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">T</span>: <span class="nb">Sized</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Sometimes, this exploration can wind up trying to solve the same goal
that you started with! The result is a <strong>cyclic query</strong> and,
naturally, it requires some special care to yield a valid answer. For
example, consider this setup:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">S</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">S</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">U</span>: <span class="nc">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now imagine that we were evaluating <code>exists&lt;T&gt; { T: Foo }</code>:</p>
<ul>
<li>Internally, we would process this by first instantiating the
existential variable <code>T</code> with an inference variable, so we wind up
with something like <code>?0: Foo</code>, where <code>?0</code> is an as-yet-unknown
inference variable.</li>
<li>Then we would consider each impl: in this case, there is only one.
<ul>
<li>For that impl to apply, <code>?0 = S&lt;?1&gt;</code> must hold, where <code>?1</code> is a
new variable. So we can perform that unification.
<ul>
<li>But next we must check that <code>?1: Foo</code> holds (that is the
where-clause on the impl). So we would convert this into &ldquo;closed&rdquo; form
by replacing all the inference variables with <code>exists</code> binders, giving us
something like <code>exists&lt;T&gt; { T: Foo }</code>. We can now perform this query.
<ul>
<li>Only wait: This is the same query we were <em>already</em> trying to
solve! This is precisely what we mean by a <strong>cycle</strong>.</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>In this case, the <em>right</em> answer for chalk to give is actually <code>Error</code>.
This is because there is no <strong>finite</strong> type that satisfies this query.
The only type you could write would be something like</p>
<pre><code>S&lt;S&lt;S&lt;S&lt;...ad infinitum...&gt;&gt;&gt;&gt;: Foo
</code></pre>
<p>where there are an infinite number of nesting levels. As Rust requires
all of its types to have finite size, this is not a legal type. And
indeed if we ask chalk this query, that is precisely what it answers:</p>
<pre tabindex="0"><code>?- exists&lt;T&gt; { S&lt;T&gt;: Foo }
No possible solution: no applicable candidates
</code></pre><p>But cycles aren&rsquo;t <em>always</em> errors of this kind. Consider a variation
on our previous example where we have a few more impls:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// chalk doesn&#39;t have built-in knowledge of any types,
</span></span></span><span class="line"><span class="cl"><span class="c1">// so we have to declare `u32` as well:
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="kt">u32</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">S</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">S</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">U</span>: <span class="nc">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now if we ask the same query, we get back an <strong>ambiguous</strong> result,
meaning that there exist many solutions:</p>
<pre tabindex="0"><code>?- exists&lt;T&gt; { T: Foo }
Ambiguous; no inference guidance
</code></pre><p>What has changed here? Well, introducing the new impl means that there
is now an infinite family of finite solutions:</p>
<ul>
<li><code>T = u32</code> would work</li>
<li><code>T = S&lt;u32&gt;</code> would work</li>
<li><code>T = S&lt;S&lt;u32&gt;&gt;</code> would work</li>
<li>and so on.</li>
</ul>
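<p>You can see the first few members of this family in ordinary Rust, too. The sketch below is an assumption-laden illustration rather than chalk input: real Rust rejects an unused type parameter, so <code>S</code> carries a <code>PhantomData</code> field, and the helper <code>check_foo</code> is a hypothetical name introduced here. Each call compiles only because the corresponding <code>T: Foo</code> goal is provable.</p>

```rust
use std::marker::PhantomData;

trait Foo {}

impl Foo for u32 {}

// Real Rust rejects an unused type parameter, so `S` carries a `PhantomData<T>`.
struct S<T>(PhantomData<T>);
impl<U> Foo for S<U> where U: Foo {}

/// Compiles only if `T: Foo` holds, and reports the type that was checked.
fn check_foo<T: Foo>() -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    println!("{}", check_foo::<u32>());
    println!("{}", check_foo::<S<u32>>());
    println!("{}", check_foo::<S<S<u32>>>());
}
```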
<p>Sometimes there can even be <em>unique</em> solutions. For example, consider
this final twist on the example, where we add a second where-clause
concerning <code>Bar</code> to the impl for <code>S&lt;T&gt;</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Bar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="kt">u32</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">S</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">S</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">U</span>: <span class="nc">Foo</span><span class="p">,</span><span class="w"> </span><span class="n">U</span>: <span class="nc">Bar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                                 ^^^^^^ this is new
</span></span></span></code></pre></div><p>Now if we ask the same query again, we get back yet a different response:</p>
<pre tabindex="0"><code>?- exists&lt;T&gt; { T: Foo }
Unique; substitution [?0 := u32], lifetime constraints []
</code></pre><p>Here, Chalk figured out that <code>T</code> must be <code>u32</code>. How can this be? Well,
if you look, it&rsquo;s the only impl that can apply &ndash; for <code>T</code> to equal
<code>S&lt;U&gt;</code>, <code>U</code> must implement <code>Bar</code>, and there are no <code>Bar</code> impls at all.</p>
<p>So we see that when we encounter a cycle during query processing, it
doesn&rsquo;t necessarily mean the query needs to result in an
error. Indeed, the overall query may result in zero, one, or many
solutions. But how should we figure out what is right? And how do
we avoid recursing infinitely while doing so? Glad you asked.</p>
<h3 id="tabling-how-chalk-is-handling-cycles-right-now">Tabling: how chalk is handling cycles right now</h3>
<p>Naturally, traditional Prolog interpreters have similar problems. It
is actually quite easy to make a Prolog program spiral off into an
infinite loop by writing what <em>seem</em> to be quite reasonable clauses
(quite like the ones we saw in the previous section). Over time,
people have evolved various techniques for handling this. One that is
relevant to us is called <strong>tabling</strong> or <strong>memoization</strong> &ndash; I found
<a href="http://www.public.asu.edu/~dietrich/publications/ExtensionTablesMemoRelations.pdf">this paper</a> to be a particularly readable
introduction. As part of his work on implied bounds,
<a href="https://github.com/scalexm/">scalexm</a> implemented a variant of this idea in chalk.</p>
<p>The basic idea is as follows. When we encounter a cycle, we will
actually wind up <strong>iterating</strong> to find the result. Initially, we
assume that a cycle means an error (i.e., no solutions). This will
cause us to go on looking for other impls that may apply <strong>without</strong>
encountering a cycle. Let&rsquo;s assume we find some solution S that
way. Then we can start over, but this time, when we encounter the
cyclic query, we can use S as the result of the cycle, and we would
then check if that gives us a new solution S'.</p>
<p>If you were doing this in Prolog, where the interpreter attempts to
provide <strong>all</strong> possible answers, then you would keep iterating, only
this time, when you encountered the cycle, you would give back two
answers: S and S'. In chalk, things are somewhat simpler: multiple
answers simply means that we give back an ambiguous result.</p>
<p>So the pseudocode for solving then looks something like this:</p>
<ul>
<li>Prove(Goal):
<ul>
<li>If goal is ON the stack already:
<ul>
<li>return stored answer from the stack</li>
</ul>
</li>
<li>Else, when goal is not on the stack:
<ul>
<li>Push goal on to the stack with an initial answer of <strong>error</strong></li>
<li>Loop
<ul>
<li>Try to solve goal yielding result R (which may generate recursive calls to Prove with the same goal)</li>
<li>Pop goal from the stack and return the result R if any of the following are true:
<ul>
<li>No cycle was encountered; or,</li>
<li>the result was the same as what we started with; or,</li>
<li>the result is ambiguous (multiple solutions).</li>
</ul>
</li>
<li>Otherwise, set the answer for Goal to be R and repeat.</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>If you&rsquo;re curious, the <a href="https://github.com/nikomatsakis/chalk/blob/7eb0f085b86986159097da1cb34dc065f2a6c8cd/src/solve/solver.rs#L122-L248">real chalk code is here</a>. It is pretty
similar to what I wrote above, except that it also handles
&ldquo;coinductive matching&rdquo; for auto traits, which I won&rsquo;t go into now. In
any case, let&rsquo;s apply this to our three examples of proving <code>exists&lt;T&gt; { T: Foo }</code>:</p>
<ul>
<li>In the first example, where we only had <code>impl&lt;U&gt; Foo for S&lt;U&gt; where U: Foo</code>, the cyclic attempt to solve will yield an error (because
the initial answer for cyclic calls is error). There is no other way
for a type to implement <code>Foo</code>, and hence the overall attempt to
solve yields an error. This is the same as what we started with, so
we just return and we don&rsquo;t have to cycle again.</li>
<li>In the second example, where we added <code>impl Foo for u32</code>, we again
encounter a cycle and return error at first, but then we see that <code>T = u32</code> is a valid solution. So our initial result R is
<code>Unique[T = u32]</code>. This is not what we started with, so we try
again.
<ul>
<li>In the second iteration, when we encounter the cycle trying to
process <code>impl&lt;U&gt; Foo for S&lt;U&gt; where U: Foo</code>, this time we will
give back the answer <code>U = u32</code>. We will then process the
where-clause and issue the query <code>u32: Foo</code>, which succeeds.  Thus
we wind up yielding a successful possibility, where <code>T = S&lt;u32&gt;</code>,
in addition to the result that <code>T = u32</code>. This means that,
overall, our second iteration winds up producing ambiguity.</li>
</ul>
</li>
<li>In the final example, where we added a where clause <code>U: Bar</code>,
the first iteration will again produce a result of <code>Unique[T = u32]</code>.
As this is not what we started with, we again try a second iteration.
<ul>
<li>In the second iteration, we will again produce <code>T = u32</code> as a result
for the cycle. This time however we go on to evaluate <code>u32: Bar</code>,
which fails, and hence overall we still only get one successful
result (<code>T = u32</code>).</li>
<li>Since we have now reached a fixed point, we stop processing.</li>
</ul>
</li>
</ul>
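<p>The pseudocode and the three walk-throughs above can be mimicked by a toy solver. This is a rough sketch under heavy assumptions, not the real chalk code: the only goals are closed queries of the form <code>exists&lt;T&gt; { T: Trait }</code>, types are plain strings, impls come in just two shapes (<code>Base</code> and <code>Wrap</code>), and an ambiguous subgoal is conservatively treated as making the whole result ambiguous. All of the names (<code>Answer</code>, <code>Impl</code>, <code>Solver</code>, <code>holds</code>, <code>solve_once</code>, <code>prove</code>) are hypothetical.</p>

```rust
// Toy model of the tabling loop; every name here is hypothetical, not chalk's API.
#[derive(Clone, Debug, PartialEq)]
enum Answer {
    Error,
    Unique(String), // the single type that `T` must be
    Ambiguous,
}

#[derive(Clone)]
enum Impl {
    /// `impl Trait for ty {}`
    Base { trait_name: &'static str, ty: &'static str },
    /// `impl<U> Trait for S<U> where U: W1, U: W2, ... {}`
    Wrap { trait_name: &'static str, where_clauses: Vec<&'static str> },
}

struct Solver {
    impls: Vec<Impl>,
    /// Entries: (goal, current answer, whether a cycle hit it in this iteration).
    stack: Vec<(String, Answer, bool)>,
}

impl Solver {
    fn new(impls: Vec<Impl>) -> Self {
        Solver { impls, stack: Vec::new() }
    }

    /// Ground check `ty: trait_name`; it recurses on a smaller type, so it terminates.
    fn holds(&self, ty: &str, trait_name: &str) -> bool {
        self.impls.iter().any(|i| match i {
            Impl::Base { trait_name: t, ty: base } => *t == trait_name && *base == ty,
            Impl::Wrap { trait_name: t, where_clauses } => {
                let inner = ty.strip_prefix("S<").and_then(|rest| rest.strip_suffix('>'));
                *t == trait_name
                    && inner.map_or(false, |u| where_clauses.iter().all(|w| self.holds(u, w)))
            }
        })
    }

    /// One pass over all impls for the goal `exists<T> { T: trait_name }`.
    fn solve_once(&mut self, trait_name: &str) -> Answer {
        let mut solutions: Vec<String> = Vec::new();
        let mut ambiguous = false;
        for imp in self.impls.clone() {
            match imp {
                Impl::Base { trait_name: t, ty } if t == trait_name => solutions.push(ty.to_string()),
                Impl::Wrap { trait_name: t, where_clauses } if t == trait_name => {
                    match where_clauses.split_first() {
                        None => ambiguous = true, // no where-clauses: any `U` works
                        // Closed-form subgoal `exists<U> { U: first }`; this may be a cycle.
                        Some((first, rest)) => match self.prove(first) {
                            Answer::Error => {}
                            Answer::Ambiguous => ambiguous = true, // conservative shortcut
                            Answer::Unique(u) => {
                                if rest.iter().all(|w| self.holds(&u, w)) {
                                    solutions.push(format!("S<{}>", u));
                                }
                            }
                        },
                    }
                }
                _ => {}
            }
        }
        match (ambiguous, solutions.len()) {
            (false, 0) => Answer::Error,
            (false, 1) => Answer::Unique(solutions.remove(0)),
            _ => Answer::Ambiguous,
        }
    }

    /// `Prove(Goal)` from the pseudocode: iterate until the answer stops changing.
    fn prove(&mut self, trait_name: &str) -> Answer {
        if let Some(entry) = self.stack.iter_mut().find(|e| e.0 == trait_name) {
            entry.2 = true; // goal is already on the stack: a cycle
            return entry.1.clone();
        }
        self.stack.push((trait_name.to_string(), Answer::Error, false));
        loop {
            let result = self.solve_once(trait_name);
            let top = self.stack.last_mut().expect("goal was pushed above");
            let saw_cycle = std::mem::replace(&mut top.2, false);
            if !saw_cycle || result == top.1 || result == Answer::Ambiguous {
                self.stack.pop();
                return result;
            }
            top.1 = result; // the provisional answer changed, so go around again
        }
    }
}

fn main() {
    let base = Impl::Base { trait_name: "Foo", ty: "u32" };
    let wrap = |w: Vec<&'static str>| Impl::Wrap { trait_name: "Foo", where_clauses: w };
    let first = vec![wrap(vec!["Foo"])];
    let second = vec![base.clone(), wrap(vec!["Foo"])];
    let third = vec![base, wrap(vec!["Foo", "Bar"])];
    for impls in [first, second, third] {
        println!("{:?}", Solver::new(impls).prove("Foo"));
    }
}
```

<p>The three impl sets in <code>main</code> correspond to the three examples, in order. Keeping the &ldquo;cycle seen&rdquo; flag on the stack entry is just one convenient way to implement the &ldquo;no cycle was encountered&rdquo; exit condition; the real solver may well track this differently.</p>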
<h3 id="why-do-we-care-about-cycles-anyway">Why do we care about cycles anyway?</h3>
<p>You may wonder why we&rsquo;re so interested in handling cycles well. After
all, how often do they arise in practice? Indeed, today&rsquo;s rustc takes
a rather more simplistic approach to cycles. However, this leads to a
number of limitations where rustc fails to prove things that it ought
to be able to prove. As we were exploring ways to overcome these
obstacles, as well as integrating ideas like implied bounds, we found
that a proper handling of cycles was crucial.</p>
<p>As a simple example, consider how to handle &ldquo;supertraits&rdquo; in Rust. In
Rust today, traits sometimes have supertraits, which are a subset of their
ordinary where-clauses that apply to <code>Self</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// PartialOrd is a &#34;supertrait&#34; of Ord. This means that
</span></span></span><span class="line"><span class="cl"><span class="c1">// I can only implement `Ord` for types that also implement
</span></span></span><span class="line"><span class="cl"><span class="c1">// `PartialOrd`.
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Ord</span>: <span class="nb">PartialOrd</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As a result, whenever I have a function that requires <code>T: Ord</code>, that
implies that <code>T: PartialOrd</code> must also hold:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Ord</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">bar</span><span class="p">(</span><span class="n">t</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK: `T: Ord` implies `T: PartialOrd`
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">PartialOrd</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">  
</span></span></span></code></pre></div><p>The way that we handle this in the Rust compiler is through a
technique called <strong>elaboration</strong>. Basically, we start out with a base
set of where-clauses (the ones you wrote explicitly), and then we grow
that set, adding in whatever supertraits should be implied. This is an
iterative process that repeats until a fixed-point is reached. So the
internal set of where-clauses that we use when checking <code>foo()</code> is not
<code>{T: Ord}</code> but <code>{T: Ord, T: PartialOrd}</code>.</p>
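<p>As a rough illustration of that fixed-point process, here is a small worklist sketch. It is not how rustc is actually written: the function <code>elaborate</code>, the string-based representation of where-clauses, and the <code>supertraits</code> table are all assumptions made for this example, and the table uses the simplified <code>trait Ord: PartialOrd</code> declaration from above.</p>

```rust
use std::collections::{BTreeSet, HashMap};

/// A where-clause `(type, trait)`, e.g. `("T", "Ord")` for `T: Ord`.
type Clause = (&'static str, &'static str);

/// Grow the explicit where-clauses with implied supertraits until nothing new appears.
fn elaborate(
    explicit: &[Clause],
    supertraits: &HashMap<&'static str, Vec<&'static str>>,
) -> BTreeSet<Clause> {
    let mut set: BTreeSet<Clause> = explicit.iter().copied().collect();
    let mut worklist: Vec<Clause> = set.iter().copied().collect();
    while let Some((ty, tr)) = worklist.pop() {
        // Look up the supertraits of `tr`; a trait with none contributes nothing.
        for &sup in supertraits.get(tr).into_iter().flatten() {
            // `insert` returns true only for clauses we have not seen yet,
            // so each clause is processed once and the loop terminates
            // as long as the table is finite.
            if set.insert((ty, sup)) {
                worklist.push((ty, sup));
            }
        }
    }
    set
}

fn main() {
    // The simplified declaration from the text: `trait Ord: PartialOrd { }`.
    let supertraits = HashMap::from([("Ord", vec!["PartialOrd"])]);
    let env = elaborate(&[("T", "Ord")], &supertraits);
    println!("{:?}", env);
}
```

<p>Because the set only ever grows and is drawn from a finite table, this loop is guaranteed to stop. That guarantee is exactly what is lost once the elaborated set can be infinite.</p>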
<p>This is a simple technique, but it has some limitations. For example,
<a href="https://github.com/rust-lang/rfcs/pull/1927">RFC 1927</a> proposed that
we should elaborate not only <em>supertraits</em> but arbitrary where-clauses
declared on traits (in general, a
<a href="https://github.com/rust-lang/rust/issues/20671">common request</a>). Going
further, we have ideas like the
<a href="https://github.com/rust-lang/rfcs/pull/2089">implied bounds RFC</a>.
There are also just known limitations around associated types and
elaboration.</p>
<p>The problem is that the elaboration technique doesn&rsquo;t really scale
gracefully to all of these proposals: often, the fully
elaborated set of where-clauses is infinite in size. (We somewhat
arbitrarily prevent cycles between supertraits to prevent this
scenario in that special case.)</p>
<p>So we tried in chalk to take a different approach. Instead of doing
this iterative elaboration step, we
<a href="https://github.com/nikomatsakis/chalk/issues/12#issuecomment-286728215">push that elaboration into the solver via special rules</a>.
The basic idea is that we have a special kind of predicate called a
<code>WF</code> (well-formed) goal. The meaning of something like <code>WF(T: Ord)</code> is
basically &ldquo;<code>T</code> is <em>capable</em> of implementing <code>Ord</code>&rdquo; &ndash; that is, <code>T</code>
satisfies the conditions that would make it legal to implement
<code>Ord</code>. (It doesn&rsquo;t mean that <code>T</code> actually <em>does</em> implement <code>Ord</code>; that
is the predicate <code>T: Ord</code>.) As we lower the <code>Ord</code> and <code>PartialOrd</code> traits
to simpler logic rules, then, we can define the <code>WF(T: Ord)</code> predicate like so:</p>
<pre tabindex="0"><code>// T is capable of implementing Ord if...
WF(T: Ord) :-
  T: PartialOrd. // ...T implements PartialOrd.
</code></pre><p>Now, <code>WF(T: Ord)</code> is really an &ldquo;if and only if&rdquo; predicate. That is,
there is only one way for <code>WF(T: Ord)</code> to be true, and that is by
implementing <code>PartialOrd</code>. Therefore, we can also define the <em>opposite</em>
direction:</p>
<pre tabindex="0"><code>// T must implement PartialOrd if...
T: PartialOrd :-
  WF(T: Ord). // ...T is capable of implementing Ord.
</code></pre><p>Now if you think this looks cyclic, you&rsquo;re right! Under ordinary
circumstances, this pair of rules doesn&rsquo;t do you much good. That is,
you can&rsquo;t prove that (say) <code>u32: PartialOrd</code> by using these rules, you
would have to use other rules for that (say, rules arising from an
impl).</p>
<p>However, sometimes these rules <em>are</em> useful. In particular, if you have
a generic function like the function <code>foo</code> we saw before:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Ord</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In this case, we would set up the environment of <code>foo()</code> to contain
exactly two predicates <code>{T: Ord, WF(T: Ord)}</code>. This is a form of
elaboration, but not the iterative elaboration we had before. We
simply introduce <code>WF</code>-clauses.  But this gives us enough to prove that
<code>T: PartialOrd</code> (because we know, by assumption, that <code>WF(T: Ord)</code>).
What&rsquo;s more, this setup scales to arbitrary where-clauses and other
kinds of implied bounds.</p>
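<p>Here is a small sketch of how the environment and the two <code>WF</code> rules interact. It is a deliberately simplified model and not chalk&rsquo;s solver: predicates are ground (no inference variables), the names <code>Pred</code>, <code>Rule</code>, <code>ord_rules</code>, and <code>provable</code> are hypothetical, and a goal that is already on the stack is simply treated as a failure, which is enough for yes/no answers like these.</p>

```rust
#[derive(Clone, Debug, PartialEq)]
enum Pred {
    Implements(&'static str, &'static str), // `T: Trait`
    Wf(&'static str, &'static str),         // `WF(T: Trait)`
}

/// A ground rule `head :- body`.
struct Rule {
    head: Pred,
    body: Pred,
}

/// The two rules from the text, instantiated for one type `ty`.
fn ord_rules(ty: &'static str) -> Vec<Rule> {
    vec![
        // WF(ty: Ord) :- ty: PartialOrd.
        Rule { head: Pred::Wf(ty, "Ord"), body: Pred::Implements(ty, "PartialOrd") },
        // ty: PartialOrd :- WF(ty: Ord).
        Rule { head: Pred::Implements(ty, "PartialOrd"), body: Pred::Wf(ty, "Ord") },
    ]
}

/// Backward chaining from `goal`, using `env` as assumptions.
/// A goal that is already on `stack` is a cycle and counts as a failure here.
fn provable(goal: &Pred, env: &[Pred], rules: &[Rule], stack: &mut Vec<Pred>) -> bool {
    if env.contains(goal) {
        return true;
    }
    if stack.contains(goal) {
        return false;
    }
    stack.push(goal.clone());
    let ok = rules
        .iter()
        .any(|r| r.head == *goal && provable(&r.body, env, rules, stack));
    stack.pop();
    ok
}

fn main() {
    // Inside `fn foo<T: Ord>()`, the environment is `{T: Ord, WF(T: Ord)}`.
    let env = [Pred::Implements("T", "Ord"), Pred::Wf("T", "Ord")];
    let goal = Pred::Implements("T", "PartialOrd");
    println!("{}", provable(&goal, &env, &ord_rules("T"), &mut Vec::new()));
    // With no assumptions, the two rules only chase each other.
    let goal = Pred::Implements("u32", "PartialOrd");
    println!("{}", provable(&goal, &[], &ord_rules("u32"), &mut Vec::new()));
}
```

<p>The first query succeeds because <code>WF(T: Ord)</code> is an assumption in the environment; the second fails because, without an impl or an assumption, the pair of rules is purely cyclic.</p>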
<h3 id="conclusion">Conclusion</h3>
<p>This post covers the tabling technique that chalk currently uses to
handle cycles, and also the key ideas of how Rust handles elaboration.</p>
<p>The current implementation in chalk is really quite naive. One
interesting question is how to make it more efficient. There is a lot
of existing work on this topic from the Prolog community, naturally,
with the work on the well-founded semantics being among the most
promising (see e.g. <a href="http://www.sciencedirect.com/science/article/pii/0743106694000285">this paper</a>). I started doing some
prototyping in this direction, but I&rsquo;ve recently become intrigued with
a different approach, where we use the techniques from <a href="http://adapton.org/">Adapton</a> (or
perhaps other incremental computation systems) to enable fine-grained
caching and speed up the more naive implementation. Hopefully this
will be the subject of the next blog post!</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/chalk" term="chalk" label="Chalk"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/></entry><entry><title type="html">Non-lexical lifetimes: draft RFC and prototype available</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/07/11/non-lexical-lifetimes-draft-rfc-and-prototype-available/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/07/11/non-lexical-lifetimes-draft-rfc-and-prototype-available/</id><published>2017-07-11T00:00:00+00:00</published><updated>2017-07-11T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve been hard at work the last month or so on trying to complete the
non-lexical lifetimes RFC. I&rsquo;m pretty excited about how it&rsquo;s shaping
up. I wanted to write a kind of &ldquo;meta&rdquo; blog post talking about the
current state of the proposal &ndash; almost there! &ndash; and how you could
get involved with helping to push it over the finish line.</p>
<h3 id="tldr">TL;DR</h3>
<p>What can I say, I&rsquo;m loquacious! In case you don&rsquo;t want to read the
full post, here are the highlights:</p>
<ul>
<li>The NLL proposal is looking good. As far as I know, the proposal
covers all major <strong>intraprocedural</strong> shortcomings of the existing
borrow checker. The appendix at the end of this post talks about the
problems that we <strong>don&rsquo;t</strong> address (yet).</li>
<li>The draft RFC is <a href="https://github.com/nikomatsakis/nll-rfc/">available in a GitHub repository</a>:
<ul>
<li>Read it over! Open issues! Open PRs!</li>
<li>In particular, if there is some pattern you think may not be
covered, please let me know about it by opening an issue.</li>
</ul>
</li>
<li>There is a <a href="https://github.com/nikomatsakis/nll/">working prototype as well</a>:
<ul>
<li>The prototype includes region inference as well as the borrow
checker.</li>
<li>I hope to expand it to become the normative prototype of how the
borrow checker works, allowing us to easily experiment with
extensions and modifications &ndash; analogous to Chalk.</li>
</ul>
</li>
</ul>
<h3 id="background-what-the-proposal-aims-to-fix">Background: what the proposal aims to fix</h3>
<p>The goal of this proposal is to fix the <strong>intra-procedural</strong>
shortcomings of the existing borrow checker. That is, to fix those
cases where, without looking at any other functions or knowing
anything about what they do, we can see that some function is safe.
The core of the proposal is the idea of defining reference lifetimes
in terms of the control-flow graph, as I discussed (over a year ago!)
in my <a href="https://smallcultfollowing.com/babysteps/
/blog/2016/04/27/non-lexical-lifetimes-introduction/">introductory blog post</a>; but that alone isn&rsquo;t enough to
address some common annoyances, so I&rsquo;ve grown the proposal somewhat.
In addition to defining how to infer and define non-lexical lifetimes
themselves, it now includes an improved definition of the Rust borrow
checker &ndash; that is, how to decide <strong>which loans are in scope</strong> at any
particular point and <strong>which actions are illegal as a result</strong>.</p>
<p>When combined with <a href="https://github.com/rust-lang/rfcs/pull/2025">RFC 2025</a>, this means that we will accept
two more classes of programs. First, what I call &ldquo;nested method calls&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">Point</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">compute</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Point</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">compute</span><span class="p">());</span><span class="w"> </span><span class="c1">// Error today! But not with RFC 2025.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Second, what I call &ldquo;reference overwrites&rdquo;. Currently, the borrow
checker forbids you from writing code that updates an <code>&amp;mut</code> variable
whose referent is borrowed. This most commonly shows up when iterating
down a slice in place (<a href="https://is.gd/FumP9w">try it on play</a>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">search</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">Data</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="kt">bool</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">((</span><span class="n">first</span><span class="p">,</span><span class="w"> </span><span class="n">tail</span><span class="p">))</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">split_first_mut</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">if</span><span class="w"> </span><span class="n">is_match</span><span class="p">(</span><span class="n">first</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="kc">true</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tail</span><span class="p">;</span><span class="w"> </span><span class="c1">// Error today! But not with the NLL proposal.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">return</span><span class="w"> </span><span class="kc">false</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem here is that the current borrow checker sees that
<code>data.split_first_mut()</code> borrows <code>*data</code> (which has type
<code>[Data]</code>). Normally, when you borrow some path, then all prefixes of
the path become immutable, and hence borrowing <code>*data</code> means that,
later on, modifying <code>data</code> in <code>data = tail</code> is illegal. This rule
makes sense for &ldquo;interior&rdquo; data like fields: if you&rsquo;ve borrowed the
field of a struct, then overwriting the struct itself will also
overwrite the field. But the rule is too strong for references and
indirection: if you overwrite an <code>&amp;mut</code>, you don&rsquo;t affect the data it
refers to. You can work around this problem by forcing a <em>move</em> of
<code>data</code> (e.g., by writing <code>{data}.split_first_mut()</code>), but you
shouldn&rsquo;t have to. (This issue has been filed for some time as
<a href="https://github.com/rust-lang/rust/issues/10520">#10520</a>, which also lists some other workarounds.)</p>
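<p>To make the two forms concrete, here is a minimal, self-contained sketch. The <code>Data</code> type and the <code>is_match</code> predicate are placeholders I made up for illustration; only <code>split_first_mut</code> comes from the standard library. The first function is the direct form that the NLL proposal is meant to accept, and the second uses the <code>{data}</code> move workaround described above.</p>

```rust
// Hypothetical element type and predicate, invented for this sketch.
pub struct Data(pub u32);

fn is_match(d: &Data) -> bool {
    d.0 == 42
}

// Direct form: `data` is overwritten while `*data` is still borrowed by `tail`.
// This relies on the borrow checker understanding that overwriting the
// reference does not touch the slice it used to point at.
pub fn search(mut data: &mut [Data]) -> bool {
    loop {
        if let Some((first, tail)) = data.split_first_mut() {
            if is_match(first) {
                return true;
            }
            data = tail;
        } else {
            return false;
        }
    }
}

// Workaround form: `{ data }` moves the reference into a temporary, so the
// borrow is taken from the temporary rather than from the variable `data`.
pub fn search_with_move(mut data: &mut [Data]) -> bool {
    loop {
        if let Some((first, tail)) = { data }.split_first_mut() {
            if is_match(first) {
                return true;
            }
            data = tail; // re-initializes `data` after the move
        } else {
            return false;
        }
    }
}
```

<p>Both functions walk the slice the same way; the only difference is whether the reference is moved out of <code>data</code> before the call.</p>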
<h3 id="draft-rfc">Draft RFC</h3>
<p>The Draft RFC is almost complete. I&rsquo;ve created
<a href="https://github.com/nikomatsakis/nll-rfc/">a GitHub repository</a> containing the text. I&rsquo;ve also opened
issues with some of the things I wanted to get done before posting it,
though the descriptions are vague and it&rsquo;s not clear that all of them
are necessary. If you&rsquo;re interested in helping out &ndash; please, read it
over! Open issues on things that you find confusing, or open PRs with
suggestions, typos, whatever. I&rsquo;d like to make this RFC into a group
effort.</p>
<h3 id="the-prototype">The prototype</h3>
<p>The other thing that I&rsquo;m pretty excited about is that I have a
<a href="https://github.com/nikomatsakis/nll/">working prototype of these ideas</a>. The prototype takes as
input individual <code>.nll</code> files, each of which contains a few struct
definitions as well as the control-flow graph of a single function.
The tests are aimed at demonstrating some particular scenario. For
example, the <a href="https://github.com/nikomatsakis/nll/blob/724156e86236052fb6c483e2359d99c47dd29dc7/test/borrowck-walk-linked-list.nll"><code>borrowck-walk-linked-list.nll</code></a> test covers the
&ldquo;reference overwrites&rdquo; that I was talking about earlier. I&rsquo;ll go over
it in some detail to give you the idea.</p>
<p>The test begins with struct declarations. These are written in a
<em>very</em> concise form because I was too lazy to make it more
user-friendly:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">List</span><span class="o">&lt;+&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">value</span>: <span class="mi">0</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">successor</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&lt;</span><span class="mi">0</span><span class="o">&gt;&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Equivalent to:
</span></span></span><span class="line"><span class="cl"><span class="c1">// struct List&lt;T&gt; {
</span></span></span><span class="line"><span class="cl"><span class="c1">//   value: T,
</span></span></span><span class="line"><span class="cl"><span class="c1">//   successor: Box&lt;List&lt;T&gt;&gt;
</span></span></span><span class="line"><span class="cl"><span class="c1">// }
</span></span></span></code></pre></div><p>As you can see, the type parameters are not named. Instead, we specify
the variance (<code>+</code> here means &ldquo;covariant&rdquo;). Within the struct body,
we reference type parameters via a number, counting backwards from the
end of the list. Since there is only one parameter (<code>T</code>, in the Rust
example), <code>0</code> refers to <code>T</code>.</p>
<p>(In real life, this struct would use <code>Option&lt;Box&lt;List&lt;T&gt;&gt;&gt;</code>, but the
prototype
<a href="https://github.com/nikomatsakis/nll/issues/8">doesn&rsquo;t model enums yet</a>,
so this is using a simplified form that is &ldquo;close enough&rdquo; from the
point-of-view of the checker itself. We also
<a href="https://github.com/nikomatsakis/nll/issues/10">don&rsquo;t model raw pointers yet</a>.
PRs welcome!)</p>
<p>After the struct definitions, there are some <code>let</code> declarations,
declaring the global variables:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">list</span>: <span class="kp">&amp;</span><span class="na">&#39;list</span> <span class="nc">mut</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">value</span>: <span class="kp">&amp;</span><span class="na">&#39;value</span> <span class="nc">mut</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>Perhaps surprisingly, the named lifetimes like <code>'list</code> and <code>'value</code>
correspond to <strong>inference variables</strong>. That is, they are not like
named lifetimes in a Rust function &ndash; which are the one major thing
I&rsquo;ve yet to implement &ndash; but rather correspond to inference
variables. Giving them names allows us to add &ldquo;assertions&rdquo; (we&rsquo;ll
see one later) that test what results got inferred. You can also use
<code>'_</code> to have the parser generate a unique name for you if you don&rsquo;t
feel like giving an explicit one.</p>
<p>After the variable declarations come the control-flow graph declarations,
as a series of basic-block declarations:</p>
<pre tabindex="0"><code>block START {
    list = use();
    goto LOOP;
}
</code></pre><p>Here, <code>list = use()</code> means &ldquo;initialize <code>list</code> and use the (empty) list
of arguments&rdquo;. I&rsquo;d like to improve this to support
<a href="https://github.com/nikomatsakis/nll/issues/60">named function prototypes</a>,
but for now the prototype just has the idea of an &lsquo;opaque use&rsquo;. Basic
blocks can optionally have successors, specified using <code>goto</code>.</p>
<p>One thing the prototype understands pretty well is borrows:</p>
<pre tabindex="0"><code>block LOOP {
    value = &amp;&#39;b1 mut (*list).value;
    list = &amp;&#39;b2 mut (*list).successor.data;
    use(value);
    goto LOOP EXIT;
}
</code></pre><p>An expression like <code>&amp;'b1 mut (*list).value</code> borrows <code>(*list).value</code>
mutably for the lifetime <code>'b1</code> &ndash; note that the lifetime of the borrow
itself is independent from the lifetime where the reference ends
up. Perhaps surprisingly, the reference can have a <em>bigger</em> lifetime
than the borrow itself: in particular, a single reference variable may
be assigned from multiple borrows in disjoint parts of the graph.</p>
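<p>In ordinary Rust, the &ldquo;one variable, several borrows&rdquo; situation looks roughly like the sketch below. The function name and parameters are made up for illustration. Each branch contains its own borrow expression, yet both results flow into the same variable <code>r</code>, whose lifetime has to cover the uses that follow the <code>if</code>.</p>

```rust
// Hypothetical example: `r` is assigned from two different borrows that live
// in disjoint branches of the control-flow graph.
pub fn bump_one(use_first: bool, a: &mut i32, b: &mut i32) -> i32 {
    let r: &mut i32;
    if use_first {
        r = &mut *a; // first borrow expression
    } else {
        r = &mut *b; // second borrow expression
    }
    // The reference is used after the branches rejoin.
    *r += 1;
    *r
}
```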
<p>Finally, the tests support two kinds of assertions. First, you can
mark a given line of code as being &ldquo;in error&rdquo; by adding a <code>//!</code>
comment. There isn&rsquo;t one in this example, but you can see them
<a href="https://github.com/nikomatsakis/nll/blob/724156e86236052fb6c483e2359d99c47dd29dc7/test/borrowck-read-struct-containing-shared-ref-whose-referent-is-borrowed.nll#L11">in other tests</a>; these identify errors that the borrow checker
would report. We can also have <strong>assertions</strong> of various kinds. These
check the output from lifetime inference. This test has a single
assertion:</p>
<pre tabindex="0"><code>assert LOOP/0 in &#39;b2;
</code></pre><p>This assertion specifies that the point <code>LOOP/0</code> (that is, the start
of the loop) is contained within the lifetime <code>'b2</code> &ndash; that is, we
realize that the reference produced by <code>(*list).successor.data</code> may
still be in use at <code>LOOP/0</code>. But note that this does not prevent us
from reassigning <code>list</code> (nor borrowing <code>(*list).successor.data</code>). This
is because the new borrow checker is smart enough to understand that
<code>list</code> has been reassigned in the meantime, and hence that the borrows
from different loop iterations do not overlap.</p>
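<p>For comparison, here is roughly what the same walk looks like as surface Rust. This is my own sketch, not part of the prototype: it uses <code>Option&lt;Box&lt;...&gt;&gt;</code> for the successor (which the prototype does not model), and the function name <code>increment_all</code> is invented. As in the <code>.nll</code> test, <code>value</code> is borrowed first, <code>list</code> is then overwritten, and <code>value</code> is used afterwards.</p>

```rust
pub struct List<T> {
    pub value: T,
    pub successor: Option<Box<List<T>>>,
}

// Hypothetical helper: adds one to every element while walking the list
// through a single `&mut` variable that is overwritten on each iteration.
pub fn increment_all(mut list: &mut List<i32>) {
    loop {
        // Corresponds to `value = &'b1 mut (*list).value`.
        let value = &mut list.value;
        match list.successor {
            // Corresponds to `list = &'b2 mut (*list).successor.data`.
            Some(ref mut next) => list = &mut **next,
            None => {
                *value += 1;
                return;
            }
        }
        // Corresponds to `use(value)`: `value` is still usable even though
        // `list` now points at the next node.
        *value += 1;
    }
}
```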
<h3 id="conclusion-and-how-you-can-help">Conclusion and how you can help</h3>
<p>I think the NLL proposal itself is close to being ready to submit &ndash; I
want to add a section on named lifetimes first, and add them to the
prototype &ndash; but there is still lots of interesting work to be
done. Naturally, reading and improving the RFC would be
useful. However, I&rsquo;d also like to improve the prototype. I would like
to see it evolve into a more complete &ndash; but simplified &ndash; model of
the borrow checker, that could serve as a good basis for analyzing the
Rust type system and investigating extensions. Ideally, we would merge
it with chalk, as the two complement one another: <strong>put together, they
form a fairly complete model of the Rust type system</strong> (the missing
piece is the initial round of type checking and coercion, which I
would eventually like to model in chalk anyhow). If this vision
interests you, please reach out! I have open issues on both projects,
though I&rsquo;ve not had time to write in tons of details &ndash; leave a
comment if something sparks your interest, and I&rsquo;d be happy to give
more details and mentor it to completion as well.</p>
<h3 id="questions-or-comments">Questions or comments?</h3>
<p><a href="https://internals.rust-lang.org/t/non-lexical-lifetimes-draft-rfc-prototype/5527">Take it to internals!</a></p>
<h4 id="appendix-what-the-proposal-wont-fix">Appendix: What the proposal won&rsquo;t fix</h4>
<p>I also want to mention a few kinds of borrow check errors that the
current RFC will <strong>not</strong> eliminate &ndash; and is not intended to. These
are generally errors that cross procedural boundaries in some form or
another.  For each case, I&rsquo;ll give a short example, and give some
pointers to the current thinking in how we might address it.</p>
<p><strong>Closure desugaring.</strong> The first kind of error has to do with the
closure desugaring. Right now, closures always capture local
variables, even if the closure only uses some sub-path of the variable
internally:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">get_len</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">vec</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w"> </span><span class="c1">// borrows `self`, not `self.vec`
</span></span></span><span class="line"><span class="cl"><span class="bp">self</span><span class="p">.</span><span class="n">vec2</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w"> </span><span class="c1">// error: self is borrowed
</span></span></span></code></pre></div><p>This was discussed on <a href="https://internals.rust-lang.org/t/borrow-the-full-stable-name-in-closures-for-ergonomics/5387">an internals thread</a>; as I
<a href="https://internals.rust-lang.org/t/borrow-the-full-stable-name-in-closures-for-ergonomics/5387/11?u=nikomatsakis">commented there</a>, I&rsquo;d like to fix this by making the closure
desugaring smarter, and I&rsquo;d love to mentor someone through such an
RFC! However, it is out of scope for this one, since it does not
concern the borrow check itself, but rather the details of the closure
transformation.</p>
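<p>In the meantime, one common way to sidestep the problem is to borrow the field into a local variable before creating the closure, so that the closure captures that local rather than <code>self</code>. The struct and method names in this sketch are invented for illustration.</p>

```rust
// Hypothetical struct with two independent vectors.
pub struct Buffers {
    pub vec: Vec<i32>,
    pub vec2: Vec<usize>,
}

impl Buffers {
    pub fn record_len(&mut self) {
        // Borrow only the field we need; the closure captures `vec`,
        // not `self`.
        let vec = &self.vec;
        let get_len = || vec.len();
        // `self.vec2` is a different field, so this mutable use is fine.
        self.vec2.push(get_len());
    }
}
```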
<p><strong>Disjoint fields across functions.</strong> Another kind of error is when
you have one method that only uses a field <code>a</code> and another that only
uses some field <code>b</code>; right now, you can&rsquo;t express that, and hence
these two methods cannot be used &ldquo;in parallel&rdquo; with one another:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">get_a</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">A</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">inc_b</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">value</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">bar</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">get_a</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">inc_b</span><span class="p">();</span><span class="w"> </span><span class="c1">// Error: self is already borrowed
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">a</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The fix for this is to refactor so as to expose the fact that the methods
operate on disjoint data. For example, one can factor out the methods into
methods on the fields themselves:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">a</span><span class="p">.</span><span class="n">get</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="bp">self</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">inc</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">use</span><span class="p">(</span><span class="n">a</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This way, when looking at <code>bar()</code> alone, we see borrows of <code>self.a</code>
and <code>self.b</code>, rather than two borrows of <code>self</code>. Another technique is
to introduce &ldquo;free functions&rdquo; (e.g., <code>get(&amp;self.a)</code> and <code>inc(&amp;mut self.b)</code>) that expose more clearly which fields are operated upon, or
to inline the method bodies. I&rsquo;d like to fix this, but there are a lot
of considerations at play: see
<a href="https://internals.rust-lang.org/t/partially-borrowed-moved-struct-types/5392/2">this comment on an internals thread</a> for my current thoughts. (A
similar problem sometimes arises around <code>Box&lt;T&gt;</code> and other smart
pointer types; the desugaring leads to rustc being more conservative
than you might expect.)</p>
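<p>Here is a small sketch of the &ldquo;free functions&rdquo; variant. The types <code>A</code>, <code>B</code>, and <code>Foo</code> and their fields are placeholders of my own; the point is only that the caller names <code>self.a</code> and <code>self.b</code> explicitly, so the two borrows are visibly disjoint.</p>

```rust
// Hypothetical types, invented for this sketch.
pub struct A { pub name: String }
pub struct B { pub value: u32 }
pub struct Foo { pub a: A, pub b: B }

// Free functions that take exactly the field they operate on.
fn get(a: &A) -> &str {
    &a.name
}

fn inc(b: &mut B) {
    b.value += 1;
}

impl Foo {
    pub fn bar(&mut self) -> usize {
        let a = get(&self.a); // shared borrow of `self.a` only
        inc(&mut self.b);     // mutable borrow of `self.b` only
        a.len()               // `a` is still usable here
    }
}
```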
<p><strong>Self-referential structs.</strong> The final limitation we are not fixing
yet is the inability to have &ldquo;self-referential structs&rdquo;. That is, you
cannot have a struct that stores, within itself, an arena and pointers
into that arena, and then move that struct around. This comes up in a
number of settings.  There are various workarounds: sometimes you can
use a vector with indices, for example, or
<a href="https://crates.io/crates/owning_ref">the <code>owning_ref</code> crate</a>. The
latter, when combined with <a href="https://github.com/rust-lang/rfcs/pull/1598">associated type constructors</a>, might
be an adequate solution for some use cases, actually (it&rsquo;s basically
a way of modeling &ldquo;existential lifetimes&rdquo; in library code). For the
case of futures especially, <a href="https://github.com/rust-lang/rfcs/pull/1858">the <code>?Move</code> RFC</a> proposes another
lightweight and interesting approach.</p>
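<p>As a concrete picture of the &ldquo;vector with indices&rdquo; workaround, here is a minimal sketch. The <code>Words</code> type and its methods are hypothetical. Instead of storing references into the arena, the struct stores positions, so it can be moved freely and the &ldquo;pointers&rdquo; stay valid.</p>

```rust
// Hypothetical type: owns an arena of strings plus indices into that arena.
pub struct Words {
    arena: Vec<String>,
    long_words: Vec<usize>, // positions in `arena`, not references
}

impl Words {
    pub fn new(text: &str) -> Words {
        let arena: Vec<String> = text
            .split(',')
            .map(|s| s.trim().to_string())
            .collect();
        // Remember which entries have length 5 or more.
        let long_words = arena
            .iter()
            .enumerate()
            .filter(|(_, w)| w.len() >= 5)
            .map(|(i, _)| i)
            .collect();
        Words { arena, long_words }
    }

    // Resolve the stored indices back into string slices on demand.
    pub fn long_words(&self) -> impl Iterator<Item = &str> + '_ {
        self.long_words.iter().map(move |&i| self.arena[i].as_str())
    }
}
```

<p>The cost of this design is that every access goes through an index lookup, and nothing stops an index from going stale if the arena is later modified.</p>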
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Query structure in chalk</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/05/25/query-structure-in-chalk/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/05/25/query-structure-in-chalk/</id><published>2017-05-25T00:00:00+00:00</published><updated>2017-05-25T00:00:00+00:00</updated><content type="html"><![CDATA[<p>For my next post discussing <a href="https://github.com/nikomatsakis/chalk/">chalk</a>, I want to take kind of a
different turn. I want to talk about the general structure of <strong>chalk
queries</strong> and how chalk handles them right now. (If you&rsquo;ve never heard
of chalk, it&rsquo;s a sort of &ldquo;reference implementation&rdquo; for Rust&rsquo;s trait
system, as well as an attempt to describe Rust&rsquo;s trait system in terms
of its logical underpinnings; see
<a href="https://smallcultfollowing.com/babysteps/blog/2017/01/26/lowering-rust-traits-to-logic/">this post for an introduction to the big idea</a>.)</p>
<h3 id="the-traditional-interactive-prolog-query">The traditional, interactive Prolog query</h3>
<p>In a traditional Prolog system, when you start a query, the solver
will run off and start supplying you with every possible answer it can
find. So if I put something like this (I&rsquo;m going to start adopting a
more Rust-like syntax for queries, versus the Prolog-like syntax I
have been using):</p>
<pre><code>?- Vec&lt;i32&gt;: AsRef&lt;?U&gt;
</code></pre>
<p>The solver might answer:</p>
<pre><code>Vec&lt;i32&gt;: AsRef&lt;[i32]&gt;
    continue? (y/n)
</code></pre>
<p>This <code>continue</code> bit is interesting. The idea in Prolog is that the
solver is finding <strong>all possible</strong> instantiations of your query
that are true. In this case, if we instantiate <code>?U = [i32]</code>, then the
query is true (note that the solver did not, directly, tell us a value
for <code>?U</code>, but we can infer one by unifying the response with our
original query). If we were to hit <code>y</code>, the solver might then give us
another possible answer:</p>
<pre><code>Vec&lt;i32&gt;: AsRef&lt;Vec&lt;i32&gt;&gt;
    continue? (y/n)
</code></pre>
<p>This answer derives from the fact that there is a reflexive impl
(<code>impl&lt;T&gt; AsRef&lt;T&gt; for T</code>) for <code>AsRef</code>. If we were to hit <code>y</code> again,
then we might get back a negative response:</p>
<pre><code>no
</code></pre>
<p>Naturally, in some cases, there may be no possible answers, and hence
the solver will just give me back <code>no</code> right away:</p>
<pre><code>?- Box&lt;i32&gt;: Copy
    no
</code></pre>
<p>In some cases, there might be an infinite number of responses. So for
example if I gave this query, and I kept hitting <code>y</code>, then the solver
would never stop giving me back answers:</p>
<pre><code>?- Vec&lt;?U&gt;: Clone
   Vec&lt;i32&gt;: Clone
     continue? (y/n)
   Vec&lt;Box&lt;i32&gt;&gt;: Clone
     continue? (y/n)
   Vec&lt;Box&lt;Box&lt;i32&gt;&gt;&gt;: Clone
     continue? (y/n)
   Vec&lt;Box&lt;Box&lt;Box&lt;i32&gt;&gt;&gt;&gt;: Clone
     continue? (y/n)
</code></pre>
<p>As you can imagine, the solver will gleefully keep adding another
layer of <code>Box</code> until we ask it to stop, or it runs out of memory.</p>
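<p>The &ldquo;keeps answering until you stop asking&rdquo; behavior can be mimicked with a lazy iterator. To be clear, this is only an analogy of my own, not how Prolog or chalk is implemented: the function <code>clone_answers</code> is invented, and it simply generates the strings from the transcript above one at a time.</p>

```rust
// Hypothetical generator for the answers to `Vec<?U>: Clone` shown above.
// The sequence is infinite; the caller decides when to stop pulling answers.
pub fn clone_answers() -> impl Iterator<Item = String> {
    std::iter::successors(Some(String::from("i32")), |ty| {
        // Each new answer wraps the previous type in one more `Box`.
        Some(format!("Box<{}>", ty))
    })
    .map(|ty| format!("Vec<{}>: Clone", ty))
}
```

<p>Calling <code>clone_answers().take(3)</code> plays the role of hitting <code>y</code> twice and then <code>n</code>.</p>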
<p>Another interesting thing is that queries might still have variables
in them. For example:</p>
<pre><code>?- Rc&lt;?T&gt;: Clone
</code></pre>
<p>might produce the answer:</p>
<pre><code>Rc&lt;?T&gt;: Clone
    continue? (y/n)
</code></pre>
<p>After all, <code>Rc&lt;?T&gt;: Clone</code> is true <strong>no matter what type <code>?T</code> is</strong>.</p>
<h3 id="do-try-this-at-home-chalk-has-a-repl">Do try this at home: chalk has a REPL</h3>
<p>I should just note that
<a href="https://github.com/nikomatsakis/chalk/pull/30/">aturon recently added a REPL to chalk</a>,
which means that &ndash; if you want &ndash; you can experiment with some of the
examples from this blog post. It&rsquo;s not really a &ldquo;polished tool&rdquo;, but
it&rsquo;s kind of fun. I&rsquo;ll give my examples using the REPL.</p>
<h3 id="how-chalk-responds-to-a-query">How chalk responds to a query</h3>
<p>chalk responds to queries somewhat differently. Instead of trying to
enumerate <strong>all possible</strong> answers for you, it is looking for an
<strong>unambiguous</strong> answer. In particular, when it tells you the value for
a type variable, that means that this is the <strong>only possible
instantiation</strong> that you could use, given the current set of impls and
where-clauses, that would be provable.</p>
<p>Overall, chalk&rsquo;s answers have three parts:</p>
<ul>
<li><strong>Status:</strong> Yes, No, or Maybe</li>
<li><strong>Refined goal:</strong> a version of your original query with some substitutions
applied</li>
<li><strong>Lifetime constraints:</strong> these are relations that must hold between
the lifetimes that you supplied as inputs. I&rsquo;ll come to this in a
bit.</li>
</ul>
<p><em>Future compatibility note:</em> It&rsquo;s worth pointing out that I expect
some of the particulars of a &ldquo;query response&rdquo; to change, particularly as
aturon continues <a href="http://aturon.github.io/blog/2017/04/24/negative-chalk/">the work on negative reasoning</a>. I&rsquo;m
presenting the current setup here, for the most part, but I also
describe some of the changes that are in flight (and expected to land
quite soon).</p>
<p>Let&rsquo;s look at these three parts in turn.</p>
<h3 id="the-status-and-refined-goal-of-a-query-response">The <strong>status</strong> and <strong>refined goal</strong> of a query response</h3>
<p>The &ldquo;status&rdquo; tells you how sure chalk is of its answer, and it can be
<strong>yes</strong>, <strong>maybe</strong>, or <strong>no</strong>.</p>
<p>A <strong>yes</strong> response means that your query is <strong>uniquely provable</strong>, and
in that case the refined goal that we&rsquo;ve given back represents the
only possible instantiation. In the examples we&rsquo;ve seen so far, there
was one case where chalk would have responded with yes:</p>
<pre tabindex="0"><code>&gt; cargo run
?- load libstd.chalk
?- exists&lt;T&gt; { Rc&lt;T&gt;: Clone }
Solution {
    successful: Yes,
    refined_goal: Query {
        value: Constrained {
            value: [
                Rc&lt;?0&gt;: Clone
            ],
            constraints: []
        },
        binders: [
            U0
        ]
    }
}
</code></pre><p>(Since this is the first example using the REPL, a bit of explanation
is in order. First, <code>cargo run</code> executes the REPL, naturally. The first
command, <code>load libstd.chalk</code>, loads up some standard type/impl
definitions.  The next command, <code>exists&lt;T&gt; { Rc&lt;T&gt;: Clone }</code> is the
actual <em>query</em>.  In the section of Prolog examples, I used the Prolog
convention, which is to implicitly add the &ldquo;existential quantifiers&rdquo;
based on syntax. chalk is more explicit: writing <code>exists&lt;T&gt; { ... }</code>
here is saying &ldquo;is there a <code>T</code> such that <code>...</code> is true?&rdquo;. In future
examples, I&rsquo;ll skip over the first two lines.)</p>
<p>You can see that the response here (which is just the <code>Debug</code> impl for
chalk&rsquo;s internal data structures) included not only <code>Yes</code>, but also a
&ldquo;refined-goal&rdquo;. I don&rsquo;t want to go into all the details of how the
refined goal is represented just now, but if you skip down to the
<code>value</code> field you will pick out the string <code>Rc&lt;?0&gt;: Clone</code> &ndash; here the
<code>?0</code> indicates an existential variable. This is saying that the
&ldquo;refined&rdquo; goal is the same as the query, meaning that <code>Rc&lt;T&gt;: Clone</code>
is true no matter what <code>T</code> is. (We saw the same thing in the
Prolog case.)</p>
<p>So what about some of the more ambiguous cases? For example, what
happens if we ask <code>exists&lt;T&gt; { Vec&lt;T&gt;: Clone }</code>? This case is
trickier, because for <code>Vec&lt;T&gt;</code> to be <code>Clone</code>, <code>T</code> must be <code>Clone</code>, so it
matters what <code>T</code> is:</p>
<pre tabindex="0"><code>?- exists&lt;T&gt; { Vec&lt;T&gt;: Clone }
Solution {
    successful: Maybe,
    ... // elided for brevity
}
</code></pre><p>Here we get back <strong>maybe</strong>. This is chalk&rsquo;s way of saying that the
query is provable for some instantiations of <code>?T</code>, but we need more type
information to find a <em>unique</em> answer. The idea is that we will
continue type-checking or processing in the meantime, which may yield
results that further constrain <code>?T</code>; e.g., maybe we find a call to
<code>vec.push(22)</code>, indicating that the type of the values within is
<code>i32</code>. Once that happens, we can repeat the query, but this time with
a more specific value for <code>?T</code>, so something like <code>Vec&lt;i32&gt;: Clone</code>:</p>
<pre tabindex="0"><code>?- Vec&lt;i32&gt;: Clone
Solution {
    successful: Yes,
    ...
}
</code></pre><p>Finally, sometimes chalk can decisively prove that something is not
provable. This would occur if there is just no impl that could
possibly apply (but see <a href="http://aturon.github.io/blog/2017/04/24/negative-chalk/">aturon&rsquo;s post</a>, which covers how we
plan to extend chalk to be able to reason beyond a single crate):</p>
<pre tabindex="0"><code>?- Box&lt;i32&gt;: Copy
`Copy` is not implemented for `Box&lt;i32&gt;` in environment `Env(U0, [])`
</code></pre><h3 id="refined-goal-in-action">Refined goal in action</h3>
<p>The refined goal so far hasn&rsquo;t been very important; but it&rsquo;s generally
a way for the solver to communicate back a kind of <strong>substitution</strong> &ndash;
that is, to communicate back what values the type variables have to
have in order for the query to be provable. Consider this query:</p>
<pre tabindex="0"><code>?- exists&lt;U&gt; { Vec&lt;i32&gt;: AsRef&lt;Vec&lt;U&gt;&gt; }
</code></pre><p>Now, in general, a <code>Vec&lt;i32&gt;</code> implements <code>AsRef</code> twice:</p>
<ul>
<li><code>Vec&lt;i32&gt;: AsRef&lt;Slice&lt;i32&gt;&gt;</code> (chalk doesn&rsquo;t understand the syntax <code>[i32]</code>, so I made a type <code>Slice</code> for it)</li>
<li><code>Vec&lt;i32&gt;: AsRef&lt;Vec&lt;i32&gt;&gt;</code></li>
</ul>
<p>But here, we know we are looking for <code>AsRef&lt;Vec&lt;U&gt;&gt;</code>. This implies
then that <code>U</code> must be <code>i32</code>. And indeed, if we give this query, chalk
tells us so, using the refined goal:</p>
<pre tabindex="0"><code>?- exists&lt;U&gt; { Vec&lt;i32&gt;: AsRef&lt;Vec&lt;U&gt;&gt; }
Solution {
    successful: Yes,
    refined_goal: Query {
        value: Constrained {
            value: [
                Vec&lt;i32&gt;: AsRef&lt;Vec&lt;i32&gt;&gt;
            ],
            constraints: []
        },
        binders: []
    }
}
</code></pre><p>Here you can see that there are no variables. Instead, we see
<code>Vec&lt;i32&gt;: AsRef&lt;Vec&lt;i32&gt;&gt;</code>. If we unify this with our original query
(skipping past the <code>exists</code> part), we can deduce that <code>U = i32</code>.</p>
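<p>To make that last step concrete, here is a minimal sketch of the unification involved. It is a toy model, not chalk&rsquo;s actual code: the <code>Ty</code> enum, the <code>unify</code> function, and the <code>app</code>/<code>var</code> helpers are all hypothetical names invented for this illustration, and the matching is one-directional (variables appear only in the original query).</p>

```rust
use std::collections::HashMap;

/// A toy type term (hypothetical, not chalk's representation):
/// a variable such as `U`, or an applied type such as `Vec<i32>`.
#[derive(Clone, Debug, PartialEq)]
pub enum Ty {
    Var(String),
    App(String, Vec<Ty>),
}

/// Unify `pattern` (the original query, which may contain variables) with
/// `concrete` (the refined goal), recording what each variable must be.
/// Returns false if the two terms cannot be made equal.
pub fn unify(pattern: &Ty, concrete: &Ty, subst: &mut HashMap<String, Ty>) -> bool {
    match (pattern, concrete) {
        // A variable either gets bound now or must agree with its earlier binding.
        (Ty::Var(name), _) => match subst.get(name) {
            Some(bound) => bound == concrete,
            None => {
                subst.insert(name.clone(), concrete.clone());
                true
            }
        },
        // Applied types unify if the names match and the arguments unify pairwise.
        (Ty::App(n1, args1), Ty::App(n2, args2)) => {
            n1 == n2
                && args1.len() == args2.len()
                && args1.iter().zip(args2).all(|(a, b)| unify(a, b, subst))
        }
        // This sketch does not allow variables on the refined side.
        (Ty::App(..), Ty::Var(_)) => false,
    }
}

/// Helper constructors, only here to keep the example short.
pub fn app(name: &str, args: Vec<Ty>) -> Ty {
    Ty::App(name.to_string(), args)
}

pub fn var(name: &str) -> Ty {
    Ty::Var(name.to_string())
}
```

<p>Unifying <code>AsRef&lt;Vec&lt;U&gt;&gt;</code> with <code>AsRef&lt;Vec&lt;i32&gt;&gt;</code> in this model leaves a single entry in the map, binding <code>U</code> to <code>i32</code>.</p>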
<p>You might imagine that the refined goal can only be used when the
response is <strong>yes</strong> &ndash; but, in fact, this is not so. There are times
when we can&rsquo;t say for sure if a query is provable, but we can still
say something about what the variables must be for it to be provable.
Consider this example:</p>
<pre tabindex="0"><code>?- exists&lt;U, V&gt; { Vec&lt;Vec&lt;U&gt;&gt;: AsRef&lt;Vec&lt;V&gt;&gt; }
Solution {
    successful: Maybe,
    refined_goal: Query {
        value: Constrained {
            value: [
                Vec&lt;Vec&lt;?0&gt;&gt;: AsRef&lt;Vec&lt;Vec&lt;?0&gt;&gt;&gt;
            ],
            constraints: []
        },
        binders: [
            U0
        ]
    }
}
</code></pre><p>Here, we were asking if <code>Vec&lt;Vec&lt;U&gt;&gt;</code> implements <code>AsRef&lt;Vec&lt;V&gt;&gt;</code>. We
got back a <strong>maybe</strong> response. This is because the <code>AsRef</code> impl
requires us to know that <code>U: Sized</code>, and naturally there are many
sized types that <code>U</code> could be, so we need to wait until we get more
information to give back a definitive response.</p>
<p>However, leaving aside concerns about <code>U: Sized</code>, we can see that
<code>Vec&lt;Vec&lt;U&gt;&gt;</code> must equal <code>Vec&lt;V&gt;</code>, which implies that, for this query
to be provable, <code>Vec&lt;U&gt; = V</code> must hold. And the refined goal reflects
as much:</p>
<pre tabindex="0"><code>Vec&lt;Vec&lt;?0&gt;&gt;: AsRef&lt;Vec&lt;Vec&lt;?0&gt;&gt;&gt;
</code></pre><h3 id="open-vs-closed-queries">Open vs closed queries</h3>
<p>Queries in chalk are always &ldquo;closed&rdquo; formulas, meaning that all the
variables that they reference are bound by either an <code>exists&lt;T&gt;</code> or a
<code>forall&lt;T&gt;</code> binder. This is in contrast to how the compiler works, or
a typical Prolog implementation, where a trait query occurs in the
context of an ongoing set of processing. In terms of the current rustc
implementation, the difference is that, in rustc, when you wish to do
some trait selection, you invoke the trait solver with an inference
context in hand.  This defines the context for any inference variables
that appear in the query.</p>
<p>In chalk, in contrast, the query starts with a &ldquo;clean slate&rdquo;. The only
context that it needs is the global context of the entire program &ndash;
i.e., the set of impls and so forth (and you can consider those part
of the query, if you like).</p>
<p>To see the difference, consider this chalk query that we looked at earlier:</p>
<pre tabindex="0"><code>?- exists&lt;U&gt; { Vec&lt;i32&gt;: AsRef&lt;Vec&lt;U&gt;&gt; }
</code></pre><p>In rustc, such a query would look more like <code>Vec&lt;i32&gt;: AsRef&lt;Vec&lt;?22&gt;&gt;</code>, where we have simply used an existing inference
variable (<code>?22</code>). Moreover, the current implementation simply gives
back the yes/maybe/no part of the response, and does not have a notion
of a refined goal. This is because, since we have access to the raw
inference variable, we can just unify <code>?22</code> (e.g., with <code>i32</code>) as a
side-effect of processing the query.</p>
<p>The new idea then is that when some part of the compiler needs to
prove a goal like <code>Vec&lt;i32&gt;: AsRef&lt;Vec&lt;?22&gt;&gt;</code>, it will first create a
<strong>canonical</strong> query from that goal
(<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/infer/query.rs#L10-L39">chalk code is in <code>query.rs</code></a>). This is done by replacing
all the random inference variables (like <code>?22</code>) with existentials. So
you would get <code>exists&lt;T&gt; Vec&lt;i32&gt;: AsRef&lt;Vec&lt;T&gt;&gt;</code> as the output. One
key point is that this query is independent of the precise inference
variables involved: so if we have to solve this same query later, but
with different inference variables (e.g., <code>Vec&lt;i32&gt;: AsRef&lt;Vec&lt;?44&gt;&gt;</code>), when we make the canonical form of that query, we&rsquo;d
get the same result.</p>
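<p>The canonicalization step can be sketched roughly as follows. This is an illustration under simplifying assumptions rather than the code in <code>query.rs</code>: the <code>Ty</code> enum and <code>canonicalize</code> function are hypothetical, and bound variables are simply numbered in order of first appearance.</p>

```rust
/// A toy type (hypothetical): an inference variable `?n`, a canonical bound
/// variable, or an applied type such as `Vec<..>`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Ty {
    Infer(u32),
    Bound(usize),
    App(&'static str, Vec<Ty>),
}

/// Replace each distinct inference variable with a bound variable numbered by
/// first appearance. Returns the closed type plus the original variables, so
/// the caller can later map an answer back onto them.
pub fn canonicalize(ty: &Ty) -> (Ty, Vec<u32>) {
    fn go(ty: &Ty, seen: &mut Vec<u32>) -> Ty {
        match ty {
            Ty::Infer(v) => {
                // Reuse the index if this variable was already encountered.
                let index = match seen.iter().position(|s| s == v) {
                    Some(i) => i,
                    None => {
                        seen.push(*v);
                        seen.len() - 1
                    }
                };
                Ty::Bound(index)
            }
            Ty::Bound(i) => Ty::Bound(*i),
            Ty::App(name, args) => {
                Ty::App(*name, args.iter().map(|a| go(a, seen)).collect())
            }
        }
    }
    let mut seen = Vec::new();
    let canonical = go(ty, &mut seen);
    (canonical, seen)
}
```

<p>In this model, <code>AsRef&lt;Vec&lt;?22&gt;&gt;</code> and <code>AsRef&lt;Vec&lt;?44&gt;&gt;</code> canonicalize to the same value; only the returned variable lists (<code>[22]</code> versus <code>[44]</code>) differ.</p>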
<p>Once we have the canonical query, we can
<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/solver/mod.rs#L27-L29">invoke chalk&rsquo;s solver</a>. The
code here varies depending on the kind of goal, but the basic strategy
is the same. We create a
<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/fulfill.rs">&ldquo;fulfillment context&rdquo;</a>,
which is the combination of
<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/fulfill.rs#L14">an inference context</a>
(a set of inference variables) and
<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/fulfill.rs#L15">a list of goals we have yet to prove</a>. (The
compiler has a similar data structure, but it is set up somewhat
differently; for example, it doesn&rsquo;t own an inference context itself.)</p>
<p>Within this fulfillment context, we can
<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/fulfill.rs#L34-L39">&ldquo;instantiate&rdquo;</a>
the query, which means that we replace all the variables bound in an
<code>exists&lt;&gt;</code> binder with an inference variable (here is
<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/match_program_clause.rs#L28">an example of code invoking <code>instantiate()</code></a>). This
effectively converts back to the original form, but with fresh
inference variables. So <code>exists&lt;T&gt; Vec&lt;i32&gt;: AsRef&lt;Vec&lt;T&gt;&gt;</code> would
become <code>Vec&lt;i32&gt;: AsRef&lt;Vec&lt;?0&gt;&gt;</code>. Next we can actually try to prove
the goal, for example by searching through each impl,
<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/match_program_clause.rs#L42-L44">unifying the goal with the impl header</a>,
and then
<a href="https://github.com/nikomatsakis/chalk/blob/1f63c8ad20d27f3ef394f230a56430c89482d8d4/src/solve/match_program_clause.rs#L45-L49">recursively processing the where-clauses on the impl</a>
to make sure they are satisfied.</p>
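<p>The overall search strategy (match the goal against each impl header, then recursively check the where-clauses) can be sketched with a toy solver. Everything here is a hypothetical simplification and not chalk&rsquo;s implementation: the names <code>Ty</code>, <code>Impl</code>, <code>Status</code>, and <code>solve</code> are invented, matching only binds impl parameters (never the query&rsquo;s own inference variables), and treating multiple applicable impls as <code>Maybe</code> is an assumption of the sketch.</p>

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Ty {
    Infer(u32),                 // inference variable from the query, e.g. ?0
    Param(&'static str),        // type parameter of an impl, e.g. T
    App(&'static str, Vec<Ty>), // applied type, e.g. Vec<i32>
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Status { Yes, Maybe, No }

/// `impl<..> Trait for self_ty where where_clauses`.
pub struct Impl {
    pub self_ty: Ty,
    pub trait_name: &'static str,
    pub where_clauses: Vec<(Ty, &'static str)>,
}

/// Match an impl header against the goal type, binding impl parameters.
fn match_header(pattern: &Ty, goal: &Ty, subst: &mut Vec<(&'static str, Ty)>) -> bool {
    match (pattern, goal) {
        (Ty::Param(p), _) => match subst.iter().find(|(name, _)| name == p) {
            Some((_, bound)) => bound == goal,
            None => {
                subst.push((*p, goal.clone()));
                true
            }
        },
        (Ty::App(a, xs), Ty::App(b, ys)) => {
            a == b && xs.len() == ys.len()
                && xs.iter().zip(ys).all(|(x, y)| match_header(x, y, subst))
        }
        _ => false,
    }
}

/// Apply the bindings for impl parameters to a where-clause type.
fn substitute(ty: &Ty, subst: &[(&'static str, Ty)]) -> Ty {
    match ty {
        Ty::Param(p) => subst.iter().find(|(name, _)| name == p)
            .map(|(_, t)| t.clone())
            .unwrap_or_else(|| ty.clone()),
        Ty::App(name, args) => {
            Ty::App(*name, args.iter().map(|a| substitute(a, subst)).collect())
        }
        Ty::Infer(_) => ty.clone(),
    }
}

pub fn solve(impls: &[Impl], ty: &Ty, trait_name: &str) -> Status {
    // An unresolved inference variable could still be many types: ambiguous.
    if let Ty::Infer(_) = ty {
        return Status::Maybe;
    }
    let mut found = Vec::new();
    for imp in impls.iter().filter(|i| i.trait_name == trait_name) {
        let mut subst = Vec::new();
        if !match_header(&imp.self_ty, ty, &mut subst) {
            continue;
        }
        // The impl applies only if each where-clause is not refuted.
        let mut status = Status::Yes;
        for (wc_ty, wc_trait) in &imp.where_clauses {
            match solve(impls, &substitute(wc_ty, &subst), wc_trait) {
                Status::No => { status = Status::No; break; }
                Status::Maybe => status = Status::Maybe,
                Status::Yes => {}
            }
        }
        if status != Status::No {
            found.push(status);
        }
    }
    match found.as_slice() {
        [] => Status::No,
        [only] => *only,
        _ => Status::Maybe, // several candidate impls: not uniquely provable
    }
}
```

<p>Given impls for <code>i32: Clone</code>, <code>Rc&lt;T&gt;: Clone</code>, and <code>Vec&lt;T&gt;: Clone where T: Clone</code>, this toy solver reproduces the statuses shown earlier for <code>Rc&lt;?0&gt;: Clone</code>, <code>Vec&lt;?0&gt;: Clone</code>, <code>Vec&lt;i32&gt;: Clone</code>, and <code>Box&lt;i32&gt;: Copy</code>.</p>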
<p>An advantage of the chalk approach where queries are closed is that
they are much easier to cache. We can solve the query once and then
&ldquo;replay&rdquo; the result an endless number of times, so long as the
enclosing context is the same.</p>
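<p>A minimal sketch of what such caching could look like, assuming the canonical query is available as a string key and the program (the set of impls) does not change between calls. <code>CachingSolver</code>, its <code>misses</code> counter, and the <code>Status</code> enum are hypothetical names for this illustration only.</p>

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Status { Yes, Maybe, No }

/// Wraps a solver function and memoizes answers by canonical query.
/// Assumes a fixed enclosing context, so an answer never goes stale.
pub struct CachingSolver<F: Fn(&str) -> Status> {
    solve: F,
    cache: HashMap<String, Status>,
    /// Number of times the underlying solver actually ran.
    pub misses: usize,
}

impl<F: Fn(&str) -> Status> CachingSolver<F> {
    pub fn new(solve: F) -> Self {
        CachingSolver { solve, cache: HashMap::new(), misses: 0 }
    }

    pub fn query(&mut self, canonical: &str) -> Status {
        // Closed queries carry no outside inference state, so the
        // canonical text alone identifies the answer.
        if let Some(&hit) = self.cache.get(canonical) {
            return hit;
        }
        self.misses += 1;
        let answer = (self.solve)(canonical);
        self.cache.insert(canonical.to_string(), answer);
        answer
    }
}
```

<p>Because <code>?22</code> and <code>?44</code> canonicalize to the same query, both lookups would hit the same cache entry and the solver would run only once.</p>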
<h3 id="lifetime-constraints">Lifetime constraints</h3>
<p>I&rsquo;ve glossed over one important aspect of how chalk handles queries,
which is the treatment of <strong>lifetimes</strong>. In addition to the refined
goal, the response from a chalk query also includes a set of
<strong>lifetime constraints</strong>. Roughly speaking, the model is that the
chalk engine gives you back the lifetime constraints that <em>would have
to be satisfied</em> for the query to be provable.</p>
<p>In other words, if you have a full, lifetime-aware logic, you might
say that the query is provable in some environment <code>Env</code> that also
includes some facts about the lifetimes (i.e., which lifetime outlives
which other lifetime, and so forth):</p>
<pre><code>Env, LifetimeEnv |- Query
</code></pre>
<p>but in chalk we are only passing in <code>Env</code>, and the engine is giving
back to us a <code>LifetimeEnv</code>:</p>
<pre><code>chalk(Env, Query) = LifetimeEnv
</code></pre>
<p>with the intention that we know that if we can prove that <code>LifetimeEnv</code>
holds, then <code>Query</code> also holds.</p>
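<p>One way to picture the caller&rsquo;s side of this split: the solver hands back a list of outlives constraints, and the caller checks them against the lifetime facts it knows. The sketch below is purely illustrative. The <code>Outlives</code> type and <code>constraints_hold</code> function are hypothetical, and real lifetime constraints in chalk or rustc are richer than simple pairs.</p>

```rust
/// `Outlives(a, b)` stands for the constraint `'a: 'b` (hypothetical encoding).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Outlives(pub &'static str, pub &'static str);

/// Check every constraint the solver returned against the known facts,
/// following transitivity (`'a: 'b` and `'b: 'c` imply `'a: 'c`).
pub fn constraints_hold(required: &[Outlives], facts: &[Outlives]) -> bool {
    required
        .iter()
        .all(|c| outlives(c.0, c.1, facts, &mut Vec::new()))
}

fn outlives(
    a: &'static str,
    b: &'static str,
    facts: &[Outlives],
    visited: &mut Vec<&'static str>,
) -> bool {
    if a == b {
        return true; // every lifetime outlives itself
    }
    if visited.contains(&a) {
        return false; // already explored from here; avoid cycles
    }
    visited.push(a);
    // Follow each known fact `'a: 'x` and try to reach `'b` from `'x`.
    facts
        .iter()
        .any(|f| f.0 == a && outlives(f.1, b, facts, visited))
}
```

<p>With facts <code>'a: 'b</code> and <code>'b: 'c</code>, a returned constraint <code>'a: 'c</code> is satisfied, while <code>'c: 'a</code> is not.</p>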
<p>One of the main reasons for this split is that we want to ensure that
the results from a chalk query do not depend on the specific lifetimes
involved. This is because, in part, we are going to be solving chalk
queries in contexts when lifetimes have been fully erased, and hence
we don&rsquo;t actually <em>know</em> the original lifetimes or their relationships
to one another.  (In this case, the idea is roughly that we will get
back a <code>LifetimeEnv</code> with the relationships that would have to hold,
but we can be sure that an earlier phase in the compiler has proven to
us that this <code>LifetimeEnv</code> will be satisfied.)</p>
<p>Anyway, I plan to write a follow-up post (or more&hellip;) focusing just on
lifetime constraints, so I&rsquo;ll leave it at that for now. This is also
an area where we are doing some iteration, particularly because of the
interactions with specialization, which are complex.</p>
<h3 id="future-plans">Future plans</h3>
<p>Let me stop here to talk a bit about the changes we have
planned. aturon has been working on a branch that makes a few key
changes. First, we will <strong>replace the notion of &ldquo;refined goal&rdquo; with a
more straight-up substitution</strong>. That is, we&rsquo;d like chalk to answer
back with something that just tells you the values for the variables
you&rsquo;ve given.  This will make later parts of the query processing
easier.</p>
<p>Second, following the approach that
<a href="http://aturon.github.io/blog/2017/04/24/negative-chalk/">aturon outlined in their blog post</a>, when you get back a
&ldquo;maybe&rdquo; result, we are actually going to be considering two cases. The
current code will return a refined substitution only if there is a
<strong>unique assignment to your input variables that must be true</strong> for
the goal to be provable. But in the newer code, we will also have the
option to return a &ldquo;suggestion&rdquo; &ndash; something which isn&rsquo;t <strong>necessary</strong>
for the goal to be provable, but which we think is likely to be what
the user wanted. We hope to use this concept to help replicate, in a
more structured and bulletproof way, some of the heuristics that are
used in rustc itself.</p>
<p>Finally, we plan to implement the &ldquo;modal logic&rdquo; operators, so that you
can make queries that explicitly reason about &ldquo;all crates&rdquo; vs &ldquo;this
crate&rdquo;.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/><category scheme="https://smallcultfollowing.com/babysteps/categories/chalk" term="chalk" label="Chalk"/><category scheme="https://smallcultfollowing.com/babysteps/categories/pl" term="pl" label="PL"/></entry><entry><title type="html">gnome-class: Integrating Rust and the GNOME object system</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/05/02/gnome-class-integrating-rust-and-the-gnome-object-system/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/05/02/gnome-class-integrating-rust-and-the-gnome-object-system/</id><published>2017-05-02T00:00:00+00:00</published><updated>2017-05-02T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I recently participated in the GNOME / Rust &ldquo;dev sprint&rdquo; in Mexico
City. (A thousand thanks to Federico and Joaquin for organizing!)
While there I spent some time working on the
<a href="https://github.com/nikomatsakis/gnome-class">gnome-class plugin</a>. The
goal of gnome-class was to make it easy to write GObject
implementations in Rust which would fully interoperate with C code.</p>
<p>Roughly speaking, my goal was that you should be able to write code
that looked and felt like
<a href="https://wiki.gnome.org/Projects/Vala">Vala code</a>, but where the
method bodies (and types, and so forth) are in Rust. The plugin is in
no way done, but I think it&rsquo;s already letting you do some pretty nice
stuff. For example, this little snippet defines a <code>Counter</code> class
offering two methods (<code>add()</code> and <code>get()</code>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="fm">gobject_gen!</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">class</span><span class="w"> </span><span class="n">Counter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">struct</span> <span class="nc">CounterPrivate</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">f</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">fn</span> <span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">x</span>: <span class="kt">u32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">private</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">private</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">private</span><span class="p">.</span><span class="n">f</span><span class="p">.</span><span class="n">get</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">private</span><span class="p">.</span><span class="n">f</span><span class="p">.</span><span class="n">set</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">v</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">fn</span> <span class="nf">get</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">private</span><span class="p">().</span><span class="n">f</span><span class="p">.</span><span class="n">get</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You can access these classes from Rust code in a natural way:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Counter</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">c</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="mi">2</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">c</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="mi">20</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>Under the hood, this is all hooked up to the GNOME runtime. So, for
example, <code>Counter::new()</code> translates to a call to <code>g_object_new()</code>,
and the <code>c.add()</code> calls translate into virtual calls passing through
the GNOME class structure. We also generate <code>extern &quot;C&quot;</code> functions so
you should be able to call the various methods from C code.</p>
<p>Let&rsquo;s go through this example bit-by-bit and I&rsquo;ll show you how each
part works. Along the way, we can discuss the GNOME object
model. Finally, we can cover some of the alternative designs that I
considered and discarded, and a few things we could change in Rust to
make everything smoother.</p>
<h3 id="mapping-between-gnome-and-rust-ownership">Mapping between GNOME and Rust ownership</h3>
<p>The basic GNOME object model is that every object is ref-counted. In
general, if you are given a <code>Foo*</code> pointer, it is assumed you are
<em>borrowing</em> that ref, and if you want to store that <code>Foo*</code> value
somewhere, you should increment the ref-count for yourself. However,
there are other times when ownership transfer is assumed. (In general,
GNOME has strong conventions here, which is great.)</p>
<p>I&rsquo;ve been debating about how best to mirror this in Rust. My current branch
works as follows, using the type <code>Counter</code> as an example.</p>
<ul>
<li><code>Counter</code> represents an <strong>owned reference</strong> to a <code>Counter</code> object.
This is implicitly heap-allocated and reference-counted, per the
object model.
<ul>
<li><code>Counter</code> implements <code>Clone</code>, which will simply increment the reference
count but return the same object.</li>
<li><code>Counter</code> implements <code>Drop</code>, which will decrement the reference count.</li>
<li>In terms of its representation, <code>Counter</code> is a newtype&rsquo;d <code>*mut GObject</code>.</li>
</ul>
</li>
<li><code>&amp;Counter</code> is used for functions that wish to &ldquo;borrow&rdquo; a counter; if
they want to store a version for themselves, they can call
<code>clone()</code>.
<ul>
<li>Hence the methods like <code>add()</code> are <code>&amp;self</code> methods.</li>
<li>This works more-or-less exactly like passing around an <code>&amp;Rc&lt;T&gt;</code> or
<code>&amp;Arc&lt;T&gt;</code> (which, incidentally, is the style I&rsquo;ve started using
all of the time for working with ref-counted data).</li>
</ul>
</li>
</ul>
<p>Note that since every <code>Counter</code> is implicitly ref-counted data, there
isn&rsquo;t much point to working with an <code>&amp;mut Counter</code>. That is, you may
have a unique reference to a single handle, but you can&rsquo;t really know
how many aliases of <code>Counter</code> are out there from other sources.
<strong>As a result, when you use <code>gobject_gen!</code>, all of the methods and so
forth that you define are always going to be <code>&amp;self</code> methods.</strong> In
other words, you will always get a <em>shared</em> reference to your data.</p>
<p>Because we have only shared references, the fields in your GNOME
classes are going to be immutable unless you package them up in <code>Cell</code>
or <code>RefCell</code>. This is why the counter type, for example, stores its
count in a field <code>f: Cell&lt;u32&gt;</code> &ndash; the <code>Cell</code> type allows the counter
to be incremented and decremented even when aliased. It <em>does</em> imply
that it would be unsafe to share the <code>Counter</code> across multiple threads
at once; but this is roughly the default in GNOME (things cannot be
shared across threads unless they&rsquo;ve been designed for that).</p>
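<p>The ownership mapping described in this section can be approximated in plain Rust. In the sketch below, <code>Rc</code> stands in for the GObject reference count, so this is an analogy and not what the macro generates (the real handle is a newtype&rsquo;d <code>*mut GObject</code>). The <code>ref_count()</code> method is a hypothetical addition used only to observe the count.</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

struct CounterPrivate {
    f: Cell<u32>,
}

/// Stand-in for the generated handle: cloning bumps the reference count
/// and yields another handle to the same object; dropping decrements it.
#[derive(Clone)]
pub struct Counter {
    inner: Rc<CounterPrivate>,
}

impl Counter {
    pub fn new() -> Self {
        Counter { inner: Rc::new(CounterPrivate { f: Cell::new(0) }) }
    }

    /// `&self` is enough: `Cell` permits mutation through a shared reference.
    pub fn add(&self, x: u32) -> u32 {
        let v = self.inner.f.get() + x;
        self.inner.f.set(v);
        v
    }

    pub fn get(&self) -> u32 {
        self.inner.f.get()
    }

    /// Hypothetical helper, only for observing the count in this example.
    pub fn ref_count(&self) -> usize {
        Rc::strong_count(&self.inner)
    }
}
```

<p>Calling <code>add()</code> through either a handle or its clone updates the same count. Like the <code>Cell</code>-based design described above, this stand-in is not <code>Send</code> or <code>Sync</code>, so it cannot be shared across threads.</p>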
<h3 id="private-data-in-gnome">Private data in GNOME</h3>
<p>When it comes to data storage, the GNOME object model works a bit
differently than a &ldquo;traditional&rdquo; OO language like Java or C++. In
those more traditional languages, an object is laid out with the
vtable first, and then the fields from each class, concatenated in
order:</p>
<pre tabindex="0"><code>object --&gt; +-------------------+
           | vtable            |
           | ----------------- |
           | superclass fields |
           | ----------------- |
           | subclass fields   |
           +-------------------+
</code></pre><p>The nice thing about this is that the <code>object</code> pointer can safely be
used as either a <code>Superclass</code> pointer or a <code>Subclass</code> pointer.  But
there is a catch. If new fields are added to the superclass, then the
offset of all my subclass fields will change &ndash; this implies that all
code using my object as a <code>Subclass</code> has to be recompiled. What&rsquo;s
worse, this is true even if all I wanted to do is to add a <strong>private</strong>
field to the superclass. In other words, adding fields in this scheme
is an <strong>ABI-incompatible change</strong> &ndash; meaning that we have to recompile
all downstream code, even if we know that this compilation cannot
fail.</p>
<p>Therefore, the GNOME model works a bit differently. While you <em>can</em>
have fields allocated inline as I described, the recommendation is
instead to use a facility called &ldquo;private data&rdquo;. With private data,
you define a struct of fields accessible only to your class; these
fields are not stored &ldquo;inline&rdquo; in your object at some statically
predicted offset. Instead, when you allocate your object, the GNOME
memory manager will also allocate space for the private data each class
needs, and you can ask (dynamically) for the
offset. (<a href="#appendix-a-memory-layout-of-private-data">Appendix A</a> goes
into details on the actual memory layout.)</p>
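<p>The key property here is that offsets are handed out at run time rather than baked into compiled code. The toy model below illustrates only that property. It is not the actual GObject layout (see Appendix A for that), and <code>Registry</code>, <code>add_private</code>, and <code>allocate</code> are hypothetical names.</p>

```rust
/// Toy type registry: each class asks for some number of private slots and
/// learns its offset only at run time, after its ancestors have registered.
pub struct Registry {
    total_slots: usize,
}

impl Registry {
    pub fn new() -> Self {
        Registry { total_slots: 0 }
    }

    /// Reserve `slots` private slots for a class and return their offset.
    pub fn add_private(&mut self, slots: usize) -> usize {
        let offset = self.total_slots;
        self.total_slots += slots;
        offset
    }

    /// Allocate an instance with room for every registered class's private data.
    pub fn allocate(&self) -> Vec<u32> {
        vec![0; self.total_slots]
    }
}
```

<p>If a new version of the superclass reserves three slots instead of one, the subclass is simply told a different offset at startup. Its compiled code, which only ever uses the offset it was given, does not need to change.</p>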
<p>The <code>gobject_gen!</code> macro is set up to always use private data in the
recommended fashion. If we take another look at the header, we can see
the private data struct for the <code>Counter</code> class is defined in the very
beginning, and given the name <code>CounterPrivate</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="fm">gobject_gen!</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">class</span><span class="w"> </span><span class="n">Counter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">struct</span> <span class="nc">CounterPrivate</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">f</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In the code, when we want to access the &ldquo;private&rdquo; data, we use the
<code>private()</code> method. This will return to us a <code>&amp;CounterPrivate</code>
reference that we can use. For example, defining the <code>get()</code> method on
our counter looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">private</span><span class="p">().</span><span class="n">f</span><span class="p">.</span><span class="n">get</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Although the offset of the private data for a particular class is not
known statically, it is still always constant in any given execution.
It&rsquo;s just that it can change from run to run if different versions of
libraries are in use. Therefore, in C code, most classes will inquire
once, during creation time, to find the offset of their private data,
and then store this result in a global variable. The current Rust code
just inquires dynamically every time.</p>
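<p>In Rust, the &ldquo;inquire once and remember&rdquo; pattern could be expressed with <code>std::sync::OnceLock</code>. This is only a sketch of the caching idea, not something gnome-class does today: <code>query_private_offset()</code> is a hypothetical stand-in for the real runtime lookup, and the value it returns is arbitrary.</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the (pretend) runtime lookup actually runs.
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for asking the GObject runtime for the offset of
/// this class's private data. The returned value is made up.
fn query_private_offset() -> isize {
    LOOKUPS.fetch_add(1, Ordering::SeqCst);
    16
}

/// Filled in on first use, then constant for the rest of the execution.
static PRIVATE_OFFSET: OnceLock<isize> = OnceLock::new();

pub fn private_offset() -> isize {
    *PRIVATE_OFFSET.get_or_init(query_private_offset)
}

pub fn lookups() -> usize {
    LOOKUPS.load(Ordering::SeqCst)
}
```

<p>However many times <code>private_offset()</code> is called, the underlying lookup runs only once per execution, which mirrors the global-variable approach used in C.</p>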
<h3 id="object-construction">Object construction</h3>
<p><code>gobject_gen!</code> does not expose traditional OO-style
constructors. Instead, you can define a function that produces the
initial values for your private struct &ndash; if you do not provide
anything, then we will use
<a href="https://doc.rust-lang.org/std/default/trait.Default.html">the Rust <code>Default</code> trait</a>.</p>
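<p>As a quick illustration of the <code>Default</code> fallback in plain Rust (outside the macro, so the <code>derive</code> here is an assumption about how one might write the struct by hand): <code>Cell&lt;u32&gt;</code> implements <code>Default</code> by wrapping <code>u32</code>&rsquo;s default value.</p>

```rust
use std::cell::Cell;

/// Private data for a counter; `Default` is derived field by field.
#[derive(Default)]
pub struct CounterPrivate {
    pub f: Cell<u32>,
}

/// Build the initial private data the way a default initializer would.
pub fn default_private() -> CounterPrivate {
    CounterPrivate::default()
}
```

<p>Here <code>default_private().f.get()</code> yields <code>0</code>, since <code>u32::default()</code> is zero.</p>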
<p>The <code>Counter</code> example, in fact, provided no initialization function,
and hence it was using the <code>Default</code> trait to initialize the field <code>f</code>
to zero.  If we wanted to write this explicitly, we could have added
an <code>init { }</code> block. For example, the following variant will initialize
the counter to <code>22</code>, not <code>0</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="fm">gobject_gen!</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">class</span><span class="w"> </span><span class="n">Counter</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">struct</span> <span class="nc">CounterPrivate</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">f</span>: <span class="nc">Cell</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">init</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">CounterPrivate</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">f</span>: <span class="nc">Cell</span>::<span class="n">new</span><span class="p">(</span><span class="mi">22</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>Note that <code>init</code> blocks take no parameters &ndash; at the time when such a block
executes, the object&rsquo;s memory is still not fully initialized, and
hence we can&rsquo;t safely give access to it. (Unlike in Java, we don&rsquo;t
necessarily have a &ldquo;null&rdquo; value for all types.)</p>
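<p>The two initialization paths can be mimicked in plain Rust, outside of the macro. The sketch below is only an analogy: the free function <code>init</code> is a hypothetical stand-in for an <code>init { }</code> block, and nothing here is code that <code>gobject_gen!</code> emits.</p>

```rust
use std::cell::Cell;

/// Private state, mirroring the `CounterPrivate` struct from the example.
/// Deriving `Default` gives the "no init block" behavior: `f` starts at 0.
#[derive(Default)]
struct CounterPrivate {
    f: Cell<u32>,
}

/// Stand-in for an explicit `init { }` block. It takes no parameters and
/// only produces the initial values for the private fields.
fn init() -> CounterPrivate {
    CounterPrivate { f: Cell::new(22) }
}
```

<p>Calling <code>CounterPrivate::default()</code> yields a counter at zero, while <code>init()</code> yields one at <code>22</code>. Neither receives a reference to the object being constructed.</p>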
<p>The general consensus at the design sprint was that the best practice
for writing a GNOME object is to avoid a &ldquo;custom constructor&rdquo; and
instead to define public properties and have creators specify those
properties at construction time. I did not yet model properties, but
it seems like that would fit nicely with this initialization
setup. There is also a hook that one can define that will execute once
the &ldquo;initial set&rdquo; of properties has been initialized &ndash; I&rsquo;d like
to expose this too, but didn&rsquo;t get around to it. This would be similar
to <code>init</code>, presumably, except that it would give access to a <code>&amp;self</code>
pointer.</p>
<p>Similarly, we could extend <code>gobject_gen!</code> to offer a more
&ldquo;traditional&rdquo; OO constructor model, similar to the one that Vala
offers. This too would layer on top of the existing code: so your
<code>init()</code> function would run first, to generate the initial values for
the private fields, but then you could come afterwards and update
them, making use of the parameters. (You can model this today just by
defining an <code>fn initialize(&amp;self)</code> method, effectively.)</p>
<h3 id="what-still-needs-work">What still needs work?</h3>
<p>So we&rsquo;ve seen what does work (or what kind of works, in the case of
subclassing).  What work is left? Lots, it turns out. =)</p>
<h4 id="private-data-support-could-be-smoother">Private data support could be smoother</h4>
<p>I would prefer if you did not have to type <code>self.private()</code> to access
private data. I would rather you could just write <code>self.f</code> to get
access to a private field <code>f</code>. For that to work, though, we&rsquo;d need to
have something like the
<a href="https://github.com/rust-lang/rfcs/pull/1546">fields in traits RFC</a> &ndash;
and probably an expanded version that has a few additional features.
In particular, we&rsquo;d need the ability to map through derefs, or
possibly through custom code; read-only fields would likely help
too. Now that this blog post is done, I plan to post a comment on that
RFC with some observations and try to get it moving again.</p>
<h4 id="interfacing-with-c">Interfacing with C</h4>
<p>I haven&rsquo;t really implemented this yet, but I wanted to sketch how I
envision that this macro could interface with C code. We already
handle the &ldquo;Rust&rdquo; side of this, which is that we generate C-compatible
functions for each method that do the correct dispatch; these follow
the GNOME naming conventions (e.g., <code>Counter_add()</code> and
<code>Counter_get()</code>). I&rsquo;d also like to have the macro generate a <code>.h</code> file
for you (or perhaps this should be done by a <code>build.rs</code> script, I&rsquo;m
not yet sure), so that you can easily have C code include that <code>.h</code>
file and seamlessly use your Rust object.</p>
<h4 id="interfacing-with-gtk-rs">Interfacing with gtk-rs</h4>
<p>There has already been a lot of excellent work mirroring the various
GNOME APIs through the <a href="http://gtk-rs.org/">gtk-rs crates</a>. I&rsquo;m using
some of those APIs already, but we should do some more work to make
the crates more intercompatible.  I&rsquo;d love it if you could easily subclass
existing classes from the GNOME libraries using <code>gobject_gen!</code>. It
should be possible to make this work; it&rsquo;ll just take some
coordination.</p>
<h4 id="making-it-more-convenient-to-work-with-shared-mutable-data">Making it more convenient to work with shared, mutable data</h4>
<p>Since all GNOME objects are shared, it becomes very important to have
ergonomic libraries for working with shared, mutable data. The
existing types in the standard library &ndash; <code>Cell</code> and <code>RefCell</code> &ndash; are
very general but not always the most pleasant to work with.</p>
<p>If nothing else, we could use some convenient types for other
scenarios, such as a <code>Final&lt;T&gt;</code> that corresponds to a &ldquo;write-once&rdquo;
variable (the name is obviously inspired by final fields in Java,
though ivars is another name commonly used in the parallel programming
community). <code>Final&lt;T&gt;</code> would be nice for fields that start out as null
but which are always initialized during construction and then never
changed again. The nice thing would be that <code>Final&lt;T&gt;</code> could implement
<code>Deref</code> (it would presumably panic if the value has not yet been
assigned).</p>
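<p>Here is one way such a <code>Final&lt;T&gt;</code> could look, built on <code>std::cell::OnceCell</code> from the standard library. This is a sketch under my own assumptions: the method name <code>assign</code> and the choice to panic on a second assignment are not settled design, just one plausible reading of &ldquo;write-once&rdquo;.</p>

```rust
use std::cell::OnceCell;
use std::ops::Deref;

/// A write-once cell: starts out empty, is assigned exactly once, and is
/// read through `Deref` afterwards.
pub struct Final<T> {
    slot: OnceCell<T>,
}

impl<T> Final<T> {
    /// Creates an empty, not-yet-assigned cell.
    pub fn new() -> Self {
        Final { slot: OnceCell::new() }
    }

    /// Stores the value. Panics if a value was already assigned
    /// (an assumption of this sketch).
    pub fn assign(&self, value: T) {
        if self.slot.set(value).is_err() {
            panic!("Final<T> assigned more than once");
        }
    }
}

impl<T> Deref for Final<T> {
    type Target = T;

    /// Panics if the value has not yet been assigned.
    fn deref(&self) -> &T {
        self.slot.get().expect("Final<T> read before assignment")
    }
}
```

<p>Because <code>assign</code> takes <code>&amp;self</code>, this works for shared objects, and after assignment a field of type <code>Final&lt;String&gt;</code> can be used almost like a plain <code>String</code>.</p>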
<h4 id="supporting-more-of-the-gnome-object-model">Supporting more of the GNOME object model</h4>
<p>There are also many parts of GNOME that we don&rsquo;t model yet.</p>
<p>We don&rsquo;t really support <strong>subclassing</strong> yet. I have a half-executed
plan for supporting it, but this is a topic worthy of a post of its
own, so I&rsquo;ll just leave it at that.</p>
<p><strong>Properties</strong> are probably the biggest thing; they are fairly simple conceptually,
but there are lots of knobs and whistles to get right.</p>
<p>We don&rsquo;t support <strong>constructing an object with a list of initial
property values</strong> nor do we support the <strong>post-initialization
hook</strong>. In C code, when constructing a GNOME object, one can use a
var-args style API to supply a bunch of initial values:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-C" data-lang="C"><span class="line"><span class="cl"><span class="nf">g_object_new</span><span class="p">(</span><span class="n">TYPE_MEDIA</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">             <span class="s">&#34;inventory-id&#34;</span><span class="p">,</span> <span class="mi">42</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">             <span class="s">&#34;orig-package&#34;</span><span class="p">,</span> <span class="n">FALSE</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">             <span class="nb">NULL</span><span class="p">);</span> 
</span></span></code></pre></div><p>I imagine modeling this in Rust using a builder pattern:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">Media</span>::<span class="n">with</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">inventory_id</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">orig_package</span><span class="p">(</span><span class="kc">false</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">new</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>We don&rsquo;t support <strong>signals</strong>, which are a kind of message bus system that I don&rsquo;t
really understand very well. =)</p>
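<p>Returning to the builder pattern imagined above for initial property values: in plain Rust it could be sketched as follows. <code>Media</code> here is an ordinary struct rather than a real GNOME object, and <code>MediaBuilder</code> is a hypothetical name; the method names mirror the snippet above.</p>

```rust
/// Plain-Rust stand-in for an object with two properties.
#[derive(Debug, PartialEq)]
struct Media {
    inventory_id: u32,
    orig_package: bool,
}

/// Collects property values before the object is constructed.
/// Properties that are never set keep their `Default` values.
#[derive(Default)]
struct MediaBuilder {
    inventory_id: u32,
    orig_package: bool,
}

impl Media {
    /// Starts a builder with every property at its default.
    fn with() -> MediaBuilder {
        MediaBuilder::default()
    }
}

impl MediaBuilder {
    /// Sets the `inventory-id` property.
    fn inventory_id(mut self, value: u32) -> Self {
        self.inventory_id = value;
        self
    }

    /// Sets the `orig-package` property.
    fn orig_package(mut self, value: bool) -> Self {
        self.orig_package = value;
        self
    }

    /// Finishes construction, consuming the builder.
    fn new(self) -> Media {
        Media {
            inventory_id: self.inventory_id,
            orig_package: self.orig_package,
        }
    }
}
```

<p>Unlike the var-args call in C, a misspelled property name or a value of the wrong type is rejected at compile time.</p>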
<h4 id="procedural-macro-support-on-rust-is-young">Procedural macro support on Rust is young</h4>
<p>There is still a long way to go before the <code>gobject_gen!</code> plugin is really
usable. For one thing, it relies on a number of unstable Rust language
features &ndash; not the least of them being the new procedural macro
system. It also inherits one very annoying facet of the current
procedural macros, which is that all source location information is
lost. This means that if you have type errors in your code it just
gives you an error like &ldquo;somewhere in this usage of the <code>gobject_gen!</code>
macro&rdquo;, which is approximately useless since that covers the entire
class definition. This is obviously something we aim to improve
through <a href="https://github.com/rust-lang/rust/pull/40939">PRs like #40939</a>.</p>
<h3 id="conclusion">Conclusion</h3>
<p>Overall, I really enjoyed the sprint. It was great to meet so many
GNOME contributors in person. I was very impressed with how well
thought out the GNOME object system is.</p>
<p>Obviously, this macro is in its early days, but I&rsquo;m really excited
about its current state nonetheless. I think there is a lot of
potential for GNOME and Rust to have a truly seamless integration, and
I look forward to seeing it come together.</p>
<p>I don&rsquo;t know how much time I&rsquo;m going to have to devote to hacking on
the macro, but I plan to open up various issues on
<a href="https://github.com/nikomatsakis/gnome-class">the repository</a> over the next little while with various ideas
for expansions and/or design questions, so if you&rsquo;re interested in
seeing the work proceed, please get involved!</p>
<p>Finally, I want to take a moment to give a shoutout to jseyfried and
dtolnay, who have done excellent work pushing forward with procedural
macro support in rustc and the <code>quote!</code> libraries. Putting
<code>gobject_gen!</code> together was really an altogether pleasant
experience. I can&rsquo;t wait to see those APIs evolve more: support for
spans, first and foremost, but proper hygiene would be nice too, since
<code>gobject_gen!</code> has to generate various names as part of its mapping.</p>
<hr>
<h3 id="appendix-a-memory-layout-of-private-data">Appendix A: Memory layout of private data</h3>
<p>My understanding is that the private data feature evolved over
time. When the challenges around ABI compatibility were first
discovered, a convention developed of having each object have just a
single &ldquo;inline&rdquo; field. Each class would then malloc a separate struct
for its private fields. So you wound up with something like this:</p>
<pre tabindex="0"><code>object --&gt; +--------------------+
           | vtable             |
           | ------------------ |
           | SuperclassPrivate* | ---&gt; +-------------------+
           | ------------------ |      | superclass fields |
           | SubclassPrivate*   | --+  +-------------------+
           +--------------------+   |
                                    +--&gt; +-----------------+
                                         | subclass fields |
                                         +-----------------+
</code></pre><p>Naturally any class can now add private fields without changing the
offset of others&rsquo; fields. However, making multiple allocations per
object is inefficient, and it&rsquo;s easy to mess up the manual memory
management involved as well. So the GNOME runtime added the &ldquo;private&rdquo;
feature, which allows each class to request that some amount of
additional space be allocated, and provides an API for finding the
offset of that space from the main object. The exact memory layout is
(I presume) not defined, but as I understand it things are currently
laid out with the private data stored at a negative offset:</p>
<pre tabindex="0"><code>           +--------------------+
           | subclass fields    |
           | ------------------ |
           | superclass fields  |
object --&gt; + ------------------ +
           | vtable             |
           +--------------------+
</code></pre><p>Although no longer necessary, it is also still common to include a
single &ldquo;inline&rdquo; field that points to the private data, set up during
initialization:</p>
<pre tabindex="0"><code>           +--------------------+ &lt;---+
           | subclass fields    |     |
           | ------------------ | &lt;-+ |
           | superclass fields  |   | |
object --&gt; + ------------------ +   | |
           | vtable             |   | |
           + ------------------ +   | |
           | SuperclassPrivate* | --+ |
           | ------------------ |     |
           | SubclassPrivate*   | ----+
           +--------------------+
</code></pre>]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/gnome" term="gnome" label="GNOME"/></entry><entry><title type="html">Unification in Chalk, part 2</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/04/23/unification-in-chalk-part-2/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/04/23/unification-in-chalk-part-2/</id><published>2017-04-23T00:00:00+00:00</published><updated>2017-04-23T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In my previous post, I talked over the basics of how
<a href="https://smallcultfollowing.com/babysteps/blog/2017/03/25/unification-in-chalk-part-1/">unification works</a> and showed how that &ldquo;mathematical version&rdquo; winds
up being expressed in chalk. I want to go a bit further now and extend
that base system to cover <a href="https://doc.rust-lang.org/nightly/book/second-edition/ch19-03-advanced-traits.html#associated-types">associated types</a>. These turn out to be a
pretty non-trivial extension.</p>
<h3 id="what-is-an-associated-type">What is an associated type?</h3>
<p>If you&rsquo;re not a Rust programmer, you may not be familiar with the term
&ldquo;associated type&rdquo; (although many languages have equivalents). The basic
idea is that traits can have <strong>type members</strong> associated with them.  I
find the most intuitive example to be the <code>Iterator</code> trait, which has
an associated type <code>Item</code>. This type corresponds to the kind of elements
that are produced by the iterator:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As you can see in the <code>next()</code> method, to reference an associated
type, you use a kind of path &ndash; that is, when you write <code>Self::Item</code>,
it means &ldquo;the kind of <code>Item</code> that the iterator type <code>Self</code> produces&rdquo;.
I often refer to this as an <strong>associated type projection</strong>, since one
is &ldquo;projecting out&rdquo;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> the type <code>Item</code>.</p>
<p>Let&rsquo;s look at an impl to make this more concrete. Consider
<a href="https://doc.rust-lang.org/std/vec/struct.IntoIter.html">the type <code>std::vec::IntoIter&lt;T&gt;</code></a>, which is one of the iterators
associated with a vector (specifically, the iterator you get when you
invoke <code>vec.into_iter()</code>). In that case, the elements yielded up by
the iterator are of type <code>T</code>, so we have an impl like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">IntoIter</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This means that if we have the type <code>IntoIter&lt;i32&gt;::Item</code>, that is
<strong>equivalent</strong> to the type <code>i32</code>. We usually call this process of
converting an associated type projection (<code>IntoIter&lt;i32&gt;::Item</code>) into
the type found in the impl <strong>normalizing</strong> the type.</p>
<p>In fact, this <code>IntoIter&lt;i32&gt;::Item</code> is a kind of shorthand; in
particular, it didn&rsquo;t explicitly state what trait the type <code>Item</code> is
defined in (it&rsquo;s always possible that <code>IntoIter&lt;i32&gt;</code> implements more
than one trait that defines an associated type called <code>Item</code>). To make
things fully explicit, then, one can use a <strong>fully qualified path</strong>
like this:</p>
<pre><code>&lt;IntoIter&lt;i32&gt; as Iterator&gt;::Item
 ^^^^^^^^^^^^^    ^^^^^^^^   ^^^^
 |                |          |
 |                |          Associated type name
 |                Trait
 Self type
</code></pre>
<p>I&rsquo;ll use these fully qualified paths from here on out to avoid confusion.</p>
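<p>As a quick sanity check that the fully qualified path really does normalize to <code>i32</code>, here is a small example in ordinary Rust. The function names <code>first_item</code> and <code>takes_i32</code> are made up for the illustration.</p>

```rust
use std::vec::IntoIter;

/// The return type is spelled as a fully qualified projection. The compiler
/// normalizes it to `i32` via the `Iterator` impl for `IntoIter<T>`.
fn first_item(mut iter: IntoIter<i32>) -> <IntoIter<i32> as Iterator>::Item {
    iter.next().expect("iterator should not be empty")
}

/// Accepts only `i32`, so feeding it the result of `first_item` type-checks
/// only because the projection and `i32` are the same type.
fn takes_i32(value: i32) -> i32 {
    value
}
```

<p>The call <code>takes_i32(first_item(vec![22, 44].into_iter()))</code> compiles precisely because the two spellings name the same type.</p>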
<h3 id="integrating-associated-types-into-our-type-system">Integrating associated types into our type system</h3>
<p>In this post, we will extend our notion of types to include associated type projections:</p>
<pre tabindex="0"><code>T = ?X               // type variables
  | N&lt;T1, ..., Tn&gt;   // &#34;applicative&#34; types
  | P                // &#34;projection&#34; types   (new in this post)
P = &lt;T as Trait&gt;::X
</code></pre><p>Projection types are quite different from the existing &ldquo;applicative&rdquo;
types that we saw before. The reason is that they introduce a kind of
&ldquo;alias&rdquo; into the equality relationship. With just applicative types,
we could always make progress at each step: that is, no matter what
two types were being equated, we could always break the problem down
into simpler subproblems (or else error out). For example, if we had
<code>Vec&lt;?T&gt; = Vec&lt;i32&gt;</code>, we knew that this could <strong>only</strong> be true if <code>?T == i32</code>.</p>
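<p>To illustrate why applicative types always let us make progress, here is a minimal unifier over just type variables and applicative types. It is a simplified sketch and not chalk&rsquo;s actual code: the names <code>Ty</code>, <code>Subst</code>, <code>resolve</code> and <code>unify</code> are chosen for the example, and the occurs check is deliberately left out.</p>

```rust
use std::collections::HashMap;

/// A cut-down type grammar: inference variables and applicative types.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    /// `?X`, identified by a number.
    Var(u32),
    /// `N<T1, ..., Tn>`; a type like `i32` simply has no arguments.
    Apply(&'static str, Vec<Ty>),
}

/// Records which type each inference variable has been bound to.
type Subst = HashMap<u32, Ty>;

/// Follows variable bindings until reaching an unbound variable or an
/// applicative type.
fn resolve(ty: &Ty, subst: &Subst) -> Ty {
    match ty {
        Ty::Var(v) => match subst.get(v) {
            Some(bound) => resolve(bound, subst),
            None => ty.clone(),
        },
        Ty::Apply(..) => ty.clone(),
    }
}

/// Equates `a` and `b`. Every case either binds a variable, breaks the
/// problem into equalities on the type arguments, or reports an error.
/// (No occurs check: this sketch would accept `?X = Vec<?X>`.)
fn unify(a: &Ty, b: &Ty, subst: &mut Subst) -> Result<(), String> {
    let (a, b) = (resolve(a, subst), resolve(b, subst));
    match (&a, &b) {
        (Ty::Var(x), Ty::Var(y)) if x == y => Ok(()),
        (Ty::Var(x), other) | (other, Ty::Var(x)) => {
            subst.insert(*x, other.clone());
            Ok(())
        }
        (Ty::Apply(n1, args1), Ty::Apply(n2, args2)) => {
            if n1 != n2 || args1.len() != args2.len() {
                return Err(format!("cannot unify {:?} with {:?}", a, b));
            }
            for (x, y) in args1.iter().zip(args2) {
                unify(x, y, subst)?;
            }
            Ok(())
        }
    }
}
```

<p>Unifying <code>Vec&lt;?0&gt;</code> with <code>Vec&lt;i32&gt;</code> binds <code>?0</code> to <code>i32</code>, while unifying <code>Vec&lt;i32&gt;</code> with <code>Option&lt;i32&gt;</code> is an error. There is no case in which the procedure has to give up and wait.</p>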
<p>With associated type projections, this is not always true. Sometimes we
just can&rsquo;t make progress. Imagine, for example, this scenario:</p>
<pre><code>&lt;?X as Iterator&gt;::Item = i32
</code></pre>
<p>Here we know that <code>?X</code> is some kind of iterator that yields up <code>i32</code>
elements, but we have no way of knowing <em>which</em> iterator it is; there
are many possibilities. Similarly, imagine this:</p>
<pre><code>&lt;?X as Iterator&gt;::Item = &lt;T as Iterator&gt;::Item
</code></pre>
<p>Here we know that <code>?X</code> and <code>T</code> are both iterators that yield up the
same sort of items. But this doesn&rsquo;t tell us anything about the
relationship between <code>?X</code> and <code>T</code>.</p>
<h3 id="normalization-constraints">Normalization constraints</h3>
<p>To handle associated types, the basic idea is that we will introduce
<strong>normalization constraints</strong>, in addition to just having equality
constraints. A normalization constraint is written like this:</p>
<pre><code>&lt;IntoIter&lt;i32&gt; as Iterator&gt;::Item ==&gt; ?X   
</code></pre>
<p>This constraint says that the associated type projection
<code>&lt;IntoIter&lt;i32&gt; as Iterator&gt;::Item</code>, when <em>normalized</em>, should be
equal to <code>?X</code> (a type variable). As we will see in more detail in a
bit, we&rsquo;re going to then go and solve those normalizations, which
would eventually allow us to conclude that <code>?X = i32</code>.</p>
<p>(We could use the Rust syntax <code>IntoIter&lt;i32&gt;: Iterator&lt;Item=?X&gt;</code> for
this sort of constraint as well, but I&rsquo;ve found it to be more
confusing overall.)</p>
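<p>For readers who want to see the same kind of constraint in everyday Rust, the sketch below uses the <code>Iterator&lt;Item = X&gt;</code> bound, with the type parameter <code>X</code> playing the role of <code>?X</code>. The function name <code>collect_items</code> is invented for the example.</p>

```rust
/// The bound `I: Iterator<Item = X>` asks the compiler to normalize
/// `<I as Iterator>::Item` and equate the result with `X`.
fn collect_items<I, X>(iter: I) -> Vec<X>
where
    I: Iterator<Item = X>,
{
    iter.collect()
}
```

<p>When this is called with a <code>std::vec::IntoIter&lt;i32&gt;</code>, the caller never names <code>X</code>. The compiler works out <code>X = i32</code> from the impl, which is the inference that the normalization constraint describes.</p>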
<p>Processing a normalization constraint is very similar to processing a
standard trait constraint. In fact, in chalk, they are literally the
same code. If <a href="https://smallcultfollowing.com/babysteps/blog/2017/01/26/lowering-rust-traits-to-logic/">you recall from my first Chalk post</a>, we can
lower impls into a series of clauses that express the trait that is
being implemented along with the values of its associated types. In
this case, if we look at the impl of <code>Iterator</code> for <a href="https://doc.rust-lang.org/std/vec/struct.IntoIter.html">the <code>IntoIter</code> type</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">IntoIter</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We can translate this impl into a series of clauses sort of like this
(here, I&rsquo;ll use <a href="https://smallcultfollowing.com/babysteps/blog/2017/01/26/lowering-rust-traits-to-logic/#associated-types-and-type-equality">the notation I was using in my first post</a>):</p>
<pre tabindex="0"><code>// Define that `IntoIter&lt;T&gt;` implements `Iterator`,
// if `T` is `Sized` (the sized requirement is
// implicit in Rust syntax.)
Iterator(IntoIter&lt;T&gt;) :- Sized(T).

// Define that the `Item` for `IntoIter&lt;T&gt;`
// is `T` itself (but only if `IntoIter&lt;T&gt;`
// implements `Iterator`).
IteratorItem(IntoIter&lt;T&gt;, T) :- Iterator(IntoIter&lt;T&gt;).
</code></pre><p>So, to solve the normalization constraint <code>&lt;IntoIter&lt;i32&gt; as Iterator&gt;::Item ==&gt; ?X</code>, we translate that into the goal
<code>IteratorItem(IntoIter&lt;i32&gt;, ?X)</code>, and we try to prove that goal by
searching the applicable clauses. I sort of sketched out the procedure
<a href="https://smallcultfollowing.com/babysteps/blog/2017/01/26/lowering-rust-traits-to-logic/">in my first blog post</a>, but I&rsquo;ll present it in a bit more detail
here. The first step is to &ldquo;instantiate&rdquo; the clause by replacing
the variables (<code>T</code>, in this case) with fresh type variables.
This gives us a clause like:</p>
<pre><code>IteratorItem(IntoIter&lt;?T&gt;, ?T) :- Iterator(IntoIter&lt;?T&gt;).
</code></pre>
<p>Then we can unify the arguments of the clause with those of our goal, leading
to two unification equalities, and combine that with the conditions of the
clause itself, leading to three things we must prove:</p>
<pre><code>IntoIter&lt;?T&gt; = IntoIter&lt;i32&gt;
?T = ?X
Iterator(IntoIter&lt;?T&gt;)
</code></pre>
<p>Now we can recursively try to prove those things. To prove the
equalities, we apply the unification procedure we&rsquo;ve been looking
at. Processing the first equation, we can simplify because we have
<code>IntoIter</code> on both sides, so the type arguments must be equal:</p>
<pre><code>?T = i32 // changed this
?T = ?X
Iterator(IntoIter&lt;?T&gt;)
</code></pre>
<p>From there, we can deduce the value of <code>?T</code> and do some substitutions:</p>
<pre><code>i32 = ?X
Iterator(IntoIter&lt;i32&gt;)
</code></pre>
<p>We can now unify <code>?X</code> with <code>i32</code>, leaving us with:</p>
<pre><code>Iterator(IntoIter&lt;i32&gt;)
</code></pre>
<p>We can apply the clause <code>Iterator(IntoIter&lt;T&gt;) :- Sized(T)</code> using the same procedure now,
giving us two fresh goals:</p>
<pre><code>IntoIter&lt;i32&gt; = IntoIter&lt;?T&gt;
Sized(?T)
</code></pre>
<p>The first unification will yield (eventually):</p>
<pre><code>Sized(i32)
</code></pre>
<p>And we can prove this because this is a built-in rule for Rust (that is, that <code>i32</code> is sized).</p>
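<p>The whole walkthrough can be replayed in a few dozen lines of Rust. The sketch below hard-codes the two clauses and the built-in <code>Sized</code> rule as functions. It is a toy under heavy assumptions and not how chalk is implemented: there is no backtracking, a failed proof may leave partial bindings behind, and all of the names are invented for the example.</p>

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Var(u32),                     // ?X
    Apply(&'static str, Vec<Ty>), // N<T1, ..., Tn>
}

type Subst = HashMap<u32, Ty>;

/// Replaces every bound variable in `ty` by its binding.
fn resolve(ty: &Ty, subst: &Subst) -> Ty {
    match ty {
        Ty::Var(v) => match subst.get(v) {
            Some(bound) => resolve(bound, subst),
            None => ty.clone(),
        },
        Ty::Apply(name, args) => {
            Ty::Apply(*name, args.iter().map(|arg| resolve(arg, subst)).collect())
        }
    }
}

/// Plain unification over variables and applicative types (no occurs check).
fn unify(a: &Ty, b: &Ty, subst: &mut Subst) -> Result<(), String> {
    let (a, b) = (resolve(a, subst), resolve(b, subst));
    match (&a, &b) {
        (Ty::Var(x), Ty::Var(y)) if x == y => Ok(()),
        (Ty::Var(x), other) | (other, Ty::Var(x)) => {
            subst.insert(*x, other.clone());
            Ok(())
        }
        (Ty::Apply(n1, args1), Ty::Apply(n2, args2)) => {
            if n1 != n2 || args1.len() != args2.len() {
                return Err(format!("cannot unify {:?} with {:?}", a, b));
            }
            for (x, y) in args1.iter().zip(args2) {
                unify(x, y, subst)?;
            }
            Ok(())
        }
    }
}

/// Hands out a fresh inference variable, used to instantiate a clause.
fn fresh(next_var: &mut u32) -> Ty {
    let var = Ty::Var(*next_var);
    *next_var += 1;
    var
}

fn into_iter(arg: Ty) -> Ty {
    Ty::Apply("IntoIter", vec![arg])
}

/// Built-in rule; this sketch only knows that `i32` is sized.
fn prove_sized(ty: &Ty, subst: &Subst) -> bool {
    resolve(ty, subst) == Ty::Apply("i32", vec![])
}

/// Clause: Iterator(IntoIter<?T>) :- Sized(?T).
fn prove_iterator(self_ty: &Ty, subst: &mut Subst, next_var: &mut u32) -> bool {
    let t = fresh(next_var);
    unify(self_ty, &into_iter(t.clone()), subst).is_ok() && prove_sized(&t, subst)
}

/// Clause: IteratorItem(IntoIter<?T>, ?T) :- Iterator(IntoIter<?T>).
fn prove_iterator_item(self_ty: &Ty, item: &Ty, subst: &mut Subst, next_var: &mut u32) -> bool {
    let t = fresh(next_var);
    let head_self = into_iter(t.clone());
    unify(&head_self, self_ty, subst).is_ok()
        && unify(&t, item, subst).is_ok()
        && prove_iterator(&head_self, subst, next_var)
}
```

<p>Proving <code>IteratorItem(IntoIter&lt;i32&gt;, ?X)</code> with this code succeeds and leaves <code>?X</code> bound to <code>i32</code>, following the same sequence of steps as the walkthrough above.</p>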
<h3 id="unification-as-just-another-goal-to-prove">Unification as just another goal to prove</h3>
<p>As you can see in the walkthrough in the previous section, in a lot
of ways, unification is &ldquo;just another goal to prove&rdquo;. That is, the
basic way that chalk functions is that it has a goal it is trying to
prove and, at each step, it tries to simplify that goal into
subgoals. Often this takes place by consulting the clauses that we
derived from impls (or that are builtin), but in the case of equality
goals, the subgoals are constructed by the builtin unification
algorithm.</p>
<p>In the <a href="https://smallcultfollowing.com/babysteps/blog/2017/03/25/unification-in-chalk-part-1/">previous post</a>, I gave <a href="http://smallcultfollowing.com/babysteps/blog/2017/03/25/unification-in-chalk-part-1/#how-this-is-implemented">various pointers</a> into the
implementation showing how the unification code looks &ldquo;for real&rdquo;.
I want to extend that explanation now to cover associated types.</p>
<p>The way I presented things in the previous section, unification
flattens its subgoals into the master list of goals. But in fact, for
efficiency, the unification procedure will typically eagerly process
its own subgoals. So e.g. when we transform <code>IntoIter&lt;i32&gt; = IntoIter&lt;?T&gt;</code>, we actually just
<a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L107-L109">invoke the code to equate their arguments immediately</a>.</p>
<p>The one exception to this is normalization goals. In that case, we
push the goals into
<a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L41">a separate list that is returned to the caller</a>. The
reason for this is that, sometimes, we can&rsquo;t make progress on one of
those goals immediately (e.g., if it has unresolved type variables, a
situation we&rsquo;ve not discussed in detail yet). The caller can throw it
onto a list of pending goals and come back to it later.</p>
<p>Here are the various cases of interest that we&rsquo;ve covered so far:</p>
<ul>
<li><a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L115-L122">Equating a projection with a non-projection</a> will invoke <a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L161-L166"><code>unify_projection_ty</code></a> which just pushes a goal onto the output list. This covers both equating a type variable or an application type with a projection.</li>
<li><a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L111-L113">Equating two projections</a> will invoke <a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L161-L166"><code>unify_projection_tys</code></a> which creates the intermediate type variable. The reason for this is discussed shortly.</li>
</ul>
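<p>The two cases in the list above can be sketched as follows. This is a simplified model written for this post rather than an excerpt from chalk: the <code>Unifier</code> struct, its <code>pending</code> list and the helper <code>iterator_item</code> are assumptions of the example, and I am assuming that equating two projections normalizes both sides to the same fresh variable.</p>

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Var(u32),
    Apply(&'static str, Vec<Ty>),
    Projection(Box<ProjectionTy>),
}

/// `<self_ty as trait_name>::item_name`
#[derive(Clone, Debug, PartialEq)]
struct ProjectionTy {
    self_ty: Ty,
    trait_name: &'static str,
    item_name: &'static str,
}

/// A deferred goal `projection ==> ty`, handed back to the caller.
#[derive(Clone, Debug, PartialEq)]
struct Normalize {
    projection: ProjectionTy,
    ty: Ty,
}

struct Unifier {
    next_var: u32,
    bindings: HashMap<u32, Ty>,
    pending: Vec<Normalize>,
}

impl Unifier {
    /// `first_free_var` must exceed every variable the caller already uses.
    fn new(first_free_var: u32) -> Self {
        Unifier { next_var: first_free_var, bindings: HashMap::new(), pending: Vec::new() }
    }

    fn fresh(&mut self) -> Ty {
        let var = Ty::Var(self.next_var);
        self.next_var += 1;
        var
    }

    /// Follows bindings of a top-level variable.
    fn resolve(&self, ty: &Ty) -> Ty {
        match ty {
            Ty::Var(v) => match self.bindings.get(v) {
                Some(bound) => self.resolve(bound),
                None => ty.clone(),
            },
            _ => ty.clone(),
        }
    }

    /// Pushes a normalization goal instead of solving it right away.
    fn defer(&mut self, projection: &ProjectionTy, ty: Ty) {
        self.pending.push(Normalize { projection: projection.clone(), ty });
    }

    fn unify(&mut self, a: &Ty, b: &Ty) -> Result<(), String> {
        let (a, b) = (self.resolve(a), self.resolve(b));
        match (&a, &b) {
            (Ty::Var(x), Ty::Var(y)) if x == y => Ok(()),
            // Two projections: normalize both to one intermediate variable.
            (Ty::Projection(p1), Ty::Projection(p2)) => {
                let var = self.fresh();
                self.defer(p1, var.clone());
                self.defer(p2, var);
                Ok(())
            }
            // Projection vs. variable or applicative type: defer one goal.
            (Ty::Projection(p), other) | (other, Ty::Projection(p)) => {
                self.defer(p, other.clone());
                Ok(())
            }
            (Ty::Var(x), other) | (other, Ty::Var(x)) => {
                self.bindings.insert(*x, other.clone());
                Ok(())
            }
            // Applicative types are processed eagerly, argument by argument.
            (Ty::Apply(n1, args1), Ty::Apply(n2, args2)) => {
                if n1 != n2 || args1.len() != args2.len() {
                    return Err(format!("cannot unify {:?} with {:?}", a, b));
                }
                for (x, y) in args1.iter().zip(args2) {
                    self.unify(x, y)?;
                }
                Ok(())
            }
        }
    }
}

/// Builds `<self_ty as Iterator>::Item`.
fn iterator_item(self_ty: Ty) -> Ty {
    Ty::Projection(Box::new(ProjectionTy { self_ty, trait_name: "Iterator", item_name: "Item" }))
}
```

<p>After <code>unify</code> returns, the caller owns the <code>pending</code> list and can retry those goals whenever more type variables have been resolved.</p>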
<h3 id="fallback-for-projection">Fallback for projection</h3>
<p>Thus far we showed how projection proceeds in the &ldquo;successful&rdquo; case,
where we manage to normalize a projection type into a simpler type (in
this case, <code>&lt;IntoIter&lt;i32&gt; as Iterator&gt;::Item</code> into <code>i32</code>). But
sometimes, when working with generics, we <em>can&rsquo;t</em> normalize the
projection any further. For example, consider this simple function,
which extracts the first item from a non-empty iterator (it panics if
the iterator <em>is</em> empty):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">first</span><span class="o">&lt;</span><span class="n">I</span>: <span class="nb">Iterator</span><span class="o">&gt;</span><span class="p">(</span><span class="n">iter</span>: <span class="nc">I</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">I</span>::<span class="n">Item</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">iter</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="n">expect</span><span class="p">(</span><span class="s">&#34;iterator should not be empty&#34;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What&rsquo;s interesting here is that we don&rsquo;t know what <code>I::Item</code> is. So imagine
we are given a normalization constraint like this one:</p>
<pre><code>&lt;I as Iterator&gt;::Item ==&gt; ?X
</code></pre>
<p>What type should we use for <code>?X</code> here? What chalk opts to do in cases
like this is to construct a sort of special &ldquo;applicative&rdquo; type
representing the associated item projection. I will write it as
<code>&lt;Iterator::Item&gt;&lt;I&gt;</code>, for now, but there is no real Rust syntax for
this.  It basically represents &ldquo;a projection that we could not
normalize further&rdquo;. You could consider it as a separate item in the
grammar for types, except that it&rsquo;s not really semantically different
from a projection; it&rsquo;s just a way for us to guide the chalk solver.</p>
<p>The way I think of it, there are two rules for proving that a
projection type is equal. The first one is that we can prove it via
normalization, as we&rsquo;ve already seen:</p>
<pre><code>IteratorItem(T, X)
-------------------------
&lt;T as Iterator&gt;::Item = X
</code></pre>
<p>The second is that we can prove it just by having all the <em>inputs</em> be equal:</p>
<pre><code>T = U
---------------------------------------------
&lt;T as Iterator&gt;::Item = &lt;U as Iterator&gt;::Item
</code></pre>
<p>We&rsquo;d prefer to use the normalization route, because it is more
flexible (i.e., it&rsquo;s sufficient for <code>T</code> and <code>U</code> to be equal, but not
necessary). But if we can definitively show that the normalization
route is impossible (i.e., we have no clauses that we can use to
normalize), then we opt for this more restrictive route. The
special &ldquo;applicative&rdquo; type is a way for chalk to record (internally)
that for this projection, it opted for the more restrictive route,
because the first one was impossible.</p>
<p>(In general, we&rsquo;re starting to touch on Chalk&rsquo;s proof search strategy,
which is rather different from Prolog, but beyond the scope of this
particular blog post.)</p>
<h3 id="some-examples-of-the-fallback-in-action">Some examples of the fallback in action</h3>
<p>In the <code>first()</code> function we saw before, we will wind up computing
the result type of <code>next()</code> as <code>&lt;I as Iterator&gt;::Item</code>. This will be
returned, so at some point we will want to prove that this type
is equal to the return type of the function (actually, we want to prove
subtyping, but for this particular type those are the same thing, so I&rsquo;ll
gloss over that for now). This corresponds to a goal like the following
(here I am using <a href="http://smallcultfollowing.com/babysteps/blog/2017/01/26/lowering-rust-traits-to-logic/#type-checking-generic-functions-beyond-horn-clauses">the notation I discussed in my first post for universal
quantification etc</a>):</p>
<pre><code>forall&lt;I&gt; {
    if (Iterator(I)) {
        &lt;I as Iterator&gt;::Item = &lt;I as Iterator&gt;::Item
    }
}
</code></pre>
<p>Per the rules we gave earlier, we will process this constraint by introducing
a fresh type variable and normalizing both sides to the same thing:</p>
<pre><code>forall&lt;I&gt; {
    if (Iterator(I)) {
        exists&lt;?T&gt; {
            &lt;I as Iterator&gt;::Item ==&gt; ?T,
            &lt;I as Iterator&gt;::Item ==&gt; ?T,
        }
    }
}
</code></pre>
<p>In this case, both constraints will wind up resulting in <code>?T</code> being
the special applicative type <code>&lt;Iterator::Item&gt;&lt;I&gt;</code>, so everything
works out successfully.</p>
<p>Let&rsquo;s briefly look at an illegal function and see what happens here.
In this case, we have two iterator types (<code>I</code> and <code>J</code>) and we&rsquo;ve
used the wrong one in the return type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">first</span><span class="o">&lt;</span><span class="n">I</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w"> </span><span class="n">J</span>: <span class="nb">Iterator</span><span class="o">&gt;</span><span class="p">(</span><span class="n">iter_i</span>: <span class="nc">I</span><span class="p">,</span><span class="w"> </span><span class="n">iter_j</span>: <span class="nc">J</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">J</span>::<span class="n">Item</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">iter_i</span><span class="p">.</span><span class="n">next</span><span class="p">().</span><span class="n">expect</span><span class="p">(</span><span class="s">&#34;iterator should not be empty&#34;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This will result in a goal like:</p>
<pre><code>forall&lt;I, J&gt; {
    if (Iterator(I), Iterator(J)) {
        &lt;I as Iterator&gt;::Item = &lt;J as Iterator&gt;::Item
    }
}
</code></pre>
<p>Which will again be normalized and transformed as follows:</p>
<pre><code>forall&lt;I, J&gt; {
    if (Iterator(I), Iterator(J)) {
        exists&lt;?T&gt; {
            &lt;I as Iterator&gt;::Item ==&gt; ?T,
            &lt;J as Iterator&gt;::Item ==&gt; ?T,
        }
    }
}
</code></pre>
<p>Here, the difference is that normalizing <code>&lt;I as Iterator&gt;::Item</code> results in
<code>&lt;Iterator::Item&gt;&lt;I&gt;</code>, but normalizing <code>&lt;J as Iterator&gt;::Item</code> results in
<code>&lt;Iterator::Item&gt;&lt;J&gt;</code>. Since both of those are equated with <code>?T</code>, we will
ultimately wind up with a unification problem like:</p>
<pre><code>forall&lt;I, J&gt; {
    if (Iterator(I), Iterator(J)) {
        &lt;Iterator::Item&gt;&lt;I&gt; = &lt;Iterator::Item&gt;&lt;J&gt;
    }
}
</code></pre>
<p>Following our usual rules, we can handle the equality of two
applicative types by equating their arguments, so after that we get
<code>forall&lt;I, J&gt; I = J</code> &ndash; and this clearly cannot be proven. So we get
an error.</p>
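<p>To make the fallback concrete, here is a minimal sketch of the two routes. It is not chalk&rsquo;s actual code: the names <code>Ty</code>, <code>ItemClause</code>, <code>normalize_item</code>, and <code>projections_equal</code> are hypothetical, there are no inference variables, and a clause is matched by exact equality rather than by proof search. Under those assumptions, normalizing either finds a clause or falls back to the special applicative type, and two projections are then equal only if their normalized forms are structurally equal.</p>

```rust
/// A tiny type representation, assumed for illustration only.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    /// Ordinary applicative type `N<T1..Tn>`; placeholders like `I` have zero arguments.
    App(String, Vec<Ty>),
    /// The special applicative type written `<Iterator::Item><T>` in the text:
    /// a projection that could not be normalized any further.
    UnnormalizedItem(Box<Ty>),
}

/// A known fact of the form `<self_ty as Iterator>::Item ==> item` (hypothetical).
struct ItemClause {
    self_ty: Ty,
    item: Ty,
}

/// Convenience constructor for applicative types.
fn app(name: &str, args: Vec<Ty>) -> Ty {
    Ty::App(name.to_string(), args)
}

/// Normalize `<self_ty as Iterator>::Item`.
/// Route 1: use a matching clause if one exists.
/// Route 2: otherwise fall back to the special applicative type.
fn normalize_item(self_ty: &Ty, clauses: &[ItemClause]) -> Ty {
    clauses
        .iter()
        .find(|clause| &clause.self_ty == self_ty)
        .map(|clause| clause.item.clone())
        .unwrap_or_else(|| Ty::UnnormalizedItem(Box::new(self_ty.clone())))
}

/// Equate two `Iterator::Item` projections by normalizing both sides to the
/// same (implicit) variable, i.e. by comparing the normalized results.
fn projections_equal(lhs_self: &Ty, rhs_self: &Ty, clauses: &[ItemClause]) -> bool {
    normalize_item(lhs_self, clauses) == normalize_item(rhs_self, clauses)
}
```

<p>With a single clause for <code>IntoIter&lt;i32&gt;</code>, this sketch normalizes that projection to <code>i32</code>, accepts <code>I::Item = I::Item</code>, and rejects <code>I::Item = J::Item</code>, because the two fallback types have different arguments.</p>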
<h3 id="termination-after-a-fashion">Termination, after a fashion</h3>
<p>One final note, on termination. We do not, in general, guarantee
termination of the unification process once associated types are
involved. <a href="https://sdleffler.github.io/RustTypeSystemTuringComplete/">Rust&rsquo;s trait matching is Turing complete</a>, after all.
However, we <em>do</em> wish to ensure that our own unification algorithms
don&rsquo;t introduce problems of their own!</p>
<p>The non-projection parts of unification have a pretty clear argument
for termination: each time we remove a constraint, we replace it with
(at most) simpler constraints that were all embedded in the original
constraint.  So types keep getting smaller, and since they are not
infinite, we must stop sometime.</p>
<p>This argument is not sufficient for projections. After all, we replace
a constraint like <code>&lt;T as Iterator&gt;::Item = U</code> with an equivalent
normalization constraint, where all the types are the same:</p>
<pre><code>&lt;T as Iterator&gt;::Item ==&gt; U
</code></pre>
<p>The argument for termination then is that normalization, if it
terminates, will unify <code>U</code> with an applicative type. Moreover, we only
instantiate type variables with normalized types. Now, these
applicative types might be the special applicative types that Chalk
uses internally (e.g., <code>&lt;Iterator::Item&gt;&lt;T&gt;</code>), but it&rsquo;s an applicative
type nonetheless. When that <em>applicative</em> type is processed later, it
will therefore be broken down into smaller pieces (per the prior
argument). That&rsquo;s the rough idea, anyway.</p>
<h3 id="contrast-with-rustc">Contrast with rustc</h3>
<p>I tend to call the normalization scheme that chalk uses <strong>lazy</strong>
normalization.  This is because we don&rsquo;t normalize until we are
actually equating a projection with some other type. In contrast,
rustc uses an <strong>eager</strong> strategy, where we normalize types as soon as
we &ldquo;instantiate&rdquo; them (e.g., when we took a clause and replaced its
type parameters with fresh type variables).</p>
<p>The eager strategy has a number of downsides, not the least of which
is that it is very easy to forget to normalize something when you were
supposed to (and sometimes you wind up with a mix of normalized and
unnormalized things).</p>
<p>In rustc, we only have one way to represent projections (i.e., we
don&rsquo;t distinguish the &ldquo;projection&rdquo; and &ldquo;applicative&rdquo; version of
<code>&lt;Iterator::Item&gt;&lt;T&gt;</code>). The distinction between an unnormalized <code>&lt;T as Iterator&gt;::Item</code> and one that we failed to normalize further is made
simply by knowing (in the code) whether we&rsquo;ve tried to normalize the
type in question or not &ndash; the unification routines, in particular,
always assume that a projection type implies that normalization
wouldn&rsquo;t succeed.</p>
<h3 id="a-note-on-terminology">A note on terminology</h3>
<p>I&rsquo;m not especially happy with the &ldquo;projection&rdquo; and &ldquo;applicative&rdquo;
terminology I&rsquo;ve been using. It&rsquo;s what Chalk uses, but it&rsquo;s kind of
nonsense &ndash; for example, both <code>&lt;T as Iterator&gt;::Item</code> and <code>Vec&lt;T&gt;</code> are
&ldquo;applications&rdquo; of a type function, from a certain perspective. I&rsquo;m not
sure what&rsquo;s a better choice though. Perhaps just &ldquo;unnormalized&rdquo; and
&ldquo;normalized&rdquo; (with types like <code>Vec&lt;T&gt;</code> always being immediately
considered normalized). Suggestions welcome.</p>
<h3 id="conclusion">Conclusion</h3>
<p>I&rsquo;ve sketched out how associated type normalization works in chalk and
how it compares to rustc. I&rsquo;d like to change rustc over to this
strategy, and plan to open up an issue soon describing a
plan. I&rsquo;ll post a link to it in the <a href="https://internals.rust-lang.org/t/blog-series-lowering-rust-traits-to-logic/4673">internals comment thread</a>
once I do.</p>
<p>There are other interesting directions we could go with associated
type equality. For example, I was pursuing for some time a strategy
based on congruence closure, and even implemented (in <a href="https://crates.io/crates/ena">ena</a>)
<a href="http://www.alice.virginia.edu/~weimer/2011-6610/reading/nelson-oppen-congruence.pdf">an extended version of the algorithm described here</a>. However,
I&rsquo;ve not been able to figure out how to combine congruence closure
with things like implication goals &ndash; it seems to get quite
complicated. I understand that there are papers tackling this topic
(e.g., <a href="https://arxiv.org/pdf/1701.04391.pdf">Selsam and de Moura</a>), but haven&rsquo;t yet had time to read
them.</p>
<h3 id="comments">Comments?</h3>
<p>I&rsquo;ll be monitoring <a href="https://internals.rust-lang.org/t/blog-series-lowering-rust-traits-to-logic/4673">the internals thread</a> for comments and discussion. =)</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Projection is a very common bit of jargon in PL circles, though it typically refers to accessing a field, not a type. As far as I can tell, no mainstream programmer uses it. Ah well, I&rsquo;m not aware of a good replacement.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/><category scheme="https://smallcultfollowing.com/babysteps/categories/chalk" term="chalk" label="Chalk"/><category scheme="https://smallcultfollowing.com/babysteps/categories/pl" term="pl" label="PL"/></entry><entry><title type="html">Rayon 0.7 released</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/04/06/rayon-0-7-released/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/04/06/rayon-0-7-released/</id><published>2017-04-06T00:00:00+00:00</published><updated>2017-04-06T00:00:00+00:00</updated><content type="html"><![CDATA[<p>We just released Rayon 0.7. This is a pretty exciting release, because
it marks the official first step towards Rayon 1.0. In addition, it
marks the first release where Rayon&rsquo;s parallel iterators reach
&ldquo;feature parity&rdquo; with the standard sequential iterators! To mark the
moment, I thought I&rsquo;d post the release notes here on the blog:</p>
<hr>
<p>This release marks the first step towards Rayon 1.0. <strong>For best
performance, it is important that all Rayon users update to at least
Rayon 0.7.</strong> This is because, as of Rayon 0.7, we have taken steps to
ensure that, no matter how many versions of rayon are actively in use,
there will only be a single global scheduler. This is achieved via the
<code>rayon-core</code> crate, which is being released at version 1.0, and which
encapsulates the core schedule APIs like <code>join()</code>. (Note: the
<code>rayon-core</code> crate is, to some degree, an implementation detail, and
not intended to be imported directly; its entire API surface is
mirrored through the rayon crate.)</p>
<p>We have also done a lot of work reorganizing the API for Rayon 0.7 in
preparation for 1.0. The names of iterator types have been changed and
reorganized (but few users are expected to be naming those types
explicitly anyhow). In addition, a number of parallel iterator methods
have been adjusted to match those in the standard iterator traits more
closely. See the &ldquo;Breaking Changes&rdquo; section below for
details.</p>
<p>Finally, Rayon 0.7 includes a number of new features and new parallel
iterator methods. <strong>As of this release, Rayon&rsquo;s parallel iterators
have officially reached parity with sequential iterators</strong> &ndash; that is,
every sequential iterator method that makes any sense in parallel is
supported in some capacity.</p>
<h3 id="new-features-and-methods">New features and methods</h3>
<ul>
<li>The internal <code>Producer</code> trait now features <code>fold_with</code>, which enables
better performance for some parallel iterators.</li>
<li>Strings now support <code>par_split()</code> and <code>par_split_whitespace()</code>.</li>
<li>The <code>Configuration</code> API is expanded and simplified:
<ul>
<li><code>num_threads(0)</code> no longer triggers an error</li>
<li>you can now supply a closure to name the Rayon threads that get created
by using <code>Configuration::thread_name</code>.</li>
<li>you can now inject code when Rayon threads start up and finish</li>
<li>you can now set a custom panic handler to handle panics in various odd situations</li>
</ul>
</li>
<li>Threadpools are now able to more gracefully put threads to sleep when not needed.</li>
<li>Parallel iterators now support <code>find_first()</code>, <code>find_last()</code>, <code>position_first()</code>,
and <code>position_last()</code>.</li>
<li>Parallel iterators now support <code>rev()</code>, which primarily affects subsequent calls
to <code>enumerate()</code>.</li>
<li>The <code>scope()</code> API is now considered stable (and part of <code>rayon-core</code>).</li>
<li>There is now a useful <code>rayon::split</code> function for creating custom
Rayon parallel iterators.</li>
<li>Parallel iterators now allow you to customize the min/max number of
items to be processed in a given thread. This mechanism replaces the
older <code>weight</code> mechanism, which is deprecated.</li>
<li><code>sum()</code> and friends now use the standard <code>Sum</code> traits</li>
</ul>
<h3 id="breaking-changes">Breaking changes</h3>
<p>In the move towards 1.0, there have been a number of minor breaking changes:</p>
<ul>
<li>Configuration setters like <code>Configuration::set_num_threads()</code> lost the <code>set_</code> prefix,
and hence become something like <code>Configuration::num_threads()</code>.</li>
<li><code>Configuration</code> getters are removed</li>
<li>Iterator types have been shuffled around and exposed more consistently:
<ul>
<li>combinator types live in <code>rayon::iter</code>, e.g. <code>rayon::iter::Filter</code></li>
<li>iterators over various types live in a module named after their type,
e.g. <code>rayon::slice::Windows</code></li>
</ul>
</li>
<li>When doing a <code>sum()</code> or <code>product()</code>, type annotations are needed for the result
since it is now possible to have the resulting sum be of a type other than the value
you are iterating over (this mirrors sequential iterators).</li>
</ul>
<h3 id="experimental-features">Experimental features</h3>
<p>Experimental features require the use of the <code>unstable</code> feature. Their
APIs may change or disappear entirely in future releases (even minor
releases) and hence they should be avoided for production code.</p>
<ul>
<li>We now have (unstable) support for futures integration. You can use
<code>Scope::spawn_future</code> or <code>rayon::spawn_future_async()</code>.</li>
<li>There is now a <code>rayon::spawn_async()</code> function for using the Rayon
threadpool to run tasks that do not have references to the stack.</li>
</ul>
<h3 id="contributors">Contributors</h3>
<p>Thanks to the following people for their contributions to this release:</p>
<ul>
<li>@Aaronepower</li>
<li>@ChristopherDavenport</li>
<li>@bluss</li>
<li>@cuviper</li>
<li>@froydnj</li>
<li>@gaurikholkar</li>
<li>@hniksic</li>
<li>@leodasvacas</li>
<li>@leshow</li>
<li>@martinhath</li>
<li>@mbrubeck</li>
<li>@nikomatsakis</li>
<li>@pegomes</li>
<li>@schuster</li>
<li>@torkleyy</li>
</ul>
]]></content></entry><entry><title type="html">Unification in Chalk, part 1</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/03/25/unification-in-chalk-part-1/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/03/25/unification-in-chalk-part-1/</id><published>2017-03-25T00:00:00+00:00</published><updated>2017-03-25T00:00:00+00:00</updated><content type="html"><![CDATA[<p>So in <a href="https://smallcultfollowing.com/babysteps/blog/2017/01/26/lowering-rust-traits-to-logic/">my first post</a> on <a href="https://github.com/nikomatsakis/chalk/">chalk</a>, I mentioned that unification and
normalization of associated types were interesting topics. I&rsquo;m going
to write a two-part blog post series covering that.  This first part
begins with an overview of how ordinary type unification works during
compilation. The next post will add in associated types and we can see
what kinds of mischief they bring with them.</p>
<h3 id="what-is-unification">What is unification?</h3>
<p>Let&rsquo;s start with a brief overview of what unification is. When you are
doing type-checking or trait-checking, it often happens that you wind
up with types that you don&rsquo;t know yet. For example, the user might
write <code>None</code> &ndash; you know that this has type <code>Option&lt;T&gt;</code>, but you don&rsquo;t
know what that type <code>T</code> is. To handle this, the compiler will create a
<strong>type variable</strong>. This basically represents an unknown,
to-be-determined type. To denote this, I&rsquo;ll write <code>Option&lt;?T&gt;</code>, where
the leading question mark indicates a variable.</p>
<p>The idea then is that as we go about type-checking we will later find
out some constraints that tell us what <code>?T</code> has to be. For example,
imagine that we know that <code>Option&lt;?T&gt;</code> must implement <code>Foo</code>, and we
have a trait <code>Foo</code> that is implemented only for <code>Option&lt;String&gt;</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In order for this impl to apply, it must be the case that the self
types are <strong>equal</strong>, i.e., the same type. (Note that trait matching
never considers subtyping.) We write this as a constraint:</p>
<pre><code>Option&lt;?T&gt; = Option&lt;String&gt;
</code></pre>
<p>Now you can probably see where this is going. Eventually, we&rsquo;re going
to figure out that <code>?T</code> must be <code>String</code>. But it&rsquo;s not <strong>immediately</strong>
obvious &ndash; all we see right now is that two <code>Option</code> types have to be
equal. In particular, we don&rsquo;t yet have a simple constraint like <code>?T = String</code>. To arrive at that, we have to do <strong>unification</strong>.</p>
<h3 id="basic-unification">Basic unification</h3>
<p>So, to restate the previous section in mildly more formal terms, the
idea with unification is that we have:</p>
<ul>
<li>a bunch of <strong>type variables</strong> like <code>?T</code>. We often call these
<strong>existential type variables</strong> because, when you look at things in a
logical setting, they arise from asking questions like <code>exists ?T. (Option&lt;String&gt; = Option&lt;?T&gt;)</code> &ndash; i.e., does there exist a type
<code>?T</code> that can make <code>Option&lt;String&gt;</code> equal to
<code>Option&lt;?T&gt;</code>.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></li>
<li>a bunch of <strong>unification constraints</strong> <code>U1..Un</code> like <code>T1 = T2</code>, where <code>T1</code>
and <code>T2</code> are types.  These are equalities that we know have to be
true.</li>
</ul>
<p>We would like to process these unification constraints and get to
one of two outcomes:</p>
<ul>
<li>the unification cannot be solved (e.g., <code>u32 = i32</code> just can&rsquo;t be true);</li>
<li>we&rsquo;ve got a <strong>substitution</strong> (mapping) from type variables to their
values (e.g., <code>?T =&gt; String</code>) that makes all of the unification
constraints hold.</li>
</ul>
<p>Let&rsquo;s start out with a really simple type system where we only have
two kinds of types (in particular, we don&rsquo;t yet have associated types):</p>
<pre tabindex="0"><code>T = ?X             // type variables
  | N&lt;T1, ..., Tn&gt; // &#34;applicative&#34; types
</code></pre><p>The first kind of type is type variables, as we&rsquo;ve seen. The second
kind of type I am calling &ldquo;applicative&rdquo; types, which is really not a
great name, but that&rsquo;s what I called it in chalk for whatever reason.
Anyway they correspond to types like <code>Option&lt;T&gt;</code>, <code>Vec&lt;T&gt;</code>, and even
types like <code>i32</code>. Here the name <code>N</code> is the <strong>name</strong> of the type (i.e.,
<code>Option</code>, <code>Vec</code>, <code>i32</code>) and the type parameters <code>T1...Tn</code> represent
the type parameters of the type. Note that there may be zero of them
(as is the case for <code>i32</code>, which is kind of &ldquo;shorthand&rdquo; for <code>i32&lt;&gt;</code>).</p>
<p>So the idea for unification then is that we start out with an empty
substitution <code>S</code> and we have this list of unification constraints
<code>U1..Un</code>. We want to pop off the first constraint (<code>U1</code>) and figure
out what to do based on what category it falls into. At each step, we
may update our substitution <code>S</code> (i.e., we may figure out the value of
a variable). In that case, we&rsquo;ll replace the variable with its value
for all the later steps. Other times, we&rsquo;ll create new, simpler
unification problems.</p>
<ul>
<li><code>?X = ?Y</code> &ndash; if <code>U</code> equates two variables together, we can replace
one variable with the other, so we add <code>?X =&gt; ?Y</code> to our
substitution, and then we replace all remaining uses of <code>?X</code> with
<code>?Y</code>.</li>
<li><code>?X = N&lt;T1..Tn&gt;</code> &ndash; if we see a type variable equated with an
applicative type, we can add <code>?X =&gt; N&lt;T1..Tn&gt;</code> to our substitution
(and replace all uses of it). But there is a catch &ndash; we have to do
one check first, called the <strong>occurs check</strong>, which I&rsquo;ll describe
later on.</li>
<li><code>N&lt;X1..Xn&gt; = N&lt;Y1..Yn&gt;</code> &ndash; if we see two applicative types with the
same name being equated, we can convert that into a bunch of smaller
unification problems like <code>X1 = Y1</code>, <code>X2 = Y2</code>, &hellip;, <code>Xn = Yn</code>. The
idea here is that <code>Option&lt;Foo&gt; = Option&lt;Bar&gt;</code> is true if <code>Foo = Bar</code> is
true; so we can convert the bigger problem into the smaller one, and
then forget about the bigger one.</li>
<li><code>N&lt;...&gt; = M&lt;...&gt; where N != M</code> &ndash; if we see two applicative
types being equated, but their names are different, that&rsquo;s just an
error. This would be something like <code>Option&lt;T&gt; = Vec&lt;T&gt;</code>.</li>
</ul>
<p>OK, let&rsquo;s try to apply those rules to our example. Remember that we
had one variable (<code>?T</code>) and one unification problem (<code>Option&lt;?T&gt; = Option&lt;String&gt;</code>). We start with an initial state like this:</p>
<pre><code>S = [] // empty substitution
U = [Option&lt;?T&gt; = Option&lt;String&gt;] // one constraint
</code></pre>
<p>The head constraint consists of two applicative types with the same
name (<code>Option</code>), so we can convert that into a simpler equation,
reaching this state:</p>
<pre><code>S = [] // empty substitution
U = [?T = String] // one constraint
</code></pre>
<p>Now the next constraint is of the kind <code>?T = String</code>, so we can update
our substitution. In this case, there are no more constraints, but if
there were, we would replace any uses of <code>?T</code> in those constraints
with <code>String</code>:</p>
<pre><code>S = [?T =&gt; String] // one mapping
U = [] // zero constraints
</code></pre>
<p>Since there are no more constraints left, we&rsquo;re done! We found a
solution.</p>
<p>Let&rsquo;s do another example. This one is a bit more interesting.
Imagine that we had two variables (<code>?T</code> and <code>?U</code>) and this
initial state:</p>
<pre><code>S = []
U = [(?T, u32) = (i32, ?U),
     Option&lt;?T&gt; = Option&lt;?U&gt;]
</code></pre>
<p>The first constraint is unifying two tuples &ndash; you can think of a
tuple as an applicative type, so <code>(?T, u32)</code> is kind of like
<code>Tuple2&lt;?T, u32&gt;</code>. Hence, we will simplify the first equation
into two smaller ones:</p>
<pre><code>// After unifying (?T, u32) = (i32, ?U)
S = []
U = [?T = i32,
     ?U = u32,
     Option&lt;?T&gt; = Option&lt;?U&gt;]
</code></pre>
<p>To process the next equation <code>?T = i32</code>, we just update the
substitution. We also replace <code>?T</code> in the remaining problems
with <code>i32</code>, leaving us with this state:</p>
<pre><code>// After unifying ?T = i32
S = [?T =&gt; i32]
U = [?U = u32,
     Option&lt;i32&gt; = Option&lt;?U&gt;]
</code></pre>
<p>We can do the same for <code>?U</code>:</p>
<pre><code>// After unifying ?U = u32
S = [?T =&gt; i32, ?U =&gt; u32]
U = [Option&lt;i32&gt; = Option&lt;u32&gt;]
</code></pre>
<p>Now we, as humans, see that this problem is going to wind up
with an error, but the compiler isn&rsquo;t that smart yet. It has
to first break down the remaining unification problem by
one more step:</p>
<pre><code>// After unifying Option&lt;i32&gt; = Option&lt;u32&gt;
S = [?T =&gt; i32, ?U =&gt; u32]
U = [i32 = u32]             // --&gt; Error!
</code></pre>
<p>And now we get an error, because we have two applicative types with
different names (<code>i32</code> vs <code>u32</code>).</p>
<h3 id="the-occurs-check-preventing-infinite-types">The occurs check: preventing infinite types</h3>
<p>When describing the unification procedure, I left out one little bit,
but it is kind of important. When we have a unification constraint
like <code>?X = T</code> for some type <code>T</code>, we can&rsquo;t just <strong>immediately</strong> add <code>?X =&gt; T</code> to our substitution. We have to first check and make sure that
<code>?X</code> does not appear in <code>T</code>; if it does, that&rsquo;s also an error. In
other words, we would consider a unification constraint like this to
be illegal:</p>
<pre><code>?X = Option&lt;?X&gt;
</code></pre>
<p>The problem here is that this results in an infinitely big type. And I
don&rsquo;t mean a type that occupies an infinite amount of RAM on your
computer (although that may be true). I mean a type that I can&rsquo;t even
write down. Like if I tried to write down a type that satisfies this
equation, it would look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="w"> </span><span class="cm">/* ad infinitum */</span><span class="w"> </span><span class="o">&gt;&gt;&gt;&gt;</span><span class="w">
</span></span></span></code></pre></div><p>We don&rsquo;t want types like that, they cause all manner of mischief
(think non-terminating compilations). We already know that no such
type arises from our input program (because it has finite size, and it
contains all the types in textual form). But they can arise through
inference if we&rsquo;re not careful. So we prevent them by saying that
whenever we unify a variable <code>?X</code> with some value <code>T</code>, then <code>?X</code>
cannot <strong>occur</strong> in <code>T</code> (hence the name &ldquo;occurs check&rdquo;).</p>
<p>Here is an example Rust program where this could arise:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="p">;</span><span class="w">    </span><span class="c1">// x has type ?X
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">None</span><span class="p">;</span><span class="w">     </span><span class="c1">// adds constraint: ?X = Option&lt;?Y&gt;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">  </span><span class="c1">// adds constraint: ?X = Option&lt;?X&gt;
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And indeed if you
<a href="https://is.gd/pc0D6E">try this example on the playpen</a>, you will get
&ldquo;cyclic type of infinite size&rdquo; as an error.</p>
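<p>To make the occurs check concrete, here is a minimal sketch in Rust. It is not chalk&rsquo;s implementation: the <code>Ty</code> enum, the <code>Subst</code> map, and the function names are all hypothetical, and the substitution is a plain <code>HashMap</code> rather than a unification table.</p>

```rust
use std::collections::HashMap;

/// A tiny type language: inference variables (`?X`) and applications like `Option<T>`.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Var(u32),
    Apply(&'static str, Vec<Ty>),
}

/// Maps each bound inference variable to the type it stands for.
type Subst = HashMap<u32, Ty>;

/// Follow bindings at the top level until we reach an unbound variable or an application.
fn resolve(subst: &Subst, ty: &Ty) -> Ty {
    match ty {
        Ty::Var(v) => match subst.get(v) {
            Some(bound) => resolve(subst, bound),
            None => ty.clone(),
        },
        Ty::Apply(..) => ty.clone(),
    }
}

/// The occurs check: does variable `var` appear anywhere inside `ty`?
fn occurs(subst: &Subst, var: u32, ty: &Ty) -> bool {
    match resolve(subst, ty) {
        Ty::Var(v) => v == var,
        Ty::Apply(_, args) => args.iter().any(|a| occurs(subst, var, a)),
    }
}

/// Unify two types, extending `subst`; the error is a human-readable message.
fn unify(subst: &mut Subst, a: &Ty, b: &Ty) -> Result<(), String> {
    match (resolve(subst, a), resolve(subst, b)) {
        (Ty::Var(x), Ty::Var(y)) if x == y => Ok(()),
        (Ty::Var(x), t) | (t, Ty::Var(x)) => {
            // Refuse to bind `?X => T` when `?X` occurs in `T`.
            if occurs(subst, x, &t) {
                return Err(format!("cyclic type: ?{} occurs in {:?}", x, t));
            }
            subst.insert(x, t);
            Ok(())
        }
        (Ty::Apply(n1, args1), Ty::Apply(n2, args2)) => {
            if n1 != n2 || args1.len() != args2.len() {
                return Err(format!("mismatched types {} and {}", n1, n2));
            }
            for (x, y) in args1.iter().zip(args2.iter()) {
                unify(subst, x, y)?;
            }
            Ok(())
        }
    }
}
```

<p>Replaying the program above with this sketch, unifying <code>?0</code> with <code>Option&lt;?1&gt;</code> succeeds, and then unifying <code>?0</code> with <code>Option&lt;?0&gt;</code> is rejected by the occurs check.</p>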
<h3 id="how-this-is-implemented">How this is implemented</h3>
<p>In terms of how this algorithm is typically <strong>implemented</strong>, it&rsquo;s
quite a bit different than how I presented it here. For example, the
&ldquo;substitution&rdquo; is usually implemented through a mutable unification
table, which uses <a href="https://en.wikipedia.org/wiki/Disjoint-set_data_structure">Tarjan&rsquo;s Union-Find algorithm</a> (there are a
<a href="https://crates.io/search?q=union%20find">number of implementations</a> available on crates.io); the set
of unification constraints is not necessarily created as an explicit
vector, but just through recursive calls to a <code>unify</code> procedure.  The
relevant code in chalk, if you are curious, can be
<a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs">found here</a>.</p>
<p>The
<a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L69">main procedure is <code>unify_ty_ty</code></a>,
which unifies two types. It
<a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L71-L75">begins by normalizing them</a>,
which corresponds to applying the substitution that we have built up
so far. It then analyzes the various cases in roughly the way we&rsquo;ve
described (ignoring the cases we haven&rsquo;t talked about yet, like
higher-ranked types or associated types):</p>
<ul>
<li><a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L81">Equating two variables unifies the variables.</a>
You see that <a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L87">updating the unification table</a> corresponds to modifying
our substitution.</li>
<li><a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L91-L92">Equating a variable and an applicative type</a>
does the
<a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L207">&ldquo;occurs check&rdquo;</a>
and
<a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L209">updates the unification table</a>.</li>
<li><a href="https://github.com/nikomatsakis/chalk/blob/6a7bb25402987421d93d02bda3f5d79bf878812c/src/solve/infer/unify.rs#L107">Equating two applicative types recursively equates their arguments</a> (in this case by using the <a href="https://github.com/nikomatsakis/chalk/blob/master/src/zip.rs#L25-L27">helper trait <code>Zip</code></a>).</li>
</ul>
<p>(Note: these links are fixed to the head commit in chalk as of the
time of this writing; that code may be quite out of date by the time
you read this, of course.)</p>
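<p>For readers who have not seen a union-find based unification table, here is a rough sketch of the idea. It is an illustration under simplifying assumptions, not the code from chalk or any crate: the <code>UnificationTable</code> type and its methods are hypothetical, types are represented only by a name, and there is no occurs check.</p>

```rust
/// A minimal union-find ("disjoint set") table over inference variables.
/// Each variable is an index; `parent[i] == i` marks a root.
struct UnificationTable {
    parent: Vec<usize>,
    /// For roots only: the type the whole set is bound to, if any (just a name, for brevity).
    value: Vec<Option<&'static str>>,
}

impl UnificationTable {
    fn new() -> Self {
        UnificationTable { parent: Vec::new(), value: Vec::new() }
    }

    /// Create a fresh, unbound inference variable.
    fn new_var(&mut self) -> usize {
        let v = self.parent.len();
        self.parent.push(v);
        self.value.push(None);
        v
    }

    /// Find the root of `v`, compressing the path as we go.
    fn find(&mut self, v: usize) -> usize {
        let p = self.parent[v];
        if p == v {
            return v;
        }
        let root = self.find(p);
        self.parent[v] = root;
        root
    }

    /// Equate two variables; fails if they are already bound to different types.
    fn unify_var_var(&mut self, a: usize, b: usize) -> Result<(), String> {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra == rb {
            return Ok(());
        }
        let merged = match (self.value[ra], self.value[rb]) {
            (Some(x), Some(y)) if x != y => return Err(format!("{} != {}", x, y)),
            (x, y) => x.or(y),
        };
        self.parent[ra] = rb;
        self.value[rb] = merged;
        Ok(())
    }

    /// Bind a variable to a concrete type.
    fn unify_var_value(&mut self, v: usize, ty: &'static str) -> Result<(), String> {
        let r = self.find(v);
        match self.value[r] {
            Some(old) if old != ty => Err(format!("{} != {}", old, ty)),
            _ => {
                self.value[r] = Some(ty);
                Ok(())
            }
        }
    }

    /// What is `v` currently known to be?
    fn probe(&mut self, v: usize) -> Option<&'static str> {
        let r = self.find(v);
        self.value[r]
    }
}
```

<p>Updating the table plays the role of extending the substitution: after equating <code>?X</code> with <code>?Y</code> and then binding <code>?Y</code> to <code>i32</code>, probing <code>?X</code> yields <code>i32</code> as well.</p>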
<h3 id="conclusion">Conclusion</h3>
<p>This post describes how basic unification works. The unification
algorithm roughly as I presented it was first introduced by
<a href="http://dl.acm.org/citation.cfm?id=321253">Robinson</a>, I believe, and it forms the heart of
<a href="https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system">Hindley-Milner type inference</a> (used in ML, Haskell, and Rust as
well) &ndash; as such, I&rsquo;m sure there are tons of other blog posts covering
the same material better, but oh well.</p>
<p>In the next post, I&rsquo;ll talk about how I chose to extend this basic
system to cover associated types. Other interesting topics I would
like to cover include:</p>
<ul>
<li>integrating subtyping and lifetimes;</li>
<li>how to handle generics (in particular, universal quantification like <code>forall</code>);</li>
<li>why it is decidedly non-trivial to integrate where-clauses like
<code>where T = i32</code> into Rust (it breaks some assumptions that we made
in this post, in particular).</li>
</ul>
<h3 id="comments">Comments</h3>
<p>Post any comments or questions in
<a href="https://internals.rust-lang.org/t/blog-series-lowering-rust-traits-to-logic/4673">this internals thread</a>.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Later on, probably not in this post, we&rsquo;ll see universal type variables (i.e., <code>forall !T</code>); if you&rsquo;re interested in reading up on how they interact with inference, I recommend <a href="http://dl.acm.org/citation.cfm?id=868380">&ldquo;A Proof Procedure for the Logic of Hereditary Harrop Formulas&rdquo;, by Gopalan Nadathur</a>, which has a very concrete explanation.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/><category scheme="https://smallcultfollowing.com/babysteps/categories/chalk" term="chalk" label="Chalk"/><category scheme="https://smallcultfollowing.com/babysteps/categories/pl" term="pl" label="PL"/></entry><entry><title type="html">The Lane Table algorithm</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/03/17/the-lane-table-algorithm/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/03/17/the-lane-table-algorithm/</id><published>2017-03-17T00:00:00+00:00</published><updated>2017-03-17T00:00:00+00:00</updated><content type="html"><![CDATA[<p>For some time now I&rsquo;ve been interested in better ways to construct
LR(1) parsers. LALRPOP currently allows users to choose between the
full LR(1) algorithm or the LALR(1) subset. Neither of these choices
is very satisfying:</p>
<ul>
<li>the full LR(1) algorithm gives pretty intuitive results but produces
a lot of states; my hypothesis was that, with modern computers, this
wouldn&rsquo;t matter anymore. This is sort of true &ndash; e.g., I&rsquo;m able to
generate and process even the <a href="https://github.com/nikomatsakis/rustypop">full Rust grammar</a> &ndash; but this
results in a <strong>ton</strong> of generated code.</li>
<li>the LALR(1) subset often works but sometimes mysteriously fails with
indecipherable errors. This is because it is basically a hack that
conflates states in the parsing table according to a heuristic; when
this heuristic fails, you get strange results.</li>
</ul>
<p>The <a href="http://cssauh.com/xc/pub/LaneTable_APPLC12.pdf">Lane Table algorithm</a> published by Pager and Chen at APPLC
&rsquo;12 offers an interesting alternative. It is an alternative to earlier
work by Pager: the &ldquo;lane tracing&rdquo; algorithm and the &ldquo;practical general
method&rdquo;. In any case, the goal is to generate an LALR(1) state machine
<em>when possible</em> and gracefully scale up to the full LR(1) state
machine <em>as needed</em>.</p>
<p>I found the approach appealing, as it seemed fairly simple, and also
seemed to match what I would try to do
intuitively. <a href="https://github.com/nikomatsakis/lalrpop/blob/27e154f07230ee35268bd014a6bc499d840c80b2/lalrpop/src/lr1/lane_table/README.md">I&rsquo;ve been experimenting with the Lane Table algorithm in LALRPOP and I now have a simple prototype that seems to work.</a>
Implementing it required that I cover various cases that the paper
left implicit, and the aim of this blog post is to describe what I&rsquo;ve
done so far. I do not claim that this description is what the authors
originally intended; for all I know, it has some bugs, and I certainly
think it can be made more efficient.</p>
<p>My explanation is intended to be widely readable, though I do assume
some familiarity with the basic workings of an LR-parser (i.e., that
we shift states onto a stack, execute reductions, etc). But I&rsquo;ll
review the bits of table construction that you need.</p>
<h3 id="first-example-grammar-g0">First example grammar: G0</h3>
<p>To explain the algorithm, I&rsquo;m going to walk through two example
grammars. The first I call G0 &ndash; it is a reduced version of what the
paper calls G1. It is interesting because it does not require
splitting any states, and so we wind up with the same number of states
as in LR(0). Put another way, it is an LALR(1) grammar.</p>
<p>I will be assuming a basic familiarity with the LR(0) and LR(1) state
construction.</p>
<h4 id="grammar-g0">Grammar G0</h4>
<pre tabindex="0"><code>G0 = X &#34;c&#34;
   | Y &#34;d&#34;
X  = &#34;e&#34; X
   | &#34;e&#34;
Y  = &#34;e&#34; Y
   | &#34;e&#34;
</code></pre><p>The key point here is that if you have <code>&quot;e&quot; ...</code>, you could build an
<code>X</code> or a <code>Y</code> from that <code>&quot;e&quot;</code> (in fact, there can be any number of
<code>&quot;e&quot;</code> tokens). You ultimately decide based on whether the <code>&quot;e&quot;</code> tokens
are followed by a <code>&quot;c&quot;</code> (in which case you build an <code>X</code>) or a <code>&quot;d&quot;</code>
(in which case you build a <code>Y</code>).</p>
<p>LR(0), since it has no lookahead, can&rsquo;t handle this case. LALR(1)
<em>can</em>, since it augments LR(0) with a token of lookahead; using that,
after we see the <code>&quot;e&quot;</code>, we can peek at the next thing and figure out
what to do.</p>
<h4 id="step-1-construct-an-lr0-state-machine">Step 1: Construct an LR(0) state machine</h4>
<p>We begin by constructing an LR(0) state machine. If you&rsquo;re not familiar
with the process, I&rsquo;ll briefly outline it here, though you may want to
read up separately. Basically, we will enumerate a number of different <em>states</em>
indicating what kind of content we have seen so far. The first state <code>S0</code> indicates
that we are at the very beginning of our &ldquo;goal item&rdquo; <code>G0</code>:</p>
<pre tabindex="0"><code>S0 = G0 = (*) X &#34;c&#34;
   | G0 = (*) Y &#34;d&#34;
   | ... // more items to be described later
</code></pre><p>The <code>G0 = (*) X &quot;c&quot;</code> indicates that we have started parsing a <code>G0</code>;
the <code>(*)</code> is how far we have gotten (namely, nowhere). There are two
items because there are two ways to make a <code>G0</code>. Now, in these two
items, immediately to the right of the <code>(*)</code> we see the symbols that
we expect to see next in the input: in this case, an <code>X</code> or a <code>Y</code>.
Since <code>X</code> and <code>Y</code> are nonterminals &ndash; i.e., symbols defined in the
grammar rather than tokens in the input &ndash; this means we might also be
looking at the beginning of an <code>X</code> or a <code>Y</code>, so we have to extend <code>S0</code>
to account for that possibility (these are sometimes called &ldquo;epsilon
moves&rdquo;, since these new possibilities arise from consuming <em>no</em> input,
which is denoted as &ldquo;epsilon&rdquo;):</p>
<pre tabindex="0"><code>S0 = G0 = (*) X &#34;c&#34;
   | G0 = (*) Y &#34;d&#34;
   | X = (*) &#34;e&#34; X
   | X = (*) &#34;e&#34;
   | Y = (*) &#34;e&#34; Y
   | Y = (*) &#34;e&#34;
</code></pre><p>This completes the state <code>S0</code>. Looking at these various possibilities,
we see that a number of things might come next in the input: a <code>&quot;e&quot;</code>,
an <code>X</code>, or a <code>Y</code>. (The <em>nonterminals</em> <code>X</code> and <code>Y</code> can &ldquo;come next&rdquo; once
we have seen their entire contents.) Therefore we construct three
successor states: one (<code>S1</code>) accounts for what happens when we see an
<code>&quot;e&quot;</code>. The other two (<code>S3</code> and <code>S5</code> below) account for what happens
after we recognize an <code>X</code> or a <code>Y</code>, respectively.</p>
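<p>The closure step (&ldquo;epsilon moves&rdquo;) and the construction of a successor state can be sketched in Rust as follows. This is only an illustration, not LALRPOP&rsquo;s code: the names <code>Sym</code>, <code>Production</code>, <code>closure</code>, and <code>goto</code> are hypothetical, and an item is represented as a pair of a production index and the position of the <code>(*)</code>.</p>

```rust
use std::collections::BTreeSet;

/// A grammar symbol: a terminal like "e" or a nonterminal like X.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Sym {
    T(&'static str),
    N(&'static str),
}

/// A production `lhs = rhs`.
struct Production {
    lhs: &'static str,
    rhs: Vec<Sym>,
}

/// An LR(0) item: (production index, position of the `(*)`).
type Item = (usize, usize);

fn prod(lhs: &'static str, rhs: &[Sym]) -> Production {
    Production { lhs, rhs: rhs.to_vec() }
}

/// Grammar G0 from the text.
fn g0() -> Vec<Production> {
    use Sym::{N, T};
    vec![
        prod("G0", &[N("X"), T("c")]),
        prod("G0", &[N("Y"), T("d")]),
        prod("X", &[T("e"), N("X")]),
        prod("X", &[T("e")]),
        prod("Y", &[T("e"), N("Y")]),
        prod("Y", &[T("e")]),
    ]
}

/// Whenever the `(*)` sits directly before a nonterminal, add that
/// nonterminal's productions with the `(*)` at the front.
fn closure(grammar: &[Production], mut items: BTreeSet<Item>) -> BTreeSet<Item> {
    let mut worklist: Vec<Item> = items.iter().copied().collect();
    while let Some((prod_index, dot)) = worklist.pop() {
        if let Some(Sym::N(next)) = grammar[prod_index].rhs.get(dot) {
            for (i, p) in grammar.iter().enumerate() {
                if p.lhs == *next && items.insert((i, 0)) {
                    worklist.push((i, 0));
                }
            }
        }
    }
    items
}

/// Advance the `(*)` past `sym` in every item that expects it, then close the result.
fn goto(grammar: &[Production], state: &BTreeSet<Item>, sym: Sym) -> BTreeSet<Item> {
    let kernel: BTreeSet<Item> = state
        .iter()
        .filter(|&&(p, dot)| grammar[p].rhs.get(dot) == Some(&sym))
        .map(|&(p, dot)| (p, dot + 1))
        .collect();
    closure(grammar, kernel)
}
```

<p>Starting from the two <code>G0 = (*) ...</code> items, <code>closure</code> produces the six items of <code>S0</code>, and <code>goto</code> on <code>&quot;e&quot;</code> produces the successor state reached by consuming an <code>&quot;e&quot;</code>.</p>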
<p>S1 (what happens if we see an <code>&quot;e&quot;</code>) is derived by advancing the <code>(*)</code>
past the <code>&quot;e&quot;</code>:</p>
<pre tabindex="0"><code>S1 = X = &#34;e&#34; (*) X
   | X = &#34;e&#34; (*)
   | ... // to be added later
   | Y = &#34;e&#34; (*) Y
   | Y = &#34;e&#34; (*)
   | ... // to be added later
</code></pre><p>Here we dropped the <code>G0 = ...</code> possibilities, since those would have
required consuming an <code>X</code> or <code>Y</code>. But we have kept the <code>X = &quot;e&quot; (*) X</code>
etc. Again we find that there are nonterminals that can come next (<code>X</code>
and <code>Y</code> again) and hence we have to expand the state to account for
the possibility that, in addition to being partway through the <code>X</code> and
<code>Y</code>, we are at the beginning of <em>another</em> <code>X</code> or <code>Y</code>:</p>
<pre tabindex="0"><code>S1 = X = &#34;e&#34; (*) X
   | X = &#34;e&#34; (*)
   | X = (*) &#34;e&#34;     // added this
   | X = (*) &#34;e&#34; X   // added this
   | Y = &#34;e&#34; (*) Y
   | Y = &#34;e&#34; (*)
   | Y = (*) &#34;e&#34;     // added this
   | Y = (*) &#34;e&#34; Y   // added this
</code></pre><p>Here again we expect either an <code>&quot;e&quot;</code>, an <code>X</code>, or a <code>Y</code>. If we again
check what happens when we consume an <code>&quot;e&quot;</code>, we will find that we
reach <code>S1</code> again (i.e., moving past an <code>&quot;e&quot;</code> gets us to the same set
of possibilities that we already saw). The remaining states all arise
from consuming an <code>X</code> or a <code>Y</code> from <code>S0</code> or <code>S1</code>:</p>
<pre tabindex="0"><code>S2 = X = &#34;e&#34; X (*)

S3 = G0 = X (*) &#34;c&#34;

S4 = Y = &#34;e&#34; Y (*)

S5 = G0 = Y (*) &#34;d&#34;

S6 = G0 = X &#34;c&#34; (*)

S7 = G0 = Y &#34;d&#34; (*)
</code></pre><p>We can represent the set of states as a graph, with edges representing
the transitions between states. The edges are labeled with the symbol
(<code>&quot;e&quot;</code>, <code>X</code>, etc) that gets consumed to move between states:</p>
<pre tabindex="0"><code>S0 -&#34;e&#34;-&gt; S1
S1 -&#34;e&#34;-&gt; S1
S1 --X--&gt; S2
S0 --X--&gt; S3
S1 --Y--&gt; S4
S0 --Y--&gt; S5
S3 -&#34;c&#34;-&gt; S6
S5 -&#34;d&#34;-&gt; S7
</code></pre><p><strong>Reducing and inconsistent states.</strong> Let&rsquo;s take another look at this
state S1:</p>
<pre tabindex="0"><code>S1 = X = &#34;e&#34; (*) X
   | X = &#34;e&#34; (*)      // reduce possible
   | X = (*) &#34;e&#34;
   | X = (*) &#34;e&#34; X
   | Y = &#34;e&#34; (*) Y
   | Y = &#34;e&#34; (*)      // reduce possible
   | Y = (*) &#34;e&#34;
   | Y = (*) &#34;e&#34; Y
</code></pre><p>There are two interesting things about this state. The first is that
it contains some items where the <code>(*)</code> comes at the very end, like <code>X = &quot;e&quot; (*)</code>. What this means is that we have seen an <code>&quot;e&quot;</code> in the
input, which is enough to construct an <code>X</code>. If we chose to do so, that
is called <strong>reducing</strong>. The effect would be to build up an <code>X</code>, which
would then be supplied as an input to a prior state (e.g., <code>S0</code>).</p>
<p>However, the other interesting thing is that our state actually has
<strong>three</strong> possible things it could do: it could reduce <code>X = &quot;e&quot; (*)</code>
to construct an <code>X</code>, but it could also reduce <code>Y = &quot;e&quot; (*)</code> to
construct a <code>Y</code>; finally, it can <strong>shift</strong> an <code>&quot;e&quot;</code>. Shifting means
that we do not execute any reductions, and instead we take the next
input and move to the next state (which in this case would be S1
again).</p>
<p>A state that can do both shifts and reduces, or more than one reduce,
is called an <strong>inconsistent state</strong>. Basically it means that there is
an ambiguity, and the parser won&rsquo;t be able to figure out what to do &ndash;
or, at least, it can&rsquo;t figure out what to do unless we take some
amount of lookahead into account. This is where LR(1) and LALR(1)
come into play.</p>
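<p>As a small illustration of this definition, the following Rust sketch collects the possible actions of a state and checks whether it is inconsistent. Everything here is hypothetical naming for the sake of the example (<code>Item</code>, <code>Action</code>, <code>is_inconsistent</code>), and it assumes the convention that lowercase symbols are terminals and uppercase symbols are nonterminals.</p>

```rust
use std::collections::BTreeSet;

/// An LR(0) item `lhs = rhs` with the `(*)` just before `rhs[dot]`.
/// Convention for this sketch: lowercase symbols are terminals, uppercase are nonterminals.
struct Item {
    lhs: &'static str,
    rhs: &'static [&'static str],
    dot: usize,
}

/// Something the parser could do in a state.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Action {
    Shift(&'static str),
    Reduce(&'static str, &'static [&'static str]),
}

fn is_terminal(sym: &str) -> bool {
    sym.chars().all(|c| c.is_ascii_lowercase())
}

/// Collect the distinct shift and reduce actions available in `state`.
fn actions(state: &[Item]) -> BTreeSet<Action> {
    let mut out = BTreeSet::new();
    for item in state {
        match item.rhs.get(item.dot) {
            // `(*)` at the very end: we may reduce.
            None => {
                out.insert(Action::Reduce(item.lhs, item.rhs));
            }
            // `(*)` before a terminal: we may shift it.
            Some(&sym) if is_terminal(sym) => {
                out.insert(Action::Shift(sym));
            }
            // `(*)` before a nonterminal: no action to choose here.
            Some(_) => {}
        }
    }
    out
}

/// Inconsistent: both shifts and reduces, or more than one reduce.
fn is_inconsistent(state: &[Item]) -> bool {
    let acts = actions(state);
    let reduces = acts.iter().filter(|a| matches!(a, Action::Reduce(..))).count();
    let shifts = acts.len() - reduces;
    reduces > 1 || (reduces == 1 && shifts > 0)
}

/// State S1 of G0, as listed in the text.
fn s1() -> Vec<Item> {
    vec![
        Item { lhs: "X", rhs: &["e", "X"], dot: 1 },
        Item { lhs: "X", rhs: &["e"], dot: 1 },
        Item { lhs: "X", rhs: &["e"], dot: 0 },
        Item { lhs: "X", rhs: &["e", "X"], dot: 0 },
        Item { lhs: "Y", rhs: &["e", "Y"], dot: 1 },
        Item { lhs: "Y", rhs: &["e"], dot: 1 },
        Item { lhs: "Y", rhs: &["e"], dot: 0 },
        Item { lhs: "Y", rhs: &["e", "Y"], dot: 0 },
    ]
}
```

<p>For <code>S1</code> this finds the three possible actions described above (shift <code>&quot;e&quot;</code>, reduce to <code>X</code>, reduce to <code>Y</code>), so the state is reported as inconsistent; a state like <code>S3</code>, which can only shift <code>&quot;c&quot;</code>, is not.</p>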
<p>In an LALR(1) grammar, we keep the same set of states, but we augment
the reductions with a bit of lookahead. In this example, as we will
see, that suffices &ndash; for example, if you look at the grammar, you
will see that we only need to do the <code>X = &quot;e&quot; (*)</code> reduction if the
next thing in the input is a <code>&quot;c&quot;</code>. And similarly we only need to do
the <code>Y = &quot;e&quot; (*)</code> reduction if the next thing is a <code>&quot;d&quot;</code>. So we can
transform the state to add some <em>conditions</em>, and then it is clear
what to do:</p>
<pre tabindex="0"><code>S1 = X = &#34;e&#34; (*) X    shift if next thing is a `X`
   | X = &#34;e&#34; (*)      reduce if next thing is a &#34;c&#34;
   | X = (*) &#34;e&#34;      shift if next thing is a &#34;e&#34;
   | X = (*) &#34;e&#34; X    shift if next thing is a &#34;e&#34;
   | Y = &#34;e&#34; (*) Y    shift if next thing is a `Y`
   | Y = &#34;e&#34; (*)      reduce if next thing is a &#34;d&#34;
   | Y = (*) &#34;e&#34;      shift if next thing is a &#34;e&#34;
   | Y = (*) &#34;e&#34; Y    shift if next thing is a &#34;e&#34;
</code></pre><p>Note that the shift vs reduce part is implied by where the <code>(*)</code> is:
we always shift unless the <code>(*)</code> is at the end. So usually we just
write the lookahead part. Moreover, the &ldquo;lookahead&rdquo; for a shift is
pretty obvious: it&rsquo;s whatever is to the right of the <code>(*)</code>, so we&rsquo;ll
leave that out.  That leaves us with this, where <code>[&quot;c&quot;]</code> (for example)
means &ldquo;only do this reduction if the lookahead is <code>&quot;c&quot;</code>&rdquo;:</p>
<pre tabindex="0"><code>S1 = X = &#34;e&#34; (*) X
   | X = &#34;e&#34; (*)      [&#34;c&#34;]
   | X = (*) &#34;e&#34;
   | X = (*) &#34;e&#34; X
   | Y = &#34;e&#34; (*) Y
   | Y = &#34;e&#34; (*)      [&#34;d&#34;]
   | Y = (*) &#34;e&#34;
   | Y = (*) &#34;e&#34; Y
</code></pre><p>We&rsquo;ll call this augmented state an LR(0-1) state (it&rsquo;s not <em>quite</em>
how an LR(1) state is typically defined). Now that we&rsquo;ve added the
lookahead, this state is no longer inconsistent, as the parser always
knows what to do.</p>
<p>The next few sections will show how we can derive this lookahead
automatically.</p>
<h4 id="step-2-convert-lr0-states-into-lr0-1-states">Step 2: Convert LR(0) states into LR(0-1) states.</h4>
<p>The first step in the process is to naively convert all of our LR(0)
states into LR(0-1) states (with no additional lookahead). We will
denote the &ldquo;no extra lookahead&rdquo; case by writing a special &ldquo;wildcard&rdquo;
lookahead <code>_</code>. We will thus denote the inconsistent state after
transformation as follows, where each reduction has the &ldquo;wildcard&rdquo;
lookahead:</p>
<pre tabindex="0"><code>S1 = X = &#34;e&#34; (*) X
   | X = &#34;e&#34; (*)     [_]
   | X = (*) &#34;e&#34;
   | X = (*) &#34;e&#34; X
   | Y = &#34;e&#34; (*) Y
   | Y = &#34;e&#34; (*)     [_]
   | Y = (*) &#34;e&#34;
   | Y = (*) &#34;e&#34; Y
</code></pre><p>Naturally, the state is still inconsistent.</p>
<h4 id="step-3-resolve-inconsistencies">Step 3: Resolve inconsistencies.</h4>
<p>In the next step, we iterate over all of our LR(0-1) states. In this
example, we will not need to create new states, but in future examples
we will. The iteration thus consists of a queue and some code like
this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">queue</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Queue</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">queue</span><span class="p">.</span><span class="n">extend</span><span class="p">(</span><span class="cm">/* all states */</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">queue</span><span class="p">.</span><span class="n">pop_front</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="cm">/* s is an inconsistent state */</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">resolve_inconsistencies</span><span class="p">(</span><span class="n">s</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">queue</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h4 id="step-3a-build-the-lane-table">Step 3a: Build the lane table.</h4>
<p>To resolve an inconsistent state, we first construct a <strong>lane
table</strong>. This is done by the code in the <code>lane</code> module (the <code>table</code>
module maintains the data structure). It works by starting at each
conflict and tracing <strong>backwards</strong>. Let&rsquo;s start with the final table
we will get for the state S1 and then we will work our way back to how
it is constructed. First, let&rsquo;s identify the conflicting actions from
S1 and give them indices:</p>
<pre tabindex="0"><code>S1 = X = (*) &#34;e&#34;         // C0 -- shift &#34;e&#34;
   | X = &#34;e&#34; (*)     [_] // C1 -- reduce `X = &#34;e&#34; (*)`
   | X = (*) &#34;e&#34; X       // C0 -- shift &#34;e&#34;
   | X = &#34;e&#34; (*) X
   | Y = (*) &#34;e&#34;         // C0 -- shift &#34;e&#34;
   | Y = &#34;e&#34; (*)     [_] // C2 -- reduce `Y = &#34;e&#34; (*)`
   | Y = (*) &#34;e&#34; Y       // C0 -- shift &#34;e&#34;
   | Y = &#34;e&#34; (*) Y
</code></pre><p>Several of the items can cause &ldquo;Conflicting Action 0&rdquo; (C0), which is
to shift an <code>&quot;e&quot;</code>. These are all mutually compatible. However, there
are also two incompatible actions: C1 and C2, both reductions. In
fact, we&rsquo;ll find that if we look back at state S0, these &lsquo;conflicting&rsquo;
actions all occur with distinct lookahead. The purpose of the lane
table is to summarize that information. The lane table we will end up
constructing for these conflicting actions is as follows:</p>
<pre tabindex="0"><code>| State | C0    | C1    | C2    | Successors |
| S0    |       | [&#34;c&#34;] | [&#34;d&#34;] | {S1}       |
| S1    | [&#34;e&#34;] | []    | []    | {S1}       |
</code></pre><p>Here the idea is that the lane table summarizes the lookahead
information contributed by each state. Note that for the <em>shift</em> the
state S1 already has enough lookahead information: we only shift when
we see the terminal we need next (&ldquo;e&rdquo;). But for conflicts C1 and C2, the lookahead
actually came from S0, which is a predecessor state.</p>
<p>As I said earlier, the algorithm for constructing the table works by
looking at the conflicting item and walking backwards. So let&rsquo;s
illustrate with conflict C1. We have the conflicting item <code>X = &quot;e&quot; (*)</code>, and we are basically looking to find its lookahead. We know
that somewhere in the distant past of our state machine there must be
an item like</p>
<pre><code>Foo = ...a (*) X ...b
</code></pre>
<p>that led us here. We want to find that item, so we can derive the
lookahead from <code>...b</code> (whatever symbols come after <code>X</code>).</p>
<p>To do this, we will walk the graph. Our state at any point in time
will be the pair of a state and an item in that state. To start out,
then, we have <code>(S1, X = &quot;e&quot; (*))</code>, which is the conflict C1. Because
the <code>(*)</code> is not at the &ldquo;front&rdquo; of this item, we have to figure out
where this <code>&quot;e&quot;</code> came from on our stack, so we look for predecessors
of the state S1 which have an item like <code>X = (*) &quot;e&quot;</code>. This leads us to
S0 and also S1. So we can push two states in our search: <code>(S0, X = (*) &quot;e&quot;)</code> and <code>(S1, X = (*) &quot;e&quot;)</code>. Let&rsquo;s consider each in turn.</p>
<p>The next state is then <code>(S0, X = (*) &quot;e&quot;)</code>. Here the <code>(*)</code> lies at the
front of the item, so we search <strong>the same state</strong> S0 for items that
would have led to this state via an <em>epsilon move</em>.  This basically
means an item like <code>Foo = ... (*) X ...</code> &ndash; i.e., where the <code>(*)</code>
appears directly before the nonterminal <code>X</code>. In our case, we will find
<code>G0 = (*) X &quot;c&quot;</code>. This is great, because it tells us some lookahead
(&ldquo;c&rdquo;, in particular), and hence we can stop our search. We add to the
table the entry that the state S0 contributes lookahead &ldquo;c&rdquo; to the
conflict C1.  In some cases, we might find something like <code>Foo = ... (*) X</code> instead, where the <code>X</code> we are looking for appears at the
end. In that case, we have to restart our search, but looking for the
lookahead for <code>Foo</code>.</p>
<p>The next state in our case is <code>(S1, X = (*) &quot;e&quot;)</code>. Again the <code>(*)</code> lies
at the beginning and hence we search for things in the state S1 where
<code>X</code> is the next symbol. We find <code>X = &quot;e&quot; (*) X</code>. This is not as good
as last time, because there are no symbols appearing after X in this
item, so it does not contribute any lookahead. We therefore can&rsquo;t stop
our search yet, but we push the state <code>(S1, X = &quot;e&quot; (*) X)</code> &ndash; this
corresponds to the <code>Foo</code> state I mentioned at the end of the last
paragraph, except that in this case <code>Foo</code> is the same nonterminal <code>X</code>
we started with.</p>
<p>Looking at <code>(S1, X = &quot;e&quot; (*) X)</code>, we again have the <code>(*)</code> in the
middle of the item, so we move it left, searching for predecessors
with the item <code>X = (*) &quot;e&quot; X</code>. We will (again) find S0 and S1 have such
items. In the case of S0, we will (again) find the context &ldquo;c&rdquo;, which
we dutifully add to the table (this has no effect, since it is already
present). In the case of S1, we will (again) wind up at the state
<code>(S1, X = &quot;e&quot; (*) X)</code>.  Since we&rsquo;ve already visited this state, we
stop our search; it will not lead to new context.</p>
<p>At this point, our table column for C1 is complete. We can repeat the
process for C2, which plays out in an analogous way.</p>
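<p>The backwards walk just described can be sketched in Rust roughly as follows. This is a simplified illustration, not the code from LALRPOP&rsquo;s <code>lane</code> module: the <code>Machine</code> struct and the <code>trace</code> function are hypothetical, only states S0 and S1 of G0 are encoded, and the case where a nonterminal follows the symbol we are looking for (which would need FIRST sets) is deliberately left unimplemented.</p>

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Sym {
    T(&'static str), // terminal, e.g. "e"
    N(&'static str), // nonterminal, e.g. X
}
use Sym::{N, T};

/// An item is (production index, position of the `(*)`).
type Item = (usize, usize);

/// The slice of the LR(0) machine that the walk needs (hypothetical layout).
struct Machine {
    productions: Vec<(&'static str, Vec<Sym>)>,
    states: Vec<Vec<Item>>,
    edges: Vec<(usize, Sym, usize)>, // (from, symbol, to)
}

/// States S0 and S1 of G0, plus the "e" edges between them.
fn g0_machine() -> Machine {
    Machine {
        productions: vec![
            ("G0", vec![N("X"), T("c")]),
            ("G0", vec![N("Y"), T("d")]),
            ("X", vec![T("e"), N("X")]),
            ("X", vec![T("e")]),
            ("Y", vec![T("e"), N("Y")]),
            ("Y", vec![T("e")]),
        ],
        states: vec![
            (0..6).map(|p| (p, 0)).collect::<Vec<Item>>(),
            vec![(2, 1), (3, 1), (2, 0), (3, 0), (4, 1), (5, 1), (4, 0), (5, 0)],
        ],
        edges: vec![(0, T("e"), 1), (1, T("e"), 1)],
    }
}

/// Walk backwards from `(state, item)`, recording which state contributes which lookahead.
fn trace(m: &Machine, start: (usize, Item)) -> BTreeMap<usize, BTreeSet<&'static str>> {
    let mut table: BTreeMap<usize, BTreeSet<&'static str>> = BTreeMap::new();
    let mut visited = BTreeSet::new();
    let mut stack = vec![start];
    while let Some((s, (p, dot))) = stack.pop() {
        if !visited.insert((s, (p, dot))) {
            continue; // already explored; it cannot yield new context
        }
        if dot > 0 {
            // `(*)` in the middle: move it left into each predecessor state.
            let sym = m.productions[p].1[dot - 1];
            for &(from, label, to) in &m.edges {
                if to == s && label == sym {
                    stack.push((from, (p, dot - 1)));
                }
            }
            continue;
        }
        // `(*)` at the front: look in the same state for items with the
        // `(*)` directly before our nonterminal (an epsilon move).
        let lhs = m.productions[p].0;
        for &(q, d) in &m.states[s] {
            let rhs = &m.productions[q].1;
            if rhs.get(d) != Some(&N(lhs)) {
                continue;
            }
            match rhs.get(d + 1) {
                // A terminal follows: this state contributes it as lookahead.
                Some(&T(t)) => {
                    table.entry(s).or_default().insert(t);
                }
                // Nothing follows: continue the search from the enclosing item.
                None => stack.push((s, (q, d))),
                Some(&N(_)) => unimplemented!("would need FIRST sets; not needed for G0"),
            }
        }
    }
    table
}
```

<p>Tracing from <code>(S1, X = &quot;e&quot; (*))</code> yields a single entry, state S0 contributing <code>&quot;c&quot;</code>, which is the C1 column of the lane table; tracing from <code>(S1, Y = &quot;e&quot; (*))</code> yields S0 contributing <code>&quot;d&quot;</code>.</p>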
<h4 id="step-3b-update-the-lookahead">Step 3b: Update the lookahead</h4>
<p>Looking at the lane table we built, we can union the context sets in
any particular column. We see that the context sets for each
conflicting action are pairwise disjoint. Therefore, we can simply
update each reduce action in our state with those lookaheads in mind,
and hence render it consistent:</p>
<pre tabindex="0"><code>S1 = X = (*) &#34;e&#34;
   | X = &#34;e&#34; (*)     [&#34;c&#34;] // lookahead from C1
   | X = (*) &#34;e&#34; X
   | X = &#34;e&#34; (*) X
   | Y = (*) &#34;e&#34;
   | Y = &#34;e&#34; (*)     [&#34;d&#34;] // lookahead from C2
   | Y = (*) &#34;e&#34; Y
   | Y = &#34;e&#34; (*) Y
</code></pre><p>This is of course also what the LALR(1) state would look like (although
it would also include context for the other items, which doesn&rsquo;t play
into the final machine execution).</p>
<p>At this point we&rsquo;ve covered enough to handle the grammar G0.  Let&rsquo;s
turn to a more complex grammar, grammar G1, and then we&rsquo;ll come back
to cover the remaining steps.</p>
<h3 id="second-example-the-grammar-g1">Second example: the grammar G1</h3>
<p>G1 is a (typo corrected) version of the grammar from the paper. This
grammar is not LALR(1) and hence it is more interesting, because it
requires splitting states.</p>
<h4 id="grammar-g1">Grammar G1</h4>
<pre tabindex="0"><code>G1 = &#34;a&#34; X &#34;d&#34;
   | &#34;a&#34; Y &#34;c&#34;
   | &#34;b&#34; X &#34;c&#34;
   | &#34;b&#34; Y &#34;d&#34;
X  = &#34;e&#34; X
   | &#34;e&#34;
Y  = &#34;e&#34; Y
   | &#34;e&#34;
</code></pre><p>The key point of this grammar is that when we see <code>... &quot;e&quot; &quot;c&quot;</code> and we
wish to know whether to reduce to <code>X</code> or <code>Y</code>, we don&rsquo;t have enough
information. We need to know what is in the <code>...</code>, because <code>&quot;a&quot; &quot;e&quot; &quot;c&quot;</code> means we reduce <code>&quot;e&quot;</code> to <code>Y</code> and <code>&quot;b&quot; &quot;e&quot; &quot;c&quot;</code> means we reduce to
<code>X</code>. In terms of our <em>state machine</em>, this corresponds to <em>splitting</em>
the states responsible for X and Y based on earlier context.</p>
<p>Let&rsquo;s look at a subset of the LR(0) states for G1:</p>
<pre tabindex="0"><code>S0 = G1 = (*) &#34;a&#34; X &#34;d&#34;
   | G1 = (*) &#34;a&#34; Y &#34;c&#34;
   | G1 = (*) &#34;b&#34; X &#34;c&#34;
   | G1 = (*) &#34;b&#34; Y &#34;d&#34;
   
S1 = G1 = &#34;a&#34; (*) X &#34;d&#34;
   | G1 = &#34;a&#34; (*) Y &#34;c&#34;
   | X = (*) &#34;e&#34; X
   | X = (*) &#34;e&#34;
   | Y = (*) &#34;e&#34; Y
   | Y = (*) &#34;e&#34;

S2 = G1 = &#34;b&#34; (*) X &#34;c&#34;
   | G1 = &#34;b&#34; (*) Y &#34;d&#34;
   | X = (*) &#34;e&#34; X
   | X = (*) &#34;e&#34;
   | Y = (*) &#34;e&#34; Y
   | Y = (*) &#34;e&#34;

S3 = X = &#34;e&#34; (*) X
   | X = &#34;e&#34; (*)      // C1 -- can reduce
   | X = (*) &#34;e&#34;      // C0 -- can shift &#34;e&#34;
   | X = (*) &#34;e&#34; X    // C0 -- can shift &#34;e&#34;
   | Y = &#34;e&#34; (*) Y
   | Y = &#34;e&#34; (*)      // C2 -- can reduce
   | Y = (*) &#34;e&#34;      // C0 -- can shift &#34;e&#34;
   | Y = (*) &#34;e&#34; Y    // C0 -- can shift &#34;e&#34;
</code></pre><p>Here we can see the problem. The state S3 is inconsistent. But it is
reachable from both S1 and S2. If we come from S1, then we can have (e.g.)
<code>X &quot;d&quot;</code>, but if we come from S2, we expect <code>X &quot;c&quot;</code>.</p>
<p>Let&rsquo;s walk through our algorithm again. I&rsquo;ll start with step 3a.</p>
<h4 id="step-3a-build-the-lane-table-1">Step 3a: Build the lane table.</h4>
<p>The lane table for state S3 will look like this:</p>
<pre tabindex="0"><code>| State | C0    | C1    | C2    | Successors |
| S1    |       | [&#34;d&#34;] | [&#34;c&#34;] | {S3}       |
| S2    |       | [&#34;c&#34;] | [&#34;d&#34;] | {S3}       |
| S3    | [&#34;e&#34;] | []    | []    | {S3}       |
</code></pre><p>Now if we union each column, we see that both C1 and C2 wind up with
lookahead <code>{&quot;c&quot;, &quot;d&quot;}</code>. This is our problem. We have to isolate things
better. Therefore, step 3b (&ldquo;update lookahead&rdquo;) does not apply. Instead
we attempt step 3c.</p>
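<p>The column union can be sketched in a few lines of Rust. This is only an illustration: the <code>Row</code> type and the helper names are hypothetical, not LALRPOP data structures, and each table cell is assumed to hold a single token, as in the table above.</p>

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One row of the lane table: conflict name -> lookahead tokens.
type Row = BTreeMap<&'static str, BTreeSet<&'static str>>;

/// Hypothetical helper: build a row from (conflict, token) pairs.
fn row(entries: &[(&'static str, &'static str)]) -> Row {
    let mut out = Row::new();
    for (conflict, token) in entries {
        out.entry(*conflict).or_default().insert(*token);
    }
    out
}

/// Union every column across all rows of the lane table.
fn union_columns(rows: &[Row]) -> Row {
    let mut columns = Row::new();
    for row in rows {
        for (conflict, tokens) in row {
            columns.entry(*conflict).or_default().extend(tokens.iter().copied());
        }
    }
    columns
}

/// Pairs of conflicts whose unioned lookaheads share at least one token.
fn overlapping(columns: &Row) -> Vec<(&'static str, &'static str)> {
    let mut result = Vec::new();
    for (a, a_tokens) in columns {
        for (b, b_tokens) in columns {
            if a < b && !a_tokens.is_disjoint(b_tokens) {
                result.push((*a, *b));
            }
        }
    }
    result
}

/// The lane table for S3 shown above (rows for S1, S2, S3).
fn g1_lane_table() -> Vec<Row> {
    vec![
        row(&[("C1", "d"), ("C2", "c")]),
        row(&[("C1", "c"), ("C2", "d")]),
        row(&[("C0", "e")]),
    ]
}
```

<p>Calling <code>overlapping(&amp;union_columns(&amp;g1_lane_table()))</code> reports the pair C1 and C2, which is the signal that updating the lookahead in place is not enough.</p>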
<h4 id="step-3c-isolate-lanes">Step 3c: Isolate lanes</h4>
<p>This part of the algorithm is only loosely described in the paper, but
I think it works as follows. We will employ a union-find data
structure. With each set, we will record a &ldquo;context set&rdquo;, which
records for each conflict the set of lookahead tokens (e.g.,
<code>{C1:{&quot;d&quot;}}</code>).</p>
<p>A context set tells us how to map the lookahead to an action;
therefore, to be self-consistent, the lookaheads for each conflict
must be mutually disjoint. In other words, <code>{C1:{&quot;d&quot;}, C2:{&quot;c&quot;}}</code> is
valid, and says to do C1 if we see a &ldquo;d&rdquo; and C2 if we see a &ldquo;c&rdquo;. But
<code>{C1:{&quot;d&quot;}, C2:{&quot;d&quot;}}</code> is not, because there are two actions.</p>
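<p>As a rough sketch of that rule, a context set can be modelled as a map from conflict to lookahead tokens, with a consistency check and a merge that refuses to produce an inconsistent result. The type and method names here (<code>ContextSet</code>, <code>is_consistent</code>, <code>try_merge</code>) are assumptions made for this example, not names from the paper.</p>

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A context set: for each conflict, the lookahead tokens that select it.
#[derive(Clone, Debug, Default, PartialEq)]
struct ContextSet {
    lookaheads: BTreeMap<&'static str, BTreeSet<&'static str>>,
}

impl ContextSet {
    /// Hypothetical constructor from (conflict, token) pairs.
    fn new(entries: &[(&'static str, &'static str)]) -> Self {
        let mut set = ContextSet::default();
        for (conflict, token) in entries {
            set.lookaheads.entry(*conflict).or_default().insert(*token);
        }
        set
    }

    /// Self-consistent means no token is claimed by two different conflicts.
    fn is_consistent(&self) -> bool {
        let mut seen = BTreeSet::new();
        self.lookaheads
            .values()
            .flatten()
            .all(|token| seen.insert(*token))
    }

    /// Union two context sets, or return `None` if the union would be
    /// inconsistent.
    fn try_merge(&self, other: &ContextSet) -> Option<ContextSet> {
        let mut merged = self.clone();
        for (conflict, tokens) in &other.lookaheads {
            merged
                .lookaheads
                .entry(*conflict)
                .or_default()
                .extend(tokens.iter().copied());
        }
        if merged.is_consistent() {
            Some(merged)
        } else {
            None
        }
    }
}
```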
<p>Initially, each state in the lane table is mapped to itself, and the
context set is derived from its row in the lane table:</p>
<pre tabindex="0"><code>S1 = {C1:d, C2:c}
S2 = {C1:c, C2:d}
S3 = {C0:e}
</code></pre><p>We designate &ldquo;beachhead&rdquo; states as those states in the table that are
not reachable from another state in the table (i.e., using the
successors). In this case, those are the states S1 and S2. We will be
doing a DFS through the table and we want to use those as the starting
points.</p>
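<p>Finding the beachheads is a small graph computation. The sketch below reads &ldquo;not reachable from another state&rdquo; literally and therefore ignores self-edges such as S3 -&gt; S3; that reading, and the function names, are assumptions of this example.</p>

```rust
use std::collections::{BTreeMap, BTreeSet};

/// States in the lane table that no *other* state lists as a successor.
fn beachheads(successors: &BTreeMap<&'static str, Vec<&'static str>>) -> Vec<&'static str> {
    let mut reachable = BTreeSet::new();
    for (state, succs) in successors {
        for succ in succs {
            // A self-edge does not make a state reachable from "another" state.
            if succ != state {
                reachable.insert(*succ);
            }
        }
    }
    successors
        .keys()
        .copied()
        .filter(|state| !reachable.contains(state))
        .collect()
}

/// The successor column of the lane table for S3.
fn g1_successors() -> BTreeMap<&'static str, Vec<&'static str>> {
    vec![
        ("S1", vec!["S3"]),
        ("S2", vec!["S3"]),
        ("S3", vec!["S3"]),
    ]
    .into_iter()
    .collect()
}
```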
<p>(Question: is there always at least one beachhead state? Seems like
there must be.)</p>
<p>So we begin by iterating over the beachhead states.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">for</span><span class="w"> </span><span class="n">beachhead</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">beachheads</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When we visit a state X, we will examine each of its successors Y. We
consider whether the context set for Y can be merged with the context
set for X. So, in our case, X will be S1 to start and Y will be S3.
In this case, the context set can be merged, and hence we union S1, S3
and wind up with the following union-find state:</p>
<pre tabindex="0"><code>S1,S3 = {C0:e, C1:d, C2:c}
S2    = {C1:c, C2:d}
</code></pre><p>(Note that this union is just for the purpose of tracking context; it
doesn&rsquo;t imply that S1 and S3 are the &lsquo;same states&rsquo; or anything like
that.)</p>
<p>Next we examine the edge S3 -&gt; S3. Here the contexts are already
merged and everything is happy, so we stop. (We already visited S3,
after all.)</p>
<p>This finishes our first beachhead, so we proceed to the next edge, S2
-&gt; S3. Here we find that we <strong>cannot</strong> union the context: it would
produce an inconsistent state. So what we do is we <strong>clone</strong> S3 to
make a new state, S3&rsquo;, with the initial setup corresponding to the row
for S3 from the lane table:</p>
<pre tabindex="0"><code>S1,S3 = {C0:e, C1:d, C2:c}
S2    = {C1:c, C2:d}
S3&#39;   = {C0:e}
</code></pre><p>This also involves updating our LR(0-1) state set to have a new state
S3&rsquo;. All edges from S2 that led to S3 now lead to S3&rsquo;; the outgoing
edges from S3&rsquo; remain unchanged. (At least to start.)</p>
<p>Therefore, the edge <code>S2 -&gt; S3</code> is now <code>S2 -&gt; S3'</code>. We can now merge
the contexts:</p>
<pre tabindex="0"><code>S1,S3  = {C0:e, C1:d, C2:c}
S2,S3&#39; = {C0:e, C1:c, C2:d}
</code></pre><p>Now we examine the outgoing edge S3&rsquo; -&gt; S3. We cannot merge these
contexts, so we search (greedily, I guess) for a clone of S3 where we
can merge the contexts. We find one in S3&rsquo; itself, and hence we redirect the
edge to point at S3&rsquo; (giving S3&rsquo; -&gt; S3&rsquo;) and we are done. (I think the actual search we want is
to first look for a clone of S3 that is using literally the same
context as us (i.e., same root node), as in this case. If that is not
found, <em>then</em> we search for one with a mergeable context. If <em>that</em>
fails, then we clone a new state.)</p>
<p>The final state thus has two copies of S3, one for the path from S1,
and one for the path from S2, which gives us enough context to
proceed.</p>
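<p>Putting the pieces together, the walk described above can be sketched as a small union-find plus a DFS. This is one reading of the procedure, not the LALRPOP implementation: the <code>Lanes</code> type and its methods are invented for this example, states are plain indices, and a context set is stored as a map from lookahead token to conflict so that it cannot become inconsistent.</p>

```rust
use std::collections::BTreeMap;

/// A context set, stored as a map from each lookahead token to its conflict.
type Context = BTreeMap<&'static str, &'static str>;

/// Union two context sets; `None` if a token would then select two conflicts.
fn merge(a: &Context, b: &Context) -> Option<Context> {
    let mut out = a.clone();
    for (token, conflict) in b {
        if *out.entry(*token).or_insert(*conflict) != *conflict {
            return None;
        }
    }
    Some(out)
}

#[derive(Default)]
struct Lanes {
    rows: Vec<Context>,     // each state's own row from the lane table
    original: Vec<usize>,   // the state this one was cloned from (or itself)
    edges: Vec<Vec<usize>>, // successors within the lane table
    parent: Vec<usize>,     // union-find parent pointers
    context: Vec<Context>,  // merged context set, meaningful at roots only
    visited: Vec<bool>,
}

impl Lanes {
    fn add(&mut self, row: Context, original: Option<usize>, edges: Vec<usize>) -> usize {
        let id = self.rows.len();
        self.rows.push(row.clone());
        self.original.push(original.unwrap_or(id));
        self.edges.push(edges);
        self.parent.push(id);
        self.context.push(row);
        self.visited.push(false);
        id
    }

    fn find(&self, mut s: usize) -> usize {
        while self.parent[s] != s { s = self.parent[s]; }
        s
    }

    /// Union the sets containing `x` and `y` if their contexts are compatible.
    fn try_union(&mut self, x: usize, y: usize) -> bool {
        let (rx, ry) = (self.find(x), self.find(y));
        let merged = merge(&self.context[rx], &self.context[ry]);
        if let Some(merged) = &merged {
            self.context[rx] = merged.clone();
            self.parent[ry] = rx;
        }
        merged.is_some()
    }

    /// Decide where edge `x -> y` should point: `y`, a copy of `y`, or a new clone.
    fn target_for(&mut self, x: usize, y: usize) -> usize {
        if self.try_union(x, y) {
            return y;
        }
        let (orig, root) = (self.original[y], self.find(x));
        let copies: Vec<usize> =
            (0..self.rows.len()).filter(|&s| self.original[s] == orig).collect();
        // Prefer a copy already sharing our root, then one we can merge with.
        if let Some(&copy) = copies.iter().find(|&&c| self.find(c) == root) {
            return copy;
        }
        if let Some(&copy) = copies.iter().find(|&&c| self.try_union(x, c)) {
            return copy;
        }
        // Otherwise clone the original, starting again from its own row.
        let (row, edges) = (self.rows[orig].clone(), self.edges[orig].clone());
        let copy = self.add(row, Some(orig), edges);
        assert!(self.try_union(x, copy), "this sketch gives up on a real conflict");
        copy
    }

    fn visit(&mut self, x: usize) {
        if std::mem::replace(&mut self.visited[x], true) {
            return;
        }
        for i in 0..self.edges[x].len() {
            let y = self.edges[x][i];
            let target = self.target_for(x, y);
            self.edges[x][i] = target; // redirect the edge if we chose a copy
            self.visit(target);
        }
    }
}

/// The G1 example: states 0, 1, 2 are S1, S2, S3; the clone S3' becomes 3.
fn isolate_g1() -> Lanes {
    let mut lanes = Lanes::default();
    lanes.add(Context::from([("d", "C1"), ("c", "C2")]), None, vec![2]);
    lanes.add(Context::from([("c", "C1"), ("d", "C2")]), None, vec![2]);
    lanes.add(Context::from([("e", "C0")]), None, vec![2]);
    lanes.visit(0); // beachhead S1
    lanes.visit(1); // beachhead S2
    lanes
}
```

<p>On the G1 lane table this ends with S1 pointing at S3, S2 pointing at the clone S3&rsquo;, and each copy of S3 looping back to itself, matching the walkthrough. The sketch panics rather than reporting an error if even a fresh clone cannot be merged.</p>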
<h3 id="conclusion">Conclusion</h3>
<p>As I wrote, I&rsquo;ve been experimenting with the Lane Table algorithm in LALRPOP and
I now have a simple prototype that seems to work. It is not by any means
exhaustively tested &ndash; in fact, I&rsquo;d call it minimally tested &ndash; but hopefully
I&rsquo;ll find some time to play around with it some more and take it through
its paces. It at least handles the examples in the paper.</p>
<p>The implementation is also inefficient in various ways. Some of them
are minor &ndash; it clones more than it needs to, for example &ndash; and
easily corrected. But I also suspect that one can do a lot more
caching and sharing of results. Right now, for example, I construct
the lane table for each inconsistent state completely from scratch,
but perhaps there are ways to preserve and share results (it seems
naively as if this should be possible). On the other hand,
constructing the lane table can probably be made pretty fast: it
doesn&rsquo;t have to traverse that much of the grammar. I&rsquo;ll have to try it
on some bigger examples and see how it scales.</p>
<h3 id="edits">Edits</h3>
<ul>
<li>The lane table I originally described had the wrong value for the
successor column. Corrected.</li>
</ul>
]]></content></entry><entry><title type="html">Nested method calls via two-phase borrowing</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/03/01/nested-method-calls-via-two-phase-borrowing/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/03/01/nested-method-calls-via-two-phase-borrowing/</id><published>2017-03-01T00:00:00+00:00</published><updated>2017-03-01T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In my previous post, I
<a href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/">outlined a plan for non-lexical lifetimes</a>. I wanted to write a
follow-up post today that discusses different ways that we can extend
the system to support nested mutable calls. The ideas here are based
on some of the ideas that emerged in a
<a href="https://internals.rust-lang.org/t/accepting-nested-method-calls-with-an-mut-self-receiver/4588">recent discussion on internals</a>, although what I describe
here is a somewhat simplified variant. If you want more background,
it&rsquo;s worth reading at least the top post in the thread, where I laid
out a lot of the history here. I&rsquo;ll try to summarize the key bits as I
go.</p>
<h3 id="the-problem-wed-like-to-solve">The problem we&rsquo;d like to solve</h3>
<p><em>This section is partially copied from the internals post; if you&rsquo;ve read
that, feel free to skip or skim.</em></p>
<p>The overriding goal here is that we want to accept nested method calls
where the outer call is an <code>&amp;mut self</code> method, like
<code>vec.push(vec.len())</code>. This limitation regularly trips up
beginners, who find it confusing, and it remains a persistent annoyance
for experienced users. This makes it a natural target to eliminate as
part of the <a href="https://github.com/rust-lang/rfcs/blob/master/text/1774-roadmap-2017.md">2017 Roadmap</a>.</p>
<p>You may wonder why this code isn&rsquo;t accepted in the first place. To see
why, consider what the resulting MIR looks like (I&rsquo;m going to number
the statements for later reference in the post):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cm">/* 0 */</span><span class="w"> </span><span class="n">tmp0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">vec</span><span class="p">;</span><span class="w">       </span><span class="c1">// mutable borrow starts here.. -+
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 1 */</span><span class="w"> </span><span class="n">tmp1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">vec</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- shared borrow overlaps here         |
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 2 */</span><span class="w"> </span><span class="n">tmp2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">len</span><span class="p">(</span><span class="n">tmp1</span><span class="p">);</span><span class="w"> </span><span class="c1">//                               |
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 3 */</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">push</span><span class="p">(</span><span class="n">tmp0</span><span class="p">,</span><span class="w"> </span><span class="n">tmp2</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;--.. and ends here-----------+
</span></span></span></code></pre></div><p>As you can see, we first take a mutable reference to <code>vec</code> for
<code>tmp0</code>. This &ldquo;locks&rdquo; <code>vec</code> from being accessed in any other way until
after the call to <code>Vec::push()</code>, but then we try to access it again
when calling <code>vec.len()</code>. Hence the error.</p>
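<p>For reference, the usual manual workaround is to evaluate the argument into a temporary before taking the mutable borrow. A minimal sketch, where <code>push_len</code> is just an illustrative name:</p>

```rust
/// Append the current length of `vec` to `vec` itself.
fn push_len(vec: &mut Vec<usize>) {
    // Evaluate the argument first, so the shared borrow used by `len()`
    // has ended before the mutable borrow for `push()` begins.
    let len = vec.len();
    vec.push(len);
}
```

<p>This is accepted because the two borrows no longer overlap, but it is exactly the kind of rewriting that the nested form is meant to make unnecessary.</p>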
<p>When you see the code desugared in that way, it should not surprise
you that there is in fact a real danger here for code to crash if we
just &ldquo;turned off&rdquo; this check (if we even could do such a thing). For
example, consider this rather artificial Rust program:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;Hello, &#34;</span><span class="p">)];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">push_str</span><span class="p">({</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;foo&#34;</span><span class="p">));</span><span class="w"> </span><span class="s">&#34;World!&#34;</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//              ^^^^^^^^^^^^^^^^^^^^^^ sneaky attempt to mutate `v`
</span></span></span></code></pre></div><p>The problem is that, when we desugar this, we get:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;Hello, &#34;</span><span class="p">)];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// creates a reference into `v`&#39;s current data array:
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">arg0</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="nb">String</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">arg1</span>: <span class="kp">&amp;</span><span class="kt">str</span> <span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// potentially frees `v`&#39;s data array:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">v</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;foo&#34;</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;World!&#34;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// uses pointer into data array that may have been freed:
</span></span></span><span class="line"><span class="cl"><span class="nb">String</span>::<span class="n">push_str</span><span class="p">(</span><span class="n">arg0</span><span class="p">,</span><span class="w"> </span><span class="n">arg1</span><span class="p">)</span><span class="w">
</span></span></span></code></pre></div><p>So, to put it another way, as we evaluate the arguments, we are
creating references and pointers that we will give to the final
function. But evaluating arguments can also have arbitrary
side-effects, which might invalidate the references that we prepared
for earlier arguments. So we have to be sure to rule that out.</p>
<p>In fact, even when the receiver is just a local variable (e.g.,
<code>vec.push(vec.len())</code>) we have to be wary. We wouldn&rsquo;t want it to be
possible to give ownership of the receiver away in one of the
arguments: <code>vec.push({ send_to_another_thread(vec); ... })</code>. That
should still be an error of course.</p>
<p>(Naturally, these complex arguments that are blocks look really
artificial, but keep in mind that most of the time when this occurs in
practice, the argument is a method or fn call, and that could in
principle have arbitrary side-effects.)</p>
<h3 id="how-can-we-fix-this">How can we fix this?</h3>
<p>Now, we could address this by changing how we desugar method calls
(and indeed the <a href="https://internals.rust-lang.org/t/accepting-nested-method-calls-with-an-mut-self-receiver/4588">original post on the internals thread</a>
contained two such alternatives). But I am more interested in seeing
if we can keep the current desugaring, but enrich the lifetime and
borrowing system so that it type-checks for cases that we can see
won&rsquo;t lead to a crash (such as this one).</p>
<p>The key insight is that, today, when we execute the mutable borrow of
<code>vec</code>, we start a borrow <strong>immediately</strong>, even though the reference
(<code>tmp0</code>, here) is not going to be used until later:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cm">/* 0 */</span><span class="w"> </span><span class="n">tmp0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">vec</span><span class="p">;</span><span class="w">   </span><span class="c1">// mutable borrow created here..
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 1 */</span><span class="w"> </span><span class="n">tmp1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">vec</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- shared borrow overlaps here         |
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 2 */</span><span class="w"> </span><span class="n">tmp2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">len</span><span class="p">(</span><span class="n">tmp1</span><span class="p">);</span><span class="w"> </span><span class="c1">//                               |
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 3 */</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">push</span><span class="p">(</span><span class="n">tmp0</span><span class="p">,</span><span class="w"> </span><span class="n">tmp2</span><span class="p">);</span><span class="w"> </span><span class="c1">// ..but not used until here!
</span></span></span></code></pre></div><p>The proposal &ndash; which I will call <strong>two-phased mutable borrows</strong> &ndash; is
to modify the borrow-checker so that mutable borrows operate in <strong>two
phases</strong>:</p>
<ul>
<li>When an <code>&amp;mut</code> reference is first created, but before it is used,
the borrowed path (e.g., <code>vec</code>) is considered <strong>reserved</strong>. A
reserved path is subject to the same restrictions as a shared borrow
&ndash; reads are ok, but moves and writes are not (except under a
<code>Cell</code>).</li>
<li>Once you start using the reference in some way, the path is
considered <strong>mutably borrowed</strong> and is subject to the usual
restrictions.</li>
</ul>
<p>So, in terms of our example, when we execute the MIR statement <code>tmp0 = &amp;mut vec</code>, that creates a <strong>reservation</strong> on <code>vec</code>, but doesn&rsquo;t start
the actual borrow yet. <code>tmp0</code> is not used until line 3, so that means
that for lines 1 and 2, <code>vec</code> is only reserved. Therefore, it&rsquo;s ok to
share <code>vec</code> (as line 1 does) so long as the resulting reference
(<code>tmp1</code>) is dead as we enter line 3. Since <code>tmp1</code> is only used to call
<code>Vec::len()</code>, we&rsquo;re all set!</p>
<h3 id="code-we-would-not-accept">Code we would not accept</h3>
<p>To help understand the rule, let&rsquo;s look at a few other examples, but
this time we&rsquo;ll consider examples that would be rejected as illegal
(both today and under the new rules). We&rsquo;ll start with the example we
saw before that could have triggered a use-after-free:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">v</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;Hello, &#34;</span><span class="p">)];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">v</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">push_str</span><span class="p">({</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;foo&#34;</span><span class="p">));</span><span class="w"> </span><span class="s">&#34;World!&#34;</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>We can <em>partially</em> desugar the call to <code>push_str()</code> into MIR
that would look something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cm">/* 0 */</span><span class="w"> </span><span class="n">tmp0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 1 */</span><span class="w"> </span><span class="n">tmp1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">IndexMut</span>::<span class="n">index_mut</span><span class="p">(</span><span class="n">tmp0</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 2 */</span><span class="w"> </span><span class="n">tmp2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 3 */</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">push</span><span class="p">(</span><span class="n">tmp2</span><span class="p">,</span><span class="w"> </span><span class="fm">format!</span><span class="p">(</span><span class="s">&#34;foo&#34;</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 4 */</span><span class="w"> </span><span class="n">tmp3</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">&#34;World!&#34;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 5 */</span><span class="w"> </span><span class="nb">String</span>::<span class="n">push_str</span><span class="p">(</span><span class="n">tmp1</span><span class="p">,</span><span class="w"> </span><span class="n">tmp3</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>In one sense, this example turns out to be not that interesting in
terms of the new rules. This is because <code>v[0]</code> is actually an
overloaded operator; when we desugar it, we see that <code>v</code> would be
reserved on line 0 and then (mutably) borrowed starting on line 1.
This borrow extends as long as <code>tmp1</code> is in use, which is to say, for
the remainder of the example. Therefore, line 2 is an error, because
we cannot have two mutable borrows at once.</p>
<p>However, in another sense, this example is very interesting: this is
because it shows how, while the new system is more expressive, it
preserves the existing behavior of safe abstractions. That is,
<a href="https://doc.rust-lang.org/std/ops/trait.IndexMut.html#tymethod.index_mut">the <code>index_mut()</code> method</a> has a signature like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">index_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="nc">Idx</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Output</span><span class="w">
</span></span></span></code></pre></div><p>Since calling this method is going to &ldquo;use&rdquo; the receiver, and hence
activate the borrow, the method is guaranteed that as long as its
return value is in use, the caller will not be able to access the
receiver. This is precisely how it works today as well.</p>
<p>The next example is artificial but inspired by one that is covered
in my original post to the internals thread:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cm">/*0*/</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/*1*/</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w"> </span><span class="c1">// (reservation of `i` starts here)
</span></span></span><span class="line"><span class="cl"><span class="cm">/*2*/</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">j</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w">      </span><span class="c1">// OK: `i` is only reserved here
</span></span></span><span class="line"><span class="cl"><span class="cm">/*3*/</span><span class="w"> </span><span class="o">*</span><span class="n">p</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">        </span><span class="c1">// (mutable borrow of `i` starts here, since `p` is used)
</span></span></span><span class="line"><span class="cl"><span class="cm">/*4*/</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w">      </span><span class="c1">// ERROR: `i` is mutably borrowed here
</span></span></span><span class="line"><span class="cl"><span class="cm">/*5*/</span><span class="w"> </span><span class="o">*</span><span class="n">p</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">       
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c1">// (mutable borrow ends here, since `p` is not used after this point)
</span></span></span></code></pre></div><p>This code fails to compile as well. What happens, as you can see in
the comments, is that <code>i</code> is considered <em>reserved</em> during the first
read, but once we start using <code>p</code> on line 3, <code>i</code> is considered
borrowed. Hence the second read (on line 4) results in an
error. Interestingly, if line 5 were to be removed, then the program
would be accepted (at least once we move to
<a href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/">NLL</a>), since the borrow
only extends until the last use of <code>p</code>.</p>
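<p>The reserved/active distinction in this example can be modelled as a tiny state machine. This is only a toy model under strong assumptions: a single variable <code>i</code>, at most one reference <code>p</code>, and straight-line code. The names <code>Stmt</code>, <code>Phase</code> and <code>check</code> are invented for the sketch, and it is not how the borrow checker in the compiler is implemented.</p>

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Stmt {
    BorrowMut, // `p = &mut i`
    UseRef,    // any use of `p`, such as `*p += 1`
    Read,      // a read of `i`, such as `let j = i`
    Write,     // a write to `i`, such as `i = 5`
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Phase {
    Unborrowed,
    Reserved,
    Active,
}

/// Returns the indices of the statements that the two-phase rule rejects.
fn check(stmts: &[Stmt]) -> Vec<usize> {
    // As under NLL, the borrow ends after the last use of `p`.
    let last_use = stmts.iter().rposition(|stmt| *stmt == Stmt::UseRef);
    let mut phase = Phase::Unborrowed;
    let mut errors = Vec::new();
    for (index, stmt) in stmts.iter().enumerate() {
        match (*stmt, phase) {
            // Creating the reference only reserves `i`, and only if `p`
            // is actually used later on.
            (Stmt::BorrowMut, _) if last_use.map_or(false, |last| last > index) => {
                phase = Phase::Reserved
            }
            // The first use of `p` activates the borrow.
            (Stmt::UseRef, Phase::Reserved) => phase = Phase::Active,
            // Reads are fine while reserved, but not once the borrow is active.
            (Stmt::Read, Phase::Active) => errors.push(index),
            // Writes are rejected in both phases.
            (Stmt::Write, Phase::Reserved) | (Stmt::Write, Phase::Active) => errors.push(index),
            _ => {}
        }
        if last_use == Some(index) {
            phase = Phase::Unborrowed;
        }
    }
    errors
}
```

<p>Feeding it the six statements above, with index 0 standing for the initialization of <code>i</code>, reports an error only at index 4. Dropping the final use of <code>p</code> makes the list of errors empty, because the borrow then ends right after line 3.</p>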
<p>The final example shows that this analysis doesn&rsquo;t permit <strong>any</strong> kind
of nesting you might want. In particular, for better or worse, it does
not permit calls to <code>&amp;mut self</code> methods to be nested inside of a call
to an <code>&amp;self</code> method. This means that something like
<code>vec.get({vec.push(2); 0})</code> would be illegal. To see why, let&rsquo;s check
out the (partial) MIR desugaring:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cm">/* 0 */</span><span class="w"> </span><span class="n">tmp0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">vec</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 1 */</span><span class="w"> </span><span class="n">tmp1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">vec</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 2 */</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">push</span><span class="p">(</span><span class="n">tmp1</span><span class="p">,</span><span class="w"> </span><span class="mi">2</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 3 */</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">get</span><span class="p">(</span><span class="n">tmp0</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>Now, you might expect that this would be accepted, because the borrow
on line 0 would not be active until line 3. But this isn&rsquo;t quite
right. As I described it, only mutable
borrows have a reserve/active cycle; shared borrows start right
away. The reason for this is that <strong>when a path is reserved, it
acts the same as if it had been shared</strong>. So, in other words, even if
we used two-phase borrowing for shared borrows, it would make no
difference (which is why I described reservations as only applying to
mutable borrows). At the end of the post, I&rsquo;ll describe how we could
&ndash; if we wanted &ndash; support examples like this, at the cost of making
the system slightly more complex.</p>
<h3 id="how-to-implement-it">How to implement it</h3>
<p>The way I envision implementing this rule is as part of borrow check.
Borrow check is the final pass that executes as part of the compiler&rsquo;s
safety checking procedure. In case you&rsquo;re not familiar with how the
compiler works, Rust&rsquo;s safety check is done using three passes:</p>
<ul>
<li>Normal type check (like any other language);</li>
<li>Lifetime check (infers the lifetimes for each reference, as described in
<a href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/">my previous post</a>);</li>
<li>Borrow check (using the lifetimes for each borrow, checks that all uses are acceptable,
and that variables are not moved).</li>
</ul>
<h4 id="how-borrow-check-would-work-before-this-proposal">How borrow check would work before this proposal</h4>
<p>Before two-phase borrows, then, the way the borrow-check would begin
is to iterate over every borrow in the program. Since the lifetime
check has completed, we know the lifetimes of every reference and
every borrow. In MIR, borrows always look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">var</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;lt</span><span class="w"> </span><span class="k">mut</span><span class="o">?</span><span class="w"> </span><span class="n">lvalue</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//   ^^^ ^^^^
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//   |   |
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//   |   distinguish `&amp;mut` or `&amp;` borrow
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">//   lifetime of borrow
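</span></span></span></code></pre></div><p>This says &ldquo;borrow <code>lvalue</code> for the lifetime <code>'lt</code>&rdquo; (recall that, under
NLL,
<a href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/#step-0-what-is-a-lifetime">each lifetime is a set of points in the MIR control-flow graph</a>). So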
we would go and, for each point in <code>'lt</code>, add <code>lvalue</code> to the list of
borrowed things at that point.  If we find that <code>lvalue</code> is already
borrowed at that point, we would check that the two borrows are
compatible (both must be shared borrows).</p>
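<p>A minimal sketch of this first step, under some loud assumptions: points are plain indices, lvalues are compared by name only (no overlapping paths such as <code>a.b</code> versus <code>a</code>), and the names <code>MirBorrow</code>, <code>BorrowKind</code>, <code>BorrowMap</code>, and <code>record_borrows</code> are made up for illustration. None of this is rustc&rsquo;s actual data structure.</p>

```rust
use std::collections::BTreeMap;

/// A point in the MIR control-flow graph (hypothetical: just an index).
pub type Point = usize;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BorrowKind {
    Shared,
    Mut,
}

/// One MIR borrow `var = &'lt mut? lvalue`, with `'lt` given as a list of points.
pub struct MirBorrow {
    pub lvalue: &'static str,
    pub kind: BorrowKind,
    pub lifetime: Vec<Point>,
}

/// For each point, the borrows in scope there, as `(lvalue, kind)` pairs.
pub type BorrowMap = BTreeMap<Point, Vec<(&'static str, BorrowKind)>>;

/// Adds every borrow to every point of its lifetime. Returns `Err(point)`
/// for the first point at which two borrows of the same lvalue are
/// incompatible, i.e. at which they are not both shared.
pub fn record_borrows(borrows: &[MirBorrow]) -> Result<BorrowMap, Point> {
    let mut map = BorrowMap::new();
    for b in borrows {
        for &point in &b.lifetime {
            let in_scope = map.entry(point).or_default();
            for &(lvalue, kind) in in_scope.iter() {
                let both_shared =
                    kind == BorrowKind::Shared && b.kind == BorrowKind::Shared;
                if lvalue == b.lvalue && !both_shared {
                    return Err(point);
                }
            }
            in_scope.push((b.lvalue, b.kind));
        }
    }
    Ok(map)
}
```

<p>Two overlapping shared borrows of <code>vec</code> are recorded without complaint, whereas a mutable borrow whose lifetime overlaps a shared one is reported at the first point they have in common.</p>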
<p>At this point, we now have a list of what is borrowed at each point in
the program, and whether that is a shared or mutable borrow. We can then
iterate over all statements and check that they are using the values in
a compatible way. So, for example, if we see a MIR statement like:</p>
<pre><code>k = i // where k, i are integers
</code></pre>
<p>then this would be illegal if <code>k</code> is borrowed in any way (shared or
mutable).  It would also be illegal if <code>i</code> is mutably borrowed.
Similarly, it is an error if we see a move from a path <code>p</code> when <code>p</code> is
borrowed (directly or indirectly). And so forth.</p>
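<p>The per-statement rules can be written down as a small table. This is only a sketch: <code>BorrowState</code>, <code>Access</code>, <code>access_ok</code>, and <code>assignment_ok</code> are hypothetical names, and the state of each path at the current point is assumed to have been looked up already.</p>

```rust
/// How a path is borrowed at the point being checked.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BorrowState {
    Unborrowed,
    Shared,
    Mut,
}

/// What a statement does to a path.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Access {
    Read,
    Write,
    Move,
}

/// Is `access` to a path permitted while the path is in `state`?
pub fn access_ok(access: Access, state: BorrowState) -> bool {
    match (access, state) {
        // An unborrowed path may be read, written, or moved.
        (_, BorrowState::Unborrowed) => true,
        // A shared borrow still permits reads of the original path.
        (Access::Read, BorrowState::Shared) => true,
        // Everything else (writes or moves while borrowed, any access
        // while mutably borrowed) is an error.
        _ => false,
    }
}

/// Checks the MIR statement `k = i` for integers: a write to `k` and a read of `i`.
pub fn assignment_ok(k: BorrowState, i: BorrowState) -> bool {
    access_ok(Access::Write, k) && access_ok(Access::Read, i)
}
```

<p>With this table, <code>k = i</code> is rejected when <code>k</code> is borrowed at all or when <code>i</code> is mutably borrowed, and a move is rejected whenever the path is borrowed.</p>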
<h4 id="supporting-two-phases">Supporting two-phases</h4>
<p>To support two-phases, we can extend borrow-check in a simple way.
When we encounter a mutable borrow:</p>
<pre><code>var = &amp;'lt mut lvalue;
</code></pre>
<p>we do not go and immediately mark <code>lvalue</code> as borrowed for all the
points in <code>'lt</code>. Instead, we find the points <code>A</code> in <code>'lt</code> where the
borrow is <strong>active</strong>. This corresponds to any point where <code>var</code> is
used and any point that is reachable from a use (this is a very simple
inductive definition one can easily find with a data-flow
analysis). For each point in <code>A</code>, we mark that <code>lvalue</code> is mutably
borrowed. For the remaining points <code>'lt - A</code>, we would mark <code>lvalue</code> as merely
<em>reserved</em>. We can then do the next part of the check just as before,
except that anywhere that an lvalue is treated as reserved, it is
subject to the same restrictions as if it were shared.</p>
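<p>Here is one way the split into active and reserved points might be computed, as a worklist-style reachability pass. It is a sketch under assumptions: points are indices, the CFG is supplied as a hypothetical <code>successors</code> callback, and <code>Phase</code> and <code>phases</code> are names I am inventing for illustration.</p>

```rust
use std::collections::BTreeSet;

/// A point in the MIR control-flow graph (hypothetical: just an index).
pub type Point = usize;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Phase {
    Reserved,
    Active,
}

/// Splits the lifetime of a mutable borrow `var = &'lt mut lvalue` into phases.
///
/// * `lifetime`: the points in `'lt`.
/// * `uses`: the points where `var` is used.
/// * `successors(p)`: the CFG successors of point `p`.
///
/// A point of `'lt` is `Active` if it is a use of `var` or is reachable from
/// one (staying inside `'lt`); every other point of `'lt` is `Reserved`.
pub fn phases(
    lifetime: &BTreeSet<Point>,
    uses: &[Point],
    successors: impl Fn(Point) -> Vec<Point>,
) -> Vec<(Point, Phase)> {
    let mut active = BTreeSet::new();
    let mut worklist: Vec<Point> = uses.to_vec();
    while let Some(p) = worklist.pop() {
        // Ignore points outside `'lt`; visit each remaining point once.
        if lifetime.contains(&p) && active.insert(p) {
            worklist.extend(successors(p));
        }
    }
    lifetime
        .iter()
        .map(|&p| {
            let phase = if active.contains(&p) {
                Phase::Active
            } else {
                Phase::Reserved
            };
            (p, phase)
        })
        .collect()
}
```

<p>For straight-line code in which the borrow is created at point 0, its lifetime is taken to be the points 1 through 3, and the only use is at point 3, this yields reserved at points 1 and 2 and active at point 3.</p>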
<h3 id="comparing-to-other-approaches">Comparing to other approaches</h3>
<p>There have been a number of proposals aimed at solving this same
problem.  This particular proposal is, I believe, a new variant, but
it accepts a similar set of programs to the other proposals. I wanted
to compare and contrast it a bit with prior ideas and try to explain
why I framed it in just this way.</p>
<h4 id="borrowing-for-the-future">Borrowing for the future.</h4>
<p>My own first stab at this problem was using the idea of &ldquo;borrowing for
the future&rdquo;, <a href="https://internals.rust-lang.org/t/accepting-nested-method-calls-with-an-mut-self-receiver/4588">described in the internals thread</a>. The basic
idea was that the lifetime of a borrow would be inferred <strong>to start on
the first use</strong>, and the borrow checker, when it sees a borrow that
doesn&rsquo;t start immediately, would consider the path &ldquo;reserved&rdquo; until
the start. This is obviously very close to what I have presented
here. <strong>The key difference is that here the borrow checker itself
computes the active vs reserved portions of the borrow, rather than
this computation being done in lifetime inference.</strong></p>
<p>This seems to me to be more appropriate: lifetime inference figures
out how long a given reference is live (may later be used), based on
the type system and its rules. The borrow checker then uses that
information to figure out if the program may cause the reference to be
invalidated.</p>
<p>The formulation I presented here also fits much better with the
<a href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/">NLL rules</a> that I presented previously. This is because it
allows us to keep the rule that when a reference is <em>live</em> at some
point P (may be dereferenced later), its lifetime includes that point
P. To see what I mean, let&rsquo;s reconsider our original example, but in
the &ldquo;borrowing for the future&rdquo; scheme. I&rsquo;ll annotate lifetimes using
braces to describe sets:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cm">/* 0 */</span><span class="w"> </span><span class="n">tmp0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="p">{</span><span class="mi">3</span><span class="p">}</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">vec</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 1 */</span><span class="w"> </span><span class="n">tmp1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">vec</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 2 */</span><span class="w"> </span><span class="n">tmp2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">len</span><span class="p">(</span><span class="n">tmp1</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 3 */</span><span class="w"> </span><span class="nb">Vec</span>::<span class="n">push</span><span class="p">(</span><span class="n">tmp0</span><span class="p">,</span><span class="w"> </span><span class="n">tmp2</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>Here <code>tmp0</code> would have the type <code>&amp;{3} mut Vec</code>, but <code>tmp0</code> is clearly
live at point 1 (i.e., it will be used later, on line 3). So we would
have to make the <a href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/">NLL rules</a> that I outlined earlier incorporate a
more complex invariant, one that considers two-phase borrows as a
first-class thing (cue next piece of &lsquo;related work&rsquo; in 1&hellip;2&hellip;3&hellip;.).</p>
<h3 id="two-phase-lifetimes">Two-phase lifetimes</h3>
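<p>In the internals thread, arielb1 had <a href="https://internals.rust-lang.org/t/accepting-nested-method-calls-with-an-mut-self-receiver/4588/24?u=nikomatsakis">an interesting proposal</a>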
that they called &ldquo;two-phase lifetimes&rdquo;. The goal was precisely to take
the &ldquo;two-phase&rdquo; concept but incorporate it into lifetime inference,
rather than handling it in borrow checking as I present here. The idea
was to define a type <code>RefMut&lt;'r, 'w, T&gt;</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> which stands in for a
kind of &ldquo;richer&rdquo; <code>&amp;mut</code> type.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> In particular, it has two
lifetimes:</p>
<ul>
<li><code>'r</code> is the &ldquo;read&rdquo; lifetime. It includes every point where the reference
may later be used.</li>
<li><code>'w</code> is a subset of <code>'r</code> (that is, <code>'r: 'w</code>) which indicates the &ldquo;write&rdquo; lifetime.
This includes those points where the reference is actively being written.</li>
</ul>
<p>We can then conservatively translate a <code>&amp;'a mut T</code> type into
<code>RefMut&lt;'a, 'a, T&gt;</code> &ndash; that is, we can use <code>'a</code> for both of the two
lifetimes. This is what we would do for any <code>&amp;mut</code> type that appears
in a struct declaration or fn interface. But for <code>&amp;mut T</code> types within
a fn body, we can infer the two lifetimes somewhat separately: the
<code>'r</code> lifetime is computed just as I described in my
<a href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/">NLL post</a>. But the <code>'w</code> lifetime only needs to include those
points where a write occurs. The borrow check would then guarantee
that the <code>'w</code> region of every <code>&amp;mut</code> borrow is disjoint from the <code>'r</code>
regions of every other borrow (and from shared borrows).</p>
<p>This proposal accepts more programs than the one I outlined. In
particular, it accepts the example with interleaved reads and writes
that we saw earlier. Let me give that example again, but annotating
the regions more explicitly:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cm">/* 0 */</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 1 */</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="nc">RefMut</span><span class="o">&lt;</span><span class="p">{</span><span class="mi">2</span><span class="o">-</span><span class="mi">5</span><span class="p">},</span><span class="w"> </span><span class="p">{</span><span class="mi">3</span><span class="p">,</span><span class="mi">5</span><span class="p">},</span><span class="w"> </span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                    ^^^^^  ^^^^^
</span></span></span><span class="line"><span class="cl"><span class="c1">//                     &#39;r     &#39;w
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 2 */</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">j</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w">  </span><span class="c1">// just in &#39;r
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 3 */</span><span class="w"> </span><span class="o">*</span><span class="n">p</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">    </span><span class="c1">// must be in &#39;w
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 4 */</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="p">;</span><span class="w">  </span><span class="c1">// just in &#39;r
</span></span></span><span class="line"><span class="cl"><span class="cm">/* 5 */</span><span class="w"> </span><span class="o">*</span><span class="n">p</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">    </span><span class="c1">// must be in &#39;w
</span></span></span></code></pre></div><p>As you can see here, we would infer the write region to be just the
two points 3 and 5. This is precisely those portions of the CFG where
writes are happening &ndash; and not the gaps in between, where reads are
permitted.</p>
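<p>To make the disjointness condition concrete, here is a small sketch of the check as I read it. The names <code>Regions</code>, <code>well_formed</code>, <code>compatible_with</code>, and <code>read_at</code> are hypothetical, and modelling a plain read of <code>i</code> as a one-point shared borrow is my own simplification, not part of arielb1&rsquo;s proposal.</p>

```rust
use std::collections::BTreeSet;

/// A point in the control-flow graph (hypothetical: a plain index).
pub type Point = usize;

/// The two regions carried by a `RefMut<'r, 'w, T>`-style borrow.
pub struct Regions {
    /// 'r: points where the reference may later be used.
    pub read: BTreeSet<Point>,
    /// 'w: points where the reference is actively being written.
    pub write: BTreeSet<Point>,
}

impl Regions {
    /// `'w` has to be a subset of `'r` (that is, `'r: 'w`).
    pub fn well_formed(&self) -> bool {
        self.write.is_subset(&self.read)
    }

    /// Two borrows of the same path are compatible when neither one's
    /// write region overlaps the other's read region.
    pub fn compatible_with(&self, other: &Regions) -> bool {
        self.write.is_disjoint(&other.read) && other.write.is_disjoint(&self.read)
    }
}

/// A plain read of the path at `point`, modelled (my simplification) as a
/// shared borrow whose read region is that single point.
pub fn read_at(point: Point) -> Regions {
    Regions {
        read: BTreeSet::from([point]),
        write: BTreeSet::new(),
    }
}
```

<p>With <code>'r = {2-5}</code> and <code>'w = {3,5}</code> as in the example, the reads of <code>i</code> at points 2 and 4 are compatible with <code>p</code>, while a read at point 3 would not be.</p>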
<h4 id="why-i-do-not-want-to-support-discontinuous-borrows">Why I do not want to support discontinuous borrows</h4>
<p>As you might have surmised, these sorts of &ldquo;discontinuous&rdquo; borrows
represent a kind of &ldquo;step up&rdquo; in the complexity of the system. If it
were vital to accept examples with interleaved writes like the
previous one, then this wouldn&rsquo;t bother me (NLL also represents such a
step, for example, but it seems clearly worth it). But given that the
example is artificial and not a pattern I have ever seen arise in
&ldquo;real life&rdquo;, it seems like we should try to avoid growing the
underlying complexity of the system if we can.</p>
<p>To see what I mean about a &ldquo;step up&rdquo; in complexity, consider how we
would integrate this proposal into lifetime inference. The current
rules treat all regions equally, but this proposal seems to imply that
regions have &ldquo;roles&rdquo;.  For example, the <code>'r</code> region captures the
&ldquo;liveness&rdquo; constraints that I described in the original NLL
proposal. Meanwhile the <code>'w</code> region captures &ldquo;activity&rdquo;.</p>
<p>(Since we would always convert a <code>&amp;'a mut T</code> type into <code>RefMut&lt;'a, 'a, T&gt;</code>, all regions in struct parameters would adopt the more
conservative &ldquo;liveness&rdquo; role to start. This is good because we
wouldn&rsquo;t want to start allowing &ldquo;holes&rdquo; in the lifetimes that unsafe
code is relying on to prevent access from the outside. It would
however be possible for type inference to use a <code>RefMut&lt;'r, 'w, T&gt;</code>
type as the value for a type parameter; I don&rsquo;t yet see a way for that
to cause any surprises, but perhaps it can if you consider
specialization and other non-parametric features.)</p>
<p>Another example of where this &ldquo;complexity step&rdquo; surfaces came from
<a href="https://www.ralfj.de/blog/">Ralf Jung</a>. As you may know, Ralf is working on a
formalization of Rust as part of the <a href="http://plv.mpi-sws.org/rustbelt/">RustBelt project</a> (if you&rsquo;re
interested, there is video available of a
<a href="https://air.mozilla.org/rust-paris-meetup-35-2017-01-19/">great introduction to this work</a> which Ralf gave at the Rust
Paris meetup). In any case, their model is a kind of generalization of
Rust, in that it can accept a lot of programs that standard Rust
cannot (it is intended to be used for assigning types to unsafe code
as well as safe code). The two-phase borrow proposal that I describe
here should be able to fit into that system in a fairly
straightforward way. But if we adopted discontinuous regions, that
would require making Ralf&rsquo;s system more expressive. This is not
necessarily an argument against doing it, but it does show that it
makes the Rust system qualitatively more complex to reason about.</p>
<p>If all this talk of &ldquo;steps in complexity&rdquo; seems abstract, I think that
the most immediate way it will surface is when we try to
<strong>teach</strong>. Supporting discontinuous borrows just makes it that much
harder to craft small examples that show how borrowing works. It will
make the system feel more mysterious, since the underlying rules are
indeed more complex and thus harder to &ldquo;intuit&rdquo; on your own.</p>
<h4 id="two-phase-lifetimes-without-discontinuous-borrows">Two-phase lifetimes without discontinuous borrows</h4>
<p>For a while I was planning to describe a variant on arielb1&rsquo;s proposal
where the write lifetimes were required to be continuous &ndash; in effect,
they would be required to be a suffix of the overall read lifetime;
this would make the proposal roughly equivalent to the current one.
Given that the set of accepted programs is the same, this
becomes more a question of <strong>presentation</strong> than anything.</p>
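<p>For straight-line code, where points can be ordered by index, the &ldquo;suffix&rdquo; requirement is easy to state as a check. This is a sketch with a hypothetical function name (<code>is_suffix</code>); a general control-flow graph would need a reachability-based definition instead.</p>

```rust
use std::collections::BTreeSet;

/// Returns true if the write region `write` is a suffix of the read region
/// `read`: once writing has started, every later point of `read` is also
/// in `write`. Assumes straight-line code, so points are ordered by index.
pub fn is_suffix(read: &BTreeSet<usize>, write: &BTreeSet<usize>) -> bool {
    match write.iter().next() {
        // An empty write region is trivially a suffix.
        None => true,
        Some(&first) => {
            write.is_subset(read) && read.range(first..).all(|p| write.contains(p))
        }
    }
}
```

<p>Under this check the discontinuous region <code>{3,5}</code> inside <code>{2-5}</code> is rejected, because point 4 lies after the first write but outside the write region, whereas <code>{3,4,5}</code> is accepted.</p>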
<p>I ultimately settled on the current presentation because it seems
simpler to me. In particular, lifetime inference today is based solely
on <strong>liveness</strong>, which is a &ldquo;forward-looking property&rdquo;. In other
words, something is live if it may be used <strong>later</strong>. In contrast, the
borrow check today is interested in tracking, at a particular point,
the &ldquo;backwards-looking property&rdquo; of whether something has been
borrowed. So adding another &ldquo;backwards-looking property&rdquo; &ndash; whether
that borrow has been activated &ndash; fits borrowck quite naturally.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<h3 id="possible-future-extensions">Possible future extensions</h3>
<p>There are two primary ways I see that we might extend this proposal in
the future. The first would be to allow &ldquo;discontinuous borrows&rdquo;, as I
described in the previous section under the heading &ldquo;Two-phase
lifetimes&rdquo;.</p>
<p>The other would be to apply the concept of reservations to <strong>all</strong>
borrows, and to loosen the restrictions we impose on a &ldquo;reserved&rdquo;
path. In this proposal, I chose to treat reserved and shared paths in
the same way. This implies that some forms of nesting do not work; for
example, as we saw in the examples, one cannot write
<code>vec.get({vec.push(2); 0})</code>. These conditions are stronger than is
strictly needed to prevent memory safety violations. We could consider
reserved borrows to be something akin to the old <code>const</code> borrows we
used to support: these would permit reads <strong>and</strong> writes of the
original path, but not moves. There are some tricky cases to be
careful of (for example, if you reserve <code>*b</code> where <code>b: Box&lt;i32&gt;</code>, you
cannot permit people to mutate <code>b</code>, because that would cause the
existing value to be dropped and hence invalidate your existing
reference to <code>*b</code>), but it seems like there is nothing fundamentally
stopping us. I did not propose this because (a) I would prefer not to
introduce a third class of borrow restrictions and (b) most examples
which would benefit from this change seem quite artificial and not
entirely desirable (though there are exceptions). Basically, it seems
ok for <code>vec.get({vec.push(2); 0})</code> to be an error. =)</p>
<h3 id="conclusion">Conclusion</h3>
<p>I have presented here a simple proposal that tries to address the
&ldquo;nested method call&rdquo; problem as part of the NLL work, without
modifying the desugaring into MIR at all (or changing MIR&rsquo;s dynamic
semantics). It works by augmenting the borrow checker so that mutable
borrows begin as &ldquo;reserved&rdquo; and then, on first use, convert to active
status. While the borrows are reserved, they impose the same
restrictions as a shared borrow.</p>
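<p>As a quick illustration of the target behaviour, here is the nested method call written as ordinary Rust. It assumes a compiler that implements two-phase borrows along these lines (to my knowledge, current stable rustc accepts it); <code>push_len</code> is just a name for the example.</p>

```rust
/// Pushes the vector's current length onto the vector itself.
pub fn push_len(mut vec: Vec<usize>) -> Vec<usize> {
    // Roughly desugars to `Vec::push(&mut vec, Vec::len(&vec))`: the mutable
    // borrow for `push` is taken first but is only used after `len` has run.
    vec.push(vec.len());
    vec
}
```

<p>The shared borrow taken by <code>len</code> falls entirely within the span where the mutable borrow is merely reserved, which is why the call can be accepted.</p>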
<p>In terms of the &ldquo;overall plans&rdquo; for NLL, I consider this to be the
second out of a series of three posts that lay out a complete proposal<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>:</p>
<ul>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/">the core NLL system</a>, covered in the previous post;</li>
<li>nested method calls, this post;</li>
<li>incorporating dropck, still to come.</li>
</ul>
<p><strong>Comments?</strong> Let&rsquo;s use <a href="https://internals.rust-lang.org/t/blog-post-nested-method-calls-via-two-phase-borrowing/4886">this internals thread for comments</a>.</p>
<h3 id="footnotes">Footnotes</h3>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>arielb1 called it <code>Ref2Φ&lt;'immut, 'mutbl, T&gt;</code>, but I&rsquo;m going to take the liberty of renaming it.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>arielb1 also proposed to unify <code>&amp;T</code> into this type, but that introduces complications because <code>&amp;T</code> are <code>Copy</code> but <code>&amp;mut</code> are not, so I&rsquo;m leaving that out too.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>In more traditional compiler terminology,
&ldquo;forwards-looking properties&rdquo; are ones computed using
a reverse data-flow analysis, and &ldquo;backwards-looking
properties&rdquo; are those that would be computed by a
forwards data-flow analysis.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Presuming I&rsquo;m not overlooking something. =)&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Non-lexical lifetimes using liveness and location</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/</id><published>2017-02-21T00:00:00+00:00</published><updated>2017-02-21T00:00:00+00:00</updated><content type="html"><![CDATA[<p>At the recent compiler design sprint,
<a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-nll">we spent some time discussing <strong>non-lexical lifetimes</strong></a>,
the plan to make Rust&rsquo;s lifetime system significantly more advanced. I
want to write up those plans here, and give some examples of the kinds
of programs that would now type-check, along with some that still will
not (for better or worse).</p>
<p>If you were at the sprint, then the system I am going to describe in
this blog post will actually sound quite a bit different than what we
were talking about. However, I believe it is equivalent to that
system. I am choosing to describe it differently because this version,
I believe, would be significantly more efficient to implement (if
implemented naively). I also find it rather easier to understand.</p>
<p>I have a <a href="https://github.com/nikomatsakis/nll">prototype implementation</a> of this system. The example
used in this post and the ones from previous posts have all
been tested in this prototype and work as expected.</p>
<h3 id="yet-another-example">Yet another example</h3>
<p>I&rsquo;ll start by giving an example that illustrates the system pretty
well, I think. This section also aims to give an intuition for how the
system works and what set of programs will be accepted without going
into any of the details. Somewhat oddly, I&rsquo;m going to number this
example as &ldquo;Example 4&rdquo;. This is because my <a href="http://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/">previous post</a>
introduced examples 1, 2, and 3. If you&rsquo;ve not read that post, you may
want to, but don&rsquo;t feel you have to. The presentation in this post is
intended to be independent.</p>
<h4 id="example-4-redefined-variables-and-liveness">Example 4: Redefined variables and liveness</h4>
<p>I think the key ingredient to understanding how NLL should work is
understanding <strong>liveness</strong>. The term &ldquo;liveness&rdquo; derives from compiler
analysis, but it&rsquo;s fairly intuitive. We say that <strong>a variable is live
if the current value that it holds may be used later</strong>. This is very
important to Example 4:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">foo</span><span class="p">,</span><span class="w"> </span><span class="n">bar</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">foo</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// `p` is live here: its value may be used on the next line.
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">condition</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `p` is live here: its value will be used on the next line.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">print</span><span class="p">(</span><span class="o">*</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `p` is DEAD here: its value will not be used.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">bar</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `p` is live here: its value will be used later.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// `p` is live here: its value may be used on the next line.
</span></span></span><span class="line"><span class="cl"><span class="n">print</span><span class="p">(</span><span class="o">*</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// `p` is DEAD here: its value will not be used.
</span></span></span></code></pre></div><p>Here you see a variable <code>p</code> that is assigned in the beginning of the
program, and then maybe re-assigned during the <code>if</code>. The key point is
that <code>p</code> becomes <strong>dead</strong> (not live) in the span before it is
reassigned.  This is true even though the variable <code>p</code> will be used
again, because the <strong>value</strong> that is in <code>p</code> will not be used.</p>
<p>So how does liveness relate to non-lexical lifetimes? The key rule is
this: <strong>Whenever a variable is live, all references that it may
contain are live.</strong> This is actually a finer-grained notion than just
the liveness of a variable, as we will see. For example, the first
assignment to <code>p</code> is <code>&amp;foo</code> &ndash; we want <code>foo</code> to be borrowed everywhere
that this assignment may later be accessed. This includes both
<code>print()</code> calls, but excludes the period after <code>p = &amp;bar</code>. Even though
the variable <code>p</code> is live there, it now holds a different reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">foo</span><span class="p">,</span><span class="w"> </span><span class="n">bar</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">foo</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// `foo` is borrowed here, but `bar` is not
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">condition</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">print</span><span class="p">(</span><span class="o">*</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// neither `foo` nor `bar` are borrowed here
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">bar</span><span class="p">;</span><span class="w">   </span><span class="c1">// assignment 1
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `foo` is not borrowed here, but `bar` is
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// both `foo` and `bar` are borrowed here
</span></span></span><span class="line"><span class="cl"><span class="n">print</span><span class="p">(</span><span class="o">*</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// neither `foo` nor `bar` are borrowed here,
</span></span></span><span class="line"><span class="cl"><span class="c1">// as `p` is dead
</span></span></span></code></pre></div><p>Our analysis will begin with the liveness of a variable (the
coarser-grained notion I introduced first). However, it will use
reachability to refine that notion of liveness to obtain the liveness
of individual <strong>values</strong>.</p>
<h4 id="control-flow-graphs-and-point-notation">Control-flow graphs and point notation</h4>
<p>Recall that in NLL-land, all reasoning about lifetimes and borrowing
will take place in the context of <a href="https://blog.rust-lang.org/2016/04/19/MIR.html">MIR</a>, in which programs are represented
as a control-flow graph. This is what Example 4 looks like as a control-flow graph:</p>
<pre tabindex="0"><code>// let mut foo: i32;
// let mut bar: i32;
// let p: &amp;i32;

A
[ p = &amp;foo     ]
[ if condition ] ----\ (true)
       |             |
       |     B       v
       |     [ print(*p)     ]
       |     [ ...           ]
       |     [ p = &amp;bar      ]
       |     [ ...           ]
       |     [ goto C        ]
       |             |
       +-------------/
       |
C      v
[ print(*p)    ]
[ return       ]
</code></pre><p>As a reminder, I will use a notation like <code>Block/Index</code> to refer to a
specific point (statement) in the control-flow graph. So <code>A/0</code> and
<code>B/2</code> refer to <code>p = &amp;foo</code> and <code>p = &amp;bar</code>, respectively. Note that
there is also a point for the goto/return terminators of each block
(i.e., A/1, B/4, and C/1).</p>
<p>Using this notation, we can say that we want <code>foo</code> to be borrowed
during the points A/1, B/0, and C/0. We want <code>bar</code> to be borrowed
during the points B/3, B/4, and C/0.</p>
<h3 id="defining-the-nll-analysis">Defining the NLL analysis</h3>
<p>Now that we have our two examples, let&rsquo;s work on defining how the NLL
analysis will work.</p>
<h4 id="step-0-what-is-a-lifetime">Step 0: What is a lifetime?</h4>
<p>The lifetime of a reference is defined in our system to be a <strong>region
of the control-flow graph</strong>. We will represent such regions as a set
of points.</p>
<p>A note on terminology: For the remainder of this post, I will often
use the term <strong>region</strong> in place of &ldquo;lifetime&rdquo;. Mostly this is because
it&rsquo;s the standard academic term and it&rsquo;s often the one I fall back to
when thinking more formally about the system, but it also feels like a
good way to differentiate the lifetime of the <strong>reference</strong> (the
region where it is in use) from the lifetime of the <strong>referent</strong> (the
span of time before the underlying resource is freed).</p>
<h4 id="step-1-instantiate-erased-regions">Step 1: Instantiate erased regions</h4>
<p>The plan for adopting NLL is to do type-checking in two phases.  The
first phase, which is performed on the HIR, I would call <strong>type
elaboration</strong>. This is basically the &ldquo;traditional type-system&rdquo;
phase. It infers the types of all variables and other things, figures
out where autoref goes, and so forth; the result of this is the MIR.</p>
<p>The key change from today is that I want to do all of this type
elaboration using erased regions. That is, until we build the MIR, we
won&rsquo;t have any regions at all. We&rsquo;ll just keep a placeholder (which
I&rsquo;ll write as <code>'erased</code>). So if you have something like <code>&amp;i32</code>, the
elaborated, internal form would just be <code>&amp;'erased i32</code>. This is quite
different from today, where the elaborated form includes a specific
region. (However, this erased form is precisely what we want for
generating code, and indeed MIR today goes through a &ldquo;region erasure&rdquo;
step; this step would be unnecessary in the new plan, since MIR as
produced by type check would always have fully erased regions.)</p>
<p>Once we have built MIR, then, the idea is roughly to go and replace
all of these erased regions with inference variables. This means we&rsquo;ll
have region inference variables in the types of all local variables;
it also means that for each borrow expression like <code>&amp;foo</code>, we&rsquo;ll have
a region representing the lifetime of the resulting reference. I&rsquo;ll
write the expression together with this region like so: <code>&amp;'0 foo</code>.</p>
<p>Here is what the CFG for Example 4 looks like with regions
instantiated.  You can see I used the variable <code>'0</code> to represent the
region in the type of <code>p</code>, and <code>'1</code> and <code>'2</code> for the regions of the
two borrows:</p>
<pre tabindex="0"><code>// let mut foo: i32;
// let mut bar: i32;
// let p: &amp;&#39;0 i32;

A
[ p = &amp;&#39;1 foo  ]
[ if condition ] ----\ (true)
       |             |
       |     B       v
       |     [ print(*p)     ]
       |     [ ...           ]
       |     [ p = &amp;&#39;2 bar   ]
       |     [ ...           ]
       |     [ goto C        ]
       |             |
       +-------------/
       |
C      v
[ print(*p)    ]
[ return       ]
</code></pre><h4 id="step-2-introduce-region-constraints">Step 2: Introduce region constraints</h4>
<p>Now that we have our region variables, we have to introduce
constraints.  These constraints will come in two kinds:</p>
<ul>
<li>liveness constraints; and,</li>
<li>subtyping constraints.</li>
</ul>
<p>Let&rsquo;s look at each in turn.</p>
<h4 id="liveness-constraints">Liveness constraints</h4>
<p>The basic rule is this: <strong>if a variable is live on entry to a point P,
then all regions in its type must include P</strong>.</p>
<p>Let&rsquo;s continue with Example 4. There, we have just one variable, <code>p</code>.
Its type has one region (<code>'0</code>) and it is live on entry to A/1, B/0,
B/3, B/4, and C/0. So we wind up with a constraint like this:</p>
<pre><code>{A/1, B/0, B/3, B/4, C/0} &lt;= '0
</code></pre>
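<p>To make the liveness rule concrete, here is a minimal Rust sketch (a toy, not rustc&rsquo;s implementation) that computes the points where <code>p</code> is live on entry in the CFG of Example 4, using an ordinary backward dataflow iteration. The names <code>Block</code>, <code>Stmt</code>, <code>example_4</code>, and <code>live_on_entry</code> are made up for illustration, as is the encoding of each statement as a pair of read/write flags.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::BTreeSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Block { A, B, C }
use Block::*;

/// A point in the control-flow graph, written Block/Index in the text.
type Point = (Block, usize);

/// What a single statement does to `p` (a made-up encoding).
struct Stmt {
    point: Point,
    reads_p: bool,
    writes_p: bool,
    successors: Vec&lt;Point&gt;,
}

fn stmt(point: Point, reads_p: bool, writes_p: bool, successors: &amp;[Point]) -&gt; Stmt {
    Stmt { point, reads_p, writes_p, successors: successors.to_vec() }
}

/// The CFG of Example 4, one entry per point.
fn example_4() -&gt; Vec&lt;Stmt&gt; {
    vec![
        stmt((A, 0), false, true, &amp;[(A, 1)]),          // p = &amp;foo
        stmt((A, 1), false, false, &amp;[(B, 0), (C, 0)]), // if condition
        stmt((B, 0), true, false, &amp;[(B, 1)]),          // print(*p)
        stmt((B, 1), false, false, &amp;[(B, 2)]),         // ...
        stmt((B, 2), false, true, &amp;[(B, 3)]),          // p = &amp;bar
        stmt((B, 3), false, false, &amp;[(B, 4)]),         // ...
        stmt((B, 4), false, false, &amp;[(C, 0)]),         // goto C
        stmt((C, 0), true, false, &amp;[(C, 1)]),          // print(*p)
        stmt((C, 1), false, false, &amp;[]),               // return
    ]
}

/// Points where `p` is live on entry: it is read there, or it is live
/// on entry to some successor and not overwritten at this point.
fn live_on_entry(cfg: &amp;[Stmt]) -&gt; BTreeSet&lt;Point&gt; {
    let mut live = BTreeSet::new();
    let mut changed = true;
    while changed {
        changed = false;
        for s in cfg {
            let live_after = s.successors.iter().any(|succ| live.contains(succ));
            let live_before = s.reads_p || (live_after &amp;&amp; !s.writes_p);
            if live_before &amp;&amp; live.insert(s.point) {
                changed = true;
            }
        }
    }
    live
}

fn main() {
    let live = live_on_entry(&amp;example_4());
    let names: Vec&lt;String&gt; = live.iter().map(|(b, i)| format!(&#34;{:?}/{}&#34;, b, i)).collect();
    println!(&#34;{{{}}} &lt;= &#39;0&#34;, names.join(&#34;, &#34;));
}
</code></pre><p>For Example 4 this yields the set A/1, B/0, B/3, B/4, and C/0, which is the liveness constraint on <code>'0</code> shown above. B/1 and B/2 are excluded because <code>p</code> is overwritten at B/2 before it is read again.</p>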
<p>We also include a rule that for each borrow expression like <code>&amp;'1 foo</code>,
<code>'1</code> must include the point of borrow. This gives rise to two further
constraints in Example 4:</p>
<pre><code>{A/0} &lt;= '1
{B/2} &lt;= '2
</code></pre>
<h4 id="location-aware-subtyping-constraints">Location-aware subtyping constraints</h4>
<p>The next thing we do is to go through the MIR and establish the normal
subtyping constraints. However, we are going to do this with a slight
twist, which is that we are going to take the current location into
account. That is, instead of writing <code>T1 &lt;: T2</code> (<code>T1</code> is required to
be a subtype of <code>T2</code>) we will write <code>(T1 &lt;: T2) @ P</code> (<code>T1</code> is required
to be a subtype of <code>T2</code> at the point P). This in turn will translate
to region constraints like <code>(R2 &lt;= R1) @ P</code>.</p>
<p>Continuing with Example 4, there are a number of places where
subtyping constraints arise. For example, at point A/0, we have <code>p = &amp;'1 foo</code>. Here, the type of <code>&amp;'1 foo</code> is <code>&amp;'1 i32</code>, and the type of
<code>p</code> is <code>&amp;'0 i32</code>, so we have a (location-aware) subtyping constraint:</p>
<pre><code>(&amp;'1 i32 &lt;: &amp;'0 i32) @ A/1
</code></pre>
<p>which in turn implies</p>
<pre><code>('0 &lt;= '1) @ A/1 // Note the order is reversed.
</code></pre>
<p>Note that the point here is A/1, not A/0. This is because A/1 is <strong>the
first point in the CFG where this constraint must hold on entry</strong>.</p>
<p>The meaning of a region constraint like <code>('0 &lt;= '1) @ P</code> is that,
starting from the point P, the region <code>'1</code> must include all points
that are reachable without leaving the region <code>'0</code>. The implementation
basically does a depth-first search starting from P; the search stops
if we exit the region <code>'0</code>. Otherwise, for each point we find, we add
it to <code>'1</code>.</p>
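<p>Jumping back to Example 4, we wind up with two subtyping constraints in total:
the one above, and an analogous one for the assignment <code>p = &amp;'2 bar</code> at B/2,
which must hold on entry to B/3.
Combining those with the liveness constraints, we get this:</p>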
<pre><code>('0 &lt;= '1) @ A/1
('0 &lt;= '2) @ B/3
{A/1, B/0, B/3, B/4, C/0} &lt;= '0
{A/0} &lt;= '1
{B/2} &lt;= '2
</code></pre>
<p>We can now try to find the smallest values for <code>'0</code>, <code>'1</code>, and <code>'2</code>
that will make this true. The result is:</p>
<pre><code>'0 = {A/1, B/0, B/3, B/4, C/0}
'1 = {A/0, A/1, B/0, C/0}
'2 = {B/2, B/3, B/4, C/0}
</code></pre>
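<p>Here is a minimal Rust sketch of how such a solution could be computed for Example 4. It is a toy under stated assumptions, not the rustc implementation: the names <code>successors</code>, <code>propagate</code>, and <code>solve</code> are hypothetical, the CFG is hard-coded, and the regions are seeded directly from the liveness and borrow-point constraints listed above. Each constraint <code>('a &lt;= 'b) @ P</code> is enforced by a depth-first search from P that stops as soon as it leaves <code>'a</code>.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::BTreeSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Block { A, B, C }
use Block::*;

type Point = (Block, usize);
type Region = BTreeSet&lt;Point&gt;;

/// Successor edges of the Example 4 CFG (hard-coded for this sketch).
fn successors(p: Point) -&gt; Vec&lt;Point&gt; {
    match p {
        (A, 0) =&gt; vec![(A, 1)],
        (A, 1) =&gt; vec![(B, 0), (C, 0)],
        (B, i) if i &lt; 4 =&gt; vec![(B, i + 1)],
        (B, 4) =&gt; vec![(C, 0)],
        (C, 0) =&gt; vec![(C, 1)],
        _ =&gt; vec![],
    }
}

/// Enforce `(sub &lt;= sup) @ start`: walk the CFG depth-first from `start`,
/// never leaving `sub`, and add every point visited to `sup`.
/// Returns true if `sup` grew.
fn propagate(sub: &amp;Region, sup: &amp;mut Region, start: Point) -&gt; bool {
    let mut changed = false;
    let mut visited = Region::new();
    let mut stack = vec![start];
    while let Some(p) = stack.pop() {
        if !sub.contains(&amp;p) || !visited.insert(p) {
            continue; // we left `sub`, or we have been here before
        }
        changed |= sup.insert(p);
        stack.extend(successors(p));
    }
    changed
}

/// Returns the regions for &#39;0, &#39;1, and &#39;2, in that order.
fn solve() -&gt; Vec&lt;Region&gt; {
    // Seeds: the liveness constraint on &#39;0 and the two borrow points.
    let mut regions: Vec&lt;Region&gt; = vec![
        Region::from([(A, 1), (B, 0), (B, 3), (B, 4), (C, 0)]),
        Region::from([(A, 0)]),
        Region::from([(B, 2)]),
    ];
    // (sub, sup, point) encodes `(&#39;sub &lt;= &#39;sup) @ point`.
    let constraints = [(0, 1, (A, 1)), (0, 2, (B, 3))];
    let mut changed = true;
    while changed {
        changed = false;
        for &amp;(sub, sup, at) in &amp;constraints {
            let sub_region = regions[sub].clone();
            changed |= propagate(&amp;sub_region, &amp;mut regions[sup], at);
        }
    }
    regions
}

fn main() {
    for (i, region) in solve().iter().enumerate() {
        println!(&#34;region {} = {:?}&#34;, i, region);
    }
}
</code></pre><p>Under these constraints the sketch computes <code>'1</code> as A/0, A/1, B/0, and C/0, and <code>'2</code> as B/2, B/3, B/4, and C/0 (B/2 is there only because of the borrow-point constraint). The outer loop is unnecessary for this example, since <code>'0</code> never grows, but it is what a general solver would need once regions feed into one another.</p>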
<p><strong>These results are exactly what we wanted.</strong> The variable <code>foo</code> is
borrowed for the region <code>'1</code>, which does not include B/3 and B/4.
This is true even though <code>'0</code> includes those points; this is
because you cannot reach B/3 and B/4 from A/1 without going through
B/1, and <code>'0</code> does not include B/1 (because <code>p</code> is not live at
B/1). Similarly, <code>bar</code> is borrowed for the region <code>'2</code>, which begins
at the point of borrow (B/2) and extends to C/0 (and need not include earlier
points, which are not reachable).</p>
<p>You may wonder why we do not have to include <strong>all</strong> points in <code>'0</code> in
<code>'1</code>. Intuitively, the reasoning here is based on liveness: <code>'1</code> must
ultimately include all points where the reference may be accessed. In
this case, the subregion constraint arises because we are copying a
reference (with region <code>'1</code>) into a variable (let&rsquo;s call it <code>x</code>) whose
type includes the region <code>'0</code>, so we need reads of <code>'0</code> to also be
counted as reads of <code>'1</code> &ndash; <strong>but, crucially, only those reads that
may observe this write</strong>. Because of the liveness constraints we saw
earlier, if <code>x</code> will later be read, then <code>x</code> must be live along the
path from this copy to that read (by the definition of liveness,
essentially). Therefore, because the variable is live, <code>'0</code> will
include that entire path. Hence, by including the points in <code>'0</code> that
are reachable from the copy (without leaving <code>'0</code>), we include all
potential reads of interest.</p>
<h3 id="conclusion">Conclusion</h3>
<p>This post presents a system for computing non-lexical lifetimes. It
assumes that all regions are erased when MIR is created. It uses only
simple compiler concepts, notably liveness, but extends the subtyping
relation to take into account <strong>where</strong> the subtyping must hold. This
allows it to disregard unreachable portions of the control-flow.</p>
<p>I feel pretty good about this iteration. Among other things, it seems
so simple I can&rsquo;t believe it took me this long to come up with
it. This either means that it is the right thing or I am making some
grave error. If it&rsquo;s the latter people will hopefully point it out to
me. =) It also seems to be efficiently implementable.</p>
<p>I want to emphasize that this system is the result of a lot of
iteration with a lot of people, including (but not limited to) Cameron
Zwarich, Ariel Ben-Yehuda, Felix Klock, Ralf Jung, and James Miller.</p>
<p>It&rsquo;s interesting to compare this with various earlier attempts:</p>
<ul>
<li>Our earliest thoughts assumed continuous regions (e.g., <a href="https://github.com/rust-lang/rfcs/pull/396">RFC 396</a>).
The idea was that the region for a reference ought to correspond to
some continuous bit of control-flow, rather than having &ldquo;holes&rdquo; in
the middle.
<ul>
<li>The example in this post shows the limitation of this,
however. Note that the region for the variable <code>p</code>
includes B/0 and B/4 but excludes B/1.</li>
<li>This is why we lean on <strong>liveness requirements</strong> instead, so as to
ensure that the region contains all paths from where a reference is
created to where it is eventually dereferenced.</li>
</ul>
</li>
<li>An alternative solution might be to consider continuous regions but apply
an SSA or SSI transform.
<ul>
<li>This allows the example in this post to type, but it falls down on
more advanced examples, such as <a href="https://github.com/nikomatsakis/nll/blob/a6609ab17fd483f8d47ef919af3838bf214954e5/test/vec-push-ref.nll">vec-push-ref</a> (hat tip,
Cameron Zwarich). In particular, it&rsquo;s possible for subregion
relations to arise without a variable being redefined.</li>
<li>You can go farther, and give variables a distinct type at
each point in the program, as in Ericson2314&rsquo;s
<a href="https://github.com/Ericson2314/a-stateful-mir-for-rust">stateful MIR for Rust</a>. But even then you must contend with
invariance or you have the same sort of problems.</li>
<li>Exploring this led to the development of the &ldquo;localized&rdquo; subregion
relationship constraint <code>(r1 &lt;= r2) @ P</code>, which I had in mind
<a href="http://smallcultfollowing.com/babysteps/blog/2016/05/09/non-lexical-lifetimes-adding-the-outlives-relation/">in my original series</a> but which we elaborated more fully at the
rustc design sprint.</li>
<li>The change in this post versus what we said at the sprint is that
I am using one type per variable instead of one type per variable
per statement; I am also explicitly using the results of an
earlier liveness analysis to construct the constraints, whereas in
the sprint we incorporated the liveness into the region inference
itself (by reasoning about which values were live across each
individual statement and thus creating many more inequalities).</li>
</ul>
</li>
</ul>
<p>There are some things I&rsquo;ve left out of this post. Hopefully I will get
to them in future posts, but they all seem like relatively minor
twists on this base model.</p>
<ul>
<li>I&rsquo;d like to talk about how to incorporate lifetime parameters on fns
(I think we can do that in a fairly simple way by modeling them as
regions in an expanded control-flow graph,
<a href="https://github.com/nikomatsakis/nll/blob/master/test/get-default.nll">as illustrated by this example in my prototype</a>).</li>
<li>There are various options for modeling the
<a href="https://internals.rust-lang.org/t/accepting-nested-method-calls-with-an-mut-self-receiver/4588">&ldquo;deferred borrows&rdquo; needed to accept <code>vec.push(vec.len())</code></a>.</li>
<li>We might consider a finer-grained notion of liveness that operates
not on variables but rather on the &ldquo;fragments&rdquo; (paths) that we use
when doing move-checking. This would help to make <code>let (p, q) = (&amp;foo, &amp;bar)</code> and <code>let pair = (&amp;foo, &amp;bar)</code> entirely equivalent (in
the system as I described it, they are not, because whenever <code>pair</code>
is live, both <code>foo</code> and <code>bar</code> would be borrowed, even if only
<code>pair.0</code> is ever used).  But even if we do this there will still be
cases where storing pointers into an aggregate (e.g., a struct) can
lose precision versus using variables on the stack, so I&rsquo;m not sure
it&rsquo;s worth worrying about.</li>
</ul>
<p>Comments? <a href="https://internals.rust-lang.org/t/non-lexical-lifetimes-based-on-liveness/3428">Let&rsquo;s use this old internals thread.</a></p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Project idea: datalog output from rustc</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/02/17/project-idea-datalog-output-from-rustc/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/02/17/project-idea-datalog-output-from-rustc/</id><published>2017-02-17T00:00:00+00:00</published><updated>2017-02-17T00:00:00+00:00</updated><content type="html"><![CDATA[<p>I want to have a tool that would enable us to answer all kinds of queries about the structure of Rust code that exists in the wild. This should cover everything from syntactic queries like &ldquo;How often do people write <code>let x = if { ... } else { match foo { ... } }</code>?&rdquo; to semantic queries like &ldquo;How often do people call unsafe functions in another module?&rdquo;  I have some ideas about how to build such a tool, but (I suspect) not enough time to pursue them. I&rsquo;m looking for people who might be interested in working on it!</p>
<p>The basic idea is to build on <a href="https://en.wikipedia.org/wiki/Datalog">Datalog</a>. Datalog, if you&rsquo;re not familiar with it, is a very simple scheme for relating facts and then performing analyses on them. It has a bunch of high-performance implementations, notably <a href="https://github.com/oracle/souffle">souffle</a>, which is also available on GitHub. (Sadly, it generates C++ code, but maybe we&rsquo;ll fix that another day.)</p>
<p>Let me work through a simple example of how I see this working. Perhaps we would like to answer the question: How often do people write tests in a separate file (<code>foo/test.rs</code>) versus an inline module (<code>mod test { ... }</code>)?</p>
<p>We would (to start) have some hacked up version of rustc that serializes the HIR in Datalog form. This can include as much information as we would like. To start, we can stick to the syntactic structures. So perhaps we would encode the module tree via a series of facts like so:</p>
<pre tabindex="0"><code>// links a module with the id `id` to its parent `parent_id`
ModuleParent(id, parent_id).
ModuleName(id, name).

// specifies the file where a given `id` is located
File(id, filename).
</code></pre><p>So for a module structure like:</p>
<pre tabindex="0"><code>// foo/mod.rs:
mod test;

// foo/test.rs:
#[test] 
fn test() { }
</code></pre><p>we might generate the following facts:</p>
<pre tabindex="0"><code>// module with id 0 has name &#34;&#34; and is in foo/mod.rs
ModuleName(0, &#34;&#34;).
File(0, &#34;foo/mod.rs&#34;).

// module with id 1 is in foo/test.rs,
// and its parent is module with id 0.
ModuleName(1, &#34;test&#34;).
ModuleParent(1, 0).
File(1, &#34;foo/test.rs&#34;).
</code></pre><p>Then we can write a query to find all the modules named test which are in a different file from their parent module:</p>
<pre tabindex="0"><code>// module T is a test module in a separate file if...
TestModuleInSeparateFile(T) :-
    // ...the name of module T is test, and...
    ModuleName(T, &#34;test&#34;),
    // ...it is in the file T_File... 
    File(T, T_File),
    // ...it has a parent module P, and...
    ModuleParent(T, P),
    // ...the parent module P is in the file P_File... 
    File(P, P_File),
    // ...and file of the parent is not the same as the file of the child.
    T_File != P_File.
</code></pre><p>Anyway, I&rsquo;m waving my hands here, and probably getting datalog syntax all wrong, but you get the idea!</p>
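<p>To show what the rule above means operationally, here is a small Rust sketch that evaluates it as a naive nested-loop join over the three fact tables. This is only an illustration of the semantics, not how a Datalog engine like souffle would evaluate it; the function name <code>test_modules_in_separate_file</code> and the tuple encoding of the facts are my own assumptions.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::BTreeSet;

/// Evaluate `TestModuleInSeparateFile(T)` by joining the fact tables
/// `ModuleName`, `ModuleParent`, and `File` with plain nested loops.
fn test_modules_in_separate_file(
    module_name: &amp;[(u32, &amp;str)],
    module_parent: &amp;[(u32, u32)],
    file: &amp;[(u32, &amp;str)],
) -&gt; BTreeSet&lt;u32&gt; {
    let mut result = BTreeSet::new();
    for &amp;(t, name) in module_name {
        if name != &#34;test&#34; {
            continue; // ModuleName(T, &#34;test&#34;)
        }
        for &amp;(child, p) in module_parent {
            if child != t {
                continue; // ModuleParent(T, P)
            }
            for &amp;(t_id, t_file) in file {
                for &amp;(p_id, p_file) in file {
                    // File(T, T_File), File(P, P_File), T_File != P_File
                    if t_id == t &amp;&amp; p_id == p &amp;&amp; t_file != p_file {
                        result.insert(t);
                    }
                }
            }
        }
    }
    result
}

fn main() {
    // The facts generated for the foo/mod.rs and foo/test.rs example.
    let module_name: [(u32, &amp;str); 2] = [(0, &#34;&#34;), (1, &#34;test&#34;)];
    let module_parent: [(u32, u32); 1] = [(1, 0)];
    let file: [(u32, &amp;str); 2] = [(0, &#34;foo/mod.rs&#34;), (1, &#34;foo/test.rs&#34;)];
    let found = test_modules_in_separate_file(&amp;module_name, &amp;module_parent, &amp;file);
    println!(&#34;{:?}&#34;, found);
}
</code></pre><p>With the example facts, the only match is module 1. An inline <code>mod test { ... }</code> would share its parent&rsquo;s file and so would not match.</p>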
<p>Obviously my encoding here is highly specific for my particular query. But eventually we can start to encode all kinds of information this way. For example, we could encode the types of every expression, and what definition each path resolved to. Then we can use this to answer all kinds of interesting queries. For example, some things I would like to use this for right now (or in the recent past):</p>
<ul>
<li>Evaluating new lifetime elision rules.</li>
<li>Checking what kinds of unsafe code patterns exist in real life and how frequently.</li>
<li>Checking how much might benefit from <a href="https://github.com/rust-lang/rfcs/pull/1712">accepting the <code>else match { ... }</code> RFC</a></li>
<li>Testing how much code in the wild might be affected by <a href="https://github.com/rust-lang/rfcs/pull/1603">deprecating <code>Trait</code> in favor of <code>dyn Trait</code></a></li>
</ul>
<p>So, you interested? If so, contact me &ndash; either privmsg over IRC
(<code>nmatsakis</code>) or
<a href="https://internals.rust-lang.org/t/project-idea-datalog-output-from-rustc/4805">over on the internals threads I created</a>.</p>
]]></content></entry><entry><title type="html">Compiler design sprint summary</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/02/12/compiler-design-sprint-summary/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/02/12/compiler-design-sprint-summary/</id><published>2017-02-12T00:00:00+00:00</published><updated>2017-02-12T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This last week we had the <strong>rustc compiler team design sprint</strong>.  This
was our second rustc compiler team sprint; in the first one (last year),
we simply worked on pushing various projects over the finish line (for
example, in an epic effort, arielb1 completed dynamic drop during that
sprint).</p>
<p>This sprint was different: we had the goal of talking over many of the
big design challenges that we&rsquo;d like to tackle in the upcoming year
and making sure that the compiler team was roughly on board with the
best way to implement them.</p>
<p>I or others will be trying to write up many of the details in various
forums, either on this blog or perhaps on internals etc, but I thought
it&rsquo;d be fun to start with a quick post that describes the overall
topics of discussion. For each one, I&rsquo;ll give a quick summary and,
where possible, point you at the minutes and notes that we took.</p>
<h3 id="on-demand-processing-and-incremental-compilation">On-demand processing and incremental compilation</h3>
<p>The first topic of discussion was perhaps the most massive, in terms
of its impact on the codebase. The goal is to reorient how rustc works
internally completely. Right now, like many compilers, rustc works by
running a series of <strong>passes</strong>, one after the other. So for example we
first parse, then do macro expansion and name resolution (these used
to be distinct, but have now become interwoven as part of the work on
macros 2.0), then type-checking, and so forth. This is a time-honored
approach, but it&rsquo;s beginning to show its age:</p>
<ul>
<li>Some parts of the compiler front-end cannot be so neatly separated.
I already mentioned how macro expansion and name resolution are now
interdependent (you have to resolve the path that leads to a macro
to know which macro to expand). Similar things arise in
type-checking, particularly as we aim to support constant
expressions in types.  In that case, we have to type-check the
constant expression, but it must also be part of a type, and so
forth.</li>
<li>For better IDE support, it is desirable to be able to compile just
what is needed to type-check a particular function (we can come back
and clean up the rest later).</li>
<li>Things like <code>impl Trait</code> make the type-checking of some functions
partially dependent on the results of others, so the old approach of
type-checking all function bodies in an arbitrary order doesn&rsquo;t work.</li>
</ul>
<p>The idea is to replace it with <strong>on-demand</strong> compilation, which
basically means that we will have a graph of &ldquo;things we might want to
compute&rdquo; (for example, &ldquo;does the function <code>foo</code> type-check&rdquo;). We can
&ldquo;demand&rdquo; any one of these &ldquo;queries&rdquo;, and the compiler will go and do
what it has to do to figure out the answer. That may involve
satisfying other queries internally (hopefully without cycles). In the
end, your entire type-check will complete, but the order in which the
compiler does its work will be far less specified.</p>
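<p>As a rough illustration of the on-demand idea, here is a toy Rust sketch of a memoizing query engine. It is not the design we settled on, just the general shape: the <code>Query</code> variants, the <code>Engine</code> struct, and the dummy numeric &ldquo;results&rdquo; are all invented for this example. Demanding one query pulls in only the queries it depends on, each is computed at most once, and a cycle triggers a panic.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::{HashMap, HashSet};

/// Hypothetical queries; a real compiler would have many more.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Query {
    Parse,
    Resolve,
    TypeCheck(u32), // type-check function number `n`
}

#[derive(Default)]
struct Engine {
    cache: HashMap&lt;Query, u64&gt;, // memoized results (dummy values)
    active: HashSet&lt;Query&gt;,     // queries currently being computed
    executed: Vec&lt;Query&gt;,       // order in which work actually happened
}

impl Engine {
    /// Demand the result of `q`, computing its dependencies only as needed.
    fn demand(&amp;mut self, q: Query) -&gt; u64 {
        if let Some(&amp;v) = self.cache.get(&amp;q) {
            return v; // already computed
        }
        assert!(self.active.insert(q), &#34;cycle detected at {:?}&#34;, q);
        let value = match q {
            Query::Parse =&gt; 1,
            Query::Resolve =&gt; self.demand(Query::Parse) + 1,
            // Pretend fn 1 returns an `impl Trait` defined by fn 0.
            Query::TypeCheck(1) =&gt; self.demand(Query::TypeCheck(0)) + 1,
            Query::TypeCheck(_) =&gt; self.demand(Query::Resolve) + 1,
        };
        self.active.remove(&amp;q);
        self.executed.push(q);
        self.cache.insert(q, value);
        value
    }
}

fn main() {
    let mut engine = Engine::default();
    engine.demand(Query::TypeCheck(1));
    println!(&#34;{:?}&#34;, engine.executed);
}
</code></pre><p>Demanding <code>TypeCheck(1)</code> here runs parsing, resolution, and the type-check of function 0 first, and never touches any other function. That is the property that matters for the IDE use case.</p>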
<p>This idea for on-demand compilation naturally dovetails with the plans
for the next generation of incremental compilation. The current design
is similar to make: when a change is made, we eagerly propagate the
effect of that change, throwing away any old results that might have
been affected.  Often, though, we don&rsquo;t know that the old results
<strong>would have been</strong> affected.  It frequently happens that one makes
changes which only affect some parts of a result: e.g., a change to a
fn body that just renames some variables might still wind up
generating precisely the same MIR in the end.</p>
<p>Under the newer scheme, the idea is to limit the spread of changes.
If the inputs to a particular computation change, we do indeed have to
re-run the computation, but we can check if its output is different
from the output we have saved. If not, we don&rsquo;t have to dirty things
that were dependent on the computation. (The scheme we wound up with
can be considered a specialized variant of
<a href="http://adapton.org/">Adapton</a>, which is a very cool Rust and OCaml
library for doing generic incrementalized computation.)</p>
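<p>Here is a toy Rust sketch of that &ldquo;compare the output before dirtying dependents&rdquo; idea. It is not Adapton&rsquo;s API or rustc&rsquo;s actual scheme: <code>build_mir</code> is a made-up stand-in that numbers variables by first use (so renaming changes nothing), and <code>Incremental</code> and its <code>codegen_runs</code> counter are hypothetical names standing in for the downstream work.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">/// Toy stand-in for MIR construction: variables are replaced by the
/// order of their first use, so renaming them changes nothing.
fn build_mir(body: &amp;str) -&gt; Vec&lt;String&gt; {
    let mut vars: Vec&lt;&amp;str&gt; = Vec::new();
    let mut mir = Vec::new();
    for tok in body.split_whitespace() {
        if tok.chars().all(|c| c.is_ascii_alphabetic()) {
            let known = vars.iter().position(|v| *v == tok);
            let idx = known.unwrap_or_else(|| {
                vars.push(tok);
                vars.len() - 1
            });
            mir.push(format!(&#34;_{}&#34;, idx));
        } else {
            mir.push(tok.to_string());
        }
    }
    mir
}

struct Incremental {
    source: String,
    mir: Vec&lt;String&gt;,
    codegen_runs: u32, // how often the downstream step has run
}

impl Incremental {
    fn new(source: &amp;str) -&gt; Self {
        Incremental { source: source.to_string(), mir: build_mir(source), codegen_runs: 1 }
    }

    /// Apply an edit; returns true if the downstream step had to re-run.
    fn edit(&amp;mut self, new_source: &amp;str) -&gt; bool {
        if new_source == self.source {
            return false; // input unchanged: nothing to do
        }
        self.source = new_source.to_string();
        let new_mir = build_mir(new_source); // input changed: must re-run
        if new_mir == self.mir {
            return false; // same output: dependents stay clean
        }
        self.mir = new_mir;
        self.codegen_runs += 1;
        true
    }
}

fn main() {
    let mut inc = Incremental::new(&#34;a = b + c&#34;);
    println!(&#34;{}&#34;, inc.edit(&#34;x = y + z&#34;)); // only renames variables
    println!(&#34;{}&#34;, inc.edit(&#34;x = y * z&#34;)); // changes the operator
    println!(&#34;{}&#34;, inc.codegen_runs);
}
</code></pre><p>The first edit re-runs <code>build_mir</code> but stops there, because the result is identical; only the second edit reaches the downstream step.</p>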
<p>Links:</p>
<ul>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-odi">etherpad</a></li>
</ul>
<h3 id="supporting-alternate-backends">Supporting alternate backends</h3>
<p>We spent some time discussing how to integrate alternate backends
(e.g., <a href="https://github.com/stoklund/cretonne">Cretonne</a>, <a href="http://webassembly.org/">WASM</a>, and &ndash; in its own way &ndash; <a href="https://github.com/tsion/miri">miri</a>). Now that
we have MIR, a lot of the hard work is done: the translation from MIR
to LLVM is fairly straightforward, and the translation from MIR to
Cretonne or WASM might be even simpler (particularly since eddyb
already made the code that computes field and struct layouts
independent from LLVM).</p>
<p>There are still some parts of the system that we will need to factor out
from <code>librustc_trans</code>. For example, the &ldquo;collector&rdquo;, which is the bit of code
that determines what monomorphizations we need to generate of each function,
is independent from LLVM.</p>
<p>The goal with Cretonne, as <a href="https://internals.rust-lang.org/t/possible-alternative-compiler-backend-cretonne/4275">discussed on internals</a>, is
ultimately to use it as the debug-mode backend. It promises to offer a
very fast, &ldquo;decent quality&rdquo; compilation experience, with LLVM sticking
around as the heavyweight compiler (and to support more
architectures). The plan for Cretonne integration is (most likely) to
begin with a stateless REPL, similar to
<a href="https://play.rust-lang.org">play.rust-lang.org</a> or the playbot on
IRC. The idea would be to take a complete Rust program (i.e., with a
<code>main()</code> function), compile it to a buffer, and execute that. This
avoids the need to generate <code>.o</code> files from Cretonne, since that code
does not exist (Cretonne&rsquo;s first consumer is going to be a JIT, after
all).</p>
<p>After we had finished admiring <a href="https://github.com/stoklund">stoklund</a>&rsquo;s admirable job of writing
clean, documented code in Cretonne, we also dug into some of the
details of how it works. There are still a number of things that are
needed before we can really get this project off the ground (notably:
a register allocator), but in general it is a very nice match with MIR
and also our plans around constant evaluation via miri (discussed in
an upcoming part of this blog post). We discussed how best to maintain
debuginfo, and in particular some of <a href="https://github.com/stoklund">stoklund</a>&rsquo;s very cool ideas to
use the same feature that JITs use to perform de-optimization to track
debuginfo values (which would then guarantee perfect fidelity).</p>
<p>We had the idea that we might enable different backends per
codegen-unit (i.e., per module, in incremental compilation), so that
we can use LLVM to accommodate some of the more annoying features
(e.g., inline assembly) that may not appear in Cretonne any time soon.</p>
<p>Links:</p>
<ul>
<li><a href="https://internals.rust-lang.org/t/possible-alternative-compiler-backend-cretonne/4275">internals thread about Cretonne</a></li>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-mir">etherpad</a></li>
</ul>
<h3 id="mir-optimization">MIR Optimization</h3>
<p>We spent some time &ndash; not as much as I might have liked &ndash; digging
into the idea of optimizing MIR and trying to form an overall
strategy. Almost any optimization we might do requires <em>some</em> notion
of unsafe code guidelines to justify, so one of the things we talked
about was how to &ldquo;separate out&rdquo; that part of the system so that it can
be evolved and tightened as we get a more firm idea of what unsafe
code can and cannot do. The general conclusion was that this could be
done primarily by having some standard dataflow analyses that try to
detect when values &ldquo;escape&rdquo; and so forth &ndash; we would probably start
with a VERY conservative notion that any local which has <em>ever</em> been
borrowed may be mutated by any pointer write or function call, for
example, and then gradually tighten up.</p>
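<p>To give a flavor of what that VERY conservative starting point might look like, here is a toy Rust sketch. The <code>Stmt</code> enum is a drastically simplified, made-up stand-in for MIR statements, and <code>ever_borrowed</code> and <code>may_mutate</code> are hypothetical helper names; the point is only the rule itself: a pointer write or a call may mutate any local that has ever been borrowed, and nothing else.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::HashSet;

type Local = usize;

/// A made-up, heavily simplified statement language.
enum Stmt {
    Assign(Local),        // _n = constant
    Borrow(Local, Local), // _dest = &amp;mut _src
    PtrWrite(Local),      // *_ptr = constant
    Call,                 // call to some unknown function
}

/// Every local that is borrowed anywhere in the body.
fn ever_borrowed(body: &amp;[Stmt]) -&gt; HashSet&lt;Local&gt; {
    body.iter()
        .filter_map(|s| match s {
            Stmt::Borrow(_, src) =&gt; Some(*src),
            _ =&gt; None,
        })
        .collect()
}

/// Conservative answer to: might `stmt` change the value of `local`?
fn may_mutate(stmt: &amp;Stmt, local: Local, borrowed: &amp;HashSet&lt;Local&gt;) -&gt; bool {
    match stmt {
        // Direct writes only affect their destination.
        Stmt::Assign(dest) | Stmt::Borrow(dest, _) =&gt; *dest == local,
        // Pointer writes and calls may reach anything ever borrowed.
        Stmt::PtrWrite(_) | Stmt::Call =&gt; borrowed.contains(&amp;local),
    }
}

fn main() {
    let body = [
        Stmt::Assign(0),
        Stmt::Assign(1),
        Stmt::Borrow(2, 0),
        Stmt::PtrWrite(2),
        Stmt::Call,
    ];
    let borrowed = ever_borrowed(&amp;body);
    for local in 0..2 {
        let clobbered = may_mutate(&amp;Stmt::Call, local, &amp;borrowed);
        println!(&#34;call may mutate _{}: {}&#34;, local, clobbered);
    }
}
</code></pre><p>Here <code>_0</code> is treated as possibly mutated by the call because it was borrowed once, while <code>_1</code> is not. Tightening the rules later would mean making <code>may_mutate</code> return <code>false</code> in more cases, without touching the optimizations built on top of it.</p>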
<p>In general, we don&rsquo;t expect rustc to be doing a lot of aggressive
optimization, as we prefer to leave that to the backends like
LLVM. However, we would like to generate better code primarily for the
purposes of improving compilation time. This works because optimizing
MIR is just plain simpler and faster than other IRs, since it is
higher-level, and because it is pre-monomorphization. If we do a good
enough job, it can also help to close the gap between the performance
of debug mode and release mode builds, thus also helping with
compilation time by allowing people to use debug mode builds more
often.</p>
<p>Finally, we discussed <a href="https://github.com/rust-lang/rust/pull/39648">aatch&rsquo;s inlining PR</a>, and iterated around
different designs. In particular, we considered an &ldquo;on the fly&rdquo;
inlining design where we did inlining more like a JIT does it, during
the lowering to LLVM (or Cretonne, etc) IR.  Ultimately we decided
that the current plan (inlining in MIR) seemed best, even though it
involves potentially allocating more data-structures, because (A) it
enables us to optimize before monomorphization, multiplying the
benefit, and (B) we can remove a lot of temporaries and so forth, in
particular around small functions like <code>Deref::deref</code>, whereas if we
do the inlining as we lower, we are ultimately leaving that to LLVM to
do.</p>
<p>Links:</p>
<ul>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-mir">etherpad</a></li>
</ul>
<h3 id="unsafe-code-guidelines">Unsafe code guidelines</h3>
<p>We spent quite a while discussing various aspects of the intersection
of (theoretical) unsafe code guidelines and the compiler. I&rsquo;ll be
writing up some detailed posts on this topic, so I won&rsquo;t go into much
detail, but I&rsquo;ll leave some high-level notes:</p>
<ul>
<li>We discussed exhaustiveness and made up plans for how to incorporate
the <code>!</code> type there.</li>
<li>We discussed how to ensure that we can still optimize safe code
even in the presence of unsafe code, and what kinds of guarantees
we need to require.
<ul>
<li>Likely the kinds of assertions I was describing in
<a href="http://smallcultfollowing.com/babysteps/blog/2017/02/01/unsafe-code-and-shared-references/">my most recent post on the topic</a> aren&rsquo;t quite right,
and we want the &ldquo;locking&rdquo; approach I began with, but modified to
account for privacy.</li>
</ul>
</li>
<li>We looked some at how LLVM handles dependence analysis and so forth,
and what kinds of rules we would need to ensure that LLVM is not
doing more aggressive optimization than our rules would permit.
<ul>
<li>The LLVM rules we looked at all seem to fall under the rubric of
&ldquo;LLVM will consider a local variable to have escaped unless it can
prove that it hasn&rsquo;t&rdquo;. What I wonder about is the extent to which
other optimizations might take advantage of the way that the C
standard technically forbids you to transmute a pointer to a
<code>usize</code> and then back again (or at least forbids you from using the
resulting pointer). Apparently gcc will do <em>some</em> amount of
optimization on this basis, but perhaps not LLVM, though more
investigation is warranted.</li>
</ul>
</li>
</ul>
<p>Links:</p>
<ul>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-ucg">etherpad</a></li>
</ul>
<h3 id="macros-20-hygiene-spans">Macros 2.0, hygiene, spans</h3>
<p>jseyfried called in and filled us in on some of the latest progress
around Macros 2.0. We discussed the best way to track hygiene
information &ndash; in particular, whether we could do it using the same
spans that we use to track line number and column information. In
general I think there was consensus that this could work. =) We also
discussed some of the interactions with privacy and hygiene that arise
when you try to be smarter than our current macro system.</p>
<p>Links:</p>
<ul>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-macros">etherpad</a></li>
</ul>
<h3 id="diagnostic-improvements">Diagnostic improvements</h3>
<p>While talking about spans, we discussed some of the ways we could
address some shortcomings in our current diagnostic output. For
example, we&rsquo;d like to avoid highlighting multiple lines when citing a
method, and instead just underline the method name, and that sort of
thing. We&rsquo;d also like to print out types using identifiers local to
the site of the error (i.e., <code>Option&lt;T&gt;</code> and not
<code>::std::option::Option&lt;T&gt;</code>).  Hopefully we&rsquo;ll be converting those
rough plans into mentoring instructions, as these seem like good
starter projects for someone wanting to learn more about how rustc
works.</p>
<p>Links:</p>
<ul>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-macros">etherpad</a>
(scroll down)</li>
</ul>
<h3 id="miri-integration">miri integration</h3>
<p>We discussed integrating the <a href="https://github.com/tsion/miri">miri</a> interpreter. The initial plan is
to have it play a very limited role: simply replacing the current
constant evaluator that lowers to LLVM constants. Since miri produces
basically a big binary blob (possibly with embedded pointers called
&ldquo;redirections&rdquo;), but LLVM wants a higher-level thing, we have to use
some bitcasts and so forth to encode it. This is actually an area
where <a href="https://github.com/stoklund/cretonne">Cretonne&rsquo;s</a> level of abstraction, which is lower than
LLVM, is probably a better fit. But it should all work out fine in any case.</p>
<p>This initial step of using miri as constant evaluator would not change
in any way the set of programs that are accepted, except in so far as
it makes them work better and more reliably. But it does give us the
tools to start handling constants in the front-end as well as a much
wider range of <code>const fn</code> bodies and so forth (possibly even including
limited amounts of unsafe code).</p>
<p>Links:</p>
<ul>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-miri">etherpad</a></li>
</ul>
<h3 id="variable-length-arrays-and-allocas">Variable length arrays and allocas</h3>
<p>We discussed the desire to support allocas (<a href="https://github.com/rust-lang/rfcs/pull/1808">RFC 1808</a>) coupled with
the desire to support unsized types in more locations (in particular
as the types of parameters). We worked through how we would implement
this and what some of the complications might be, and drew up a rough
plan for an extension to the language that would be expressive,
efficiently implementable, and avoid unpredictable rampant stack
growth. This will hopefully make its way into an RFC soon.</p>
<p>Links:</p>
<ul>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-unsized">etherpad</a></li>
</ul>
<h3 id="non-lexical-lifetimes">Non-lexical lifetimes</h3>
<p>We spent quite a while iterating on the design for non-lexical
lifetimes. I plan to write this up shortly in another blog post, but
the summary is that we think we have a design that we are quite happy
with. It addresses (I believe) all the known examples and even
extends to support <a href="https://internals.rust-lang.org/t/accepting-nested-method-calls-with-an-mut-self-receiver/4588">nested method calls</a> where the outer call has an
<code>&amp;mut self</code> argument (e.g., <code>vec.push(vec.len())</code>), which today do not
compile.</p>
<p>Links:</p>
<ul>
<li><a href="https://public.etherpad-mozilla.org/p/rust-compiler-design-sprint-paris-2017-nll">etherpad</a></li>
</ul>
<h3 id="conclusion">Conclusion</h3>
<p>Those were the main topics of discussion &ndash; pretty exciting stuff!  I
can&rsquo;t wait to see these changes play out over the next year. Thanks to
all the attendees, and particularly those who dialed in remotely at
indecent hours of the day and night (notably jseyfried and nrc) to
accommodate the Parisian time zone.</p>
<p><a href="http://smallcultfollowing.com/babysteps/blog/2017/02/12/compiler-design-sprint-summary/">Comments? Check out the internals thread.</a></p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Unsafe code and shared references</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/02/01/unsafe-code-and-shared-references/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/02/01/unsafe-code-and-shared-references/</id><published>2017-02-01T00:00:00+00:00</published><updated>2017-02-01T00:00:00+00:00</updated><content type="html"><![CDATA[<p>In a previous post, I talked about a <a href="https://smallcultfollowing.com/babysteps/
/blog/2017/01/22/assigning-blame-to-unsafe-code/">proposed approach to drafting the
unsafe code guidelines</a>. Specifically, I want to take the approach of having
an <strong>executable specification</strong> of Rust with additional checks that
will signal when undefined behavior has occurred. In this post, I want
to try to dive into that idea a bit more and give some more specifics
of the approach I have in mind. In this post, I&rsquo;m going to focus on the
matter of the proper use of shared references <code>&amp;T</code> &ndash; I&rsquo;ll completely
ignore <code>&amp;mut T</code> for now, since those are much more complicated
(because they require a notion of uniqueness).</p>
<p>For the time being, I&rsquo;m going to continue to talk about this
executable specification as a kind of &ldquo;enhanced miri&rdquo;. I think
probably the right <em>formal</em> way to express it is not as code but
rather as an <strong>operational semantics</strong>, which is basically a
mathematical description of an interpreter. But at the same time I
think we should keep in mind other ways of implementing those same
checks (e.g., as a valgrind plugin).</p>
<p>I&rsquo;m also going to focus on single-thread semantics for now. It seems
best to start there, and extend to the multithreaded case only once we
have a good handle on how we think the sequential semantics ought to
roughly work (perhaps using an operationally-based model like <a href="https://github.com/nikomatsakis/rust-memory-model/issues/32">promises</a> as a starting
point).</p>
<h3 id="how-to-use-shared-references-wrong">How to use shared references wrong</h3>
<p>In Rust, a shared reference is more than a pointer. It&rsquo;s also a kind
of <em>promise</em> to the type system. Specifically, when you create a
shared reference, the data that it refers to (&ldquo;referent&rdquo;) is
considered <em>borrowed</em>, which means that it is supposed to be
<strong>immutable</strong> (except for under an <code>UnsafeCell</code>) and <strong>valid</strong> so long
as the reference is in use. When you&rsquo;re writing <em>safe Rust</em>, of
course, the borrow checker ensures these properties for you:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">i</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- Error! `i` is shared, cannot mutate.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But what about unsafe code? Certainly it is possible to violate either
of these properties. For now, I&rsquo;m going to focus on <em>mutating</em>
borrowed data when you are not supposed to; in fact, freeing or moving
borrowed data can be seen as a kind of mutation (overwriting the data
with uninitialized). So here is a running example of an unsafely
implemented function <code>util::increment()</code>, which takes in a <code>&amp;usize</code>
and increments it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">increment</span><span class="p">(</span><span class="n">u</span>: <span class="kp">&amp;</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">u</span><span class="p">;</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">usize</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">q</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, clearly, this is a sketchy function, and I think most would agree
that it should be considered illegal, at least under some
executions. In particular, if nothing else, its existence will
interfere with the compiler&rsquo;s ability to optimize. To see why, imagine
a caller like this one; let&rsquo;s further assume that the source of
<code>increment()</code> is unavailable for analysis (perhaps it is part of
another crate, or a different codegen-unit within the current crate).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">innocent</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">increment</span><span class="p">(</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Ideally, the compiler ought to be able to deduce &ndash; even without
knowing what <code>increment()</code> does &ndash; that <code>*i</code> equals <code>22</code> throughout
this function execution. After all, the underlying temporary that <code>i</code>
points at is clearly only accessed through a shared reference, which
ought to be immutable. But, of course, that is not a valid assumption:
<code>increment()</code> is violating its contract. So if we perform
optimizations, such as replacing all uses of <code>i</code> with the constant
<code>22</code>, those will be visible to the end-user. In typical C fashion,
this can be justified if we say that the program encounters <em>undefined
behavior</em>, but how can we make that more precise?</p>
<h3 id="instrumenting-to-detect-failures">Instrumenting to detect failures</h3>
<p>Earlier we mentioned that the key property of a shared reference is
that the borrowed memory will remain both <em>immutable</em> and <em>valid</em> for
the lifetime of the reference. The way that my mental model works, the
borrow model is kind of like a (compile time) read-write lock: when
you borrow data to create a shared reference, you have acquired a
&ldquo;read-lock&rdquo; on that data. As a first stab at what our &ldquo;augmented
interpreter&rdquo; might look like, let&rsquo;s see if we can realize that
intuition. (Spoiler: this will turn out to be the wrong approach.)</p>
<p>The basic idea is that the interpreter would track a &ldquo;reader count&rdquo;
for every bit of memory. When we create a reference (i.e., when we
execute <code>&amp;i</code>), that will instruct the interpreter to increment that
counter. The compiler would also generate &ldquo;release&rdquo; instructions when
the borrow goes out of scope, which would decrement the reader count
again.</p>
<p>So in a sense our augmented program would look like this. The new
assertions are written in comments; the interpreter would understand
them, even if regular Rust execution does not:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">innocent</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// acquire_read_lock(&amp;i);
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">increment</span><span class="p">(</span><span class="o">&amp;</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// release_read_lock(&amp;i);
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, once we&rsquo;ve inserted those instructions, then presumably
<code>increment()</code> would dynamically fail as it attempted to execute <code>*q += 1</code>, because the memory was &ldquo;read-locked&rdquo;.</p>
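<p>To make the read-lock idea concrete, here is a minimal sketch of the bookkeeping such an interpreter might do. Everything in it is an assumption made for illustration: the <code>Interp</code> struct, the plain <code>usize</code> addresses, and the signatures of <code>acquire_read_lock</code> and <code>release_read_lock</code> are hypothetical, and it is not meant to reflect how miri is actually implemented.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::HashMap;

/// Toy model of the "reader count" interpreter. All names are hypothetical.
#[derive(Default)]
struct Interp {
    memory: HashMap&lt;usize, usize&gt;,  // address -&gt; stored value
    readers: HashMap&lt;usize, usize&gt;, // address -&gt; number of live shared borrows
}

impl Interp {
    /// Executed when a shared reference to `addr` is created.
    fn acquire_read_lock(&amp;mut self, addr: usize) {
        *self.readers.entry(addr).or_insert(0) += 1;
    }

    /// Executed when the borrow of `addr` goes out of scope.
    fn release_read_lock(&amp;mut self, addr: usize) {
        if let Some(count) = self.readers.get_mut(&amp;addr) {
            *count = count.saturating_sub(1);
        }
    }

    /// Every write checks the reader count first and fails if it is nonzero.
    fn write(&amp;mut self, addr: usize, value: usize) -&gt; Result&lt;(), String&gt; {
        if self.readers.get(&amp;addr).copied().unwrap_or(0) &gt; 0 {
            return Err(format!("write to read-locked address {:#x}", addr));
        }
        self.memory.insert(addr, value);
        Ok(())
    }
}

fn main() {
    let mut interp = Interp::default();
    let addr = 0x1000;
    interp.write(addr, 22).unwrap();          // initialize the temporary holding 22
    interp.acquire_read_lock(addr);           // the shared reference is created
    assert!(interp.write(addr, 23).is_err()); // the `*q += 1` inside increment()
    interp.release_read_lock(addr);           // the borrow ends
    assert!(interp.write(addr, 23).is_ok());
}
</code></pre>
<p>Note that this sketch only locks the one address it is handed. It says nothing about memory reachable from that address.</p>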
<h3 id="dealing-with-unsafe-abstractions">Dealing with unsafe abstractions</h3>
<p>So, this idea of a read-write lock seems reasonable so far &ndash; why did
I say that this would turn out to be the wrong approach? Well, one
catch is that it&rsquo;s not sufficient in general to just freeze a single
integer. Rather, when something gets borrowed, we have to freeze <em>all
the memory reachable from the point of borrow</em>. That turns out to be
problematic: given that Rust is built on unsafe abstractions, it&rsquo;s not
really <strong>possible</strong> to enumerate all that memory. To see what I mean,
consider this program:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="fm">vec!</span><span class="p">[]];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">x</span><span class="p">;</span><span class="w"> </span><span class="c1">// borrow `x`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, the reference <code>y</code> borrows <code>x</code>, which is a vector of
vectors. This implies that not only is the vector <code>x</code> itself frozen,
so are all the vectors within <code>x</code>, and so are all the integers in all
those vectors. This means that if the program were to create an unsafe
pointer and navigate to any one of those vectors and try to mutate it,
we should error out.</p>
<p>To enforce this, presumably the compiler would have to insert
something like the <code>acquire_read_lock(&amp;x)</code> we saw before. This
instruction would cause the interpreter to navigate to all the memory
reachable from <code>x</code> &ndash; but how can it do that? Vectors, after all, are
not a built-in concept in Rust. The <code>Vec</code> type is just a struct that
stores an unsafe pointer instead, ultimately looking something like this:</p>
<pre tabindex="0"><code class="language-struct" data-lang="struct">struct Vec&lt;T&gt; {
    data: *mut T,
    len: usize,
    capacity: usize,
}
</code></pre><p>It&rsquo;s clear that we can freeze the fields of the <code>Vec</code>, but it&rsquo;s less
clear how we can freeze the vector&rsquo;s data. Is it safe or reasonable
for us to reference <code>data</code>? How do we know that the memory that <code>data</code>
refers to is initialized? (In fact, since vectors over-allocate, some
portion of that data is basically guaranteed to be uninitialized.)</p>
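<p>As a small illustration of the over-allocation point, this sketch uses only the standard <code>Vec</code> API to show that a vector can own more slots than it has initialized. The helper <code>spare_slots</code> is a name made up for this example.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">/// Number of slots that are allocated but hold no initialized element.
/// (Hypothetical helper, not part of the standard library.)
fn spare_slots&lt;T&gt;(v: &amp;Vec&lt;T&gt;) -&gt; usize {
    v.capacity() - v.len()
}

fn main() {
    // Ask for room for at least 8 elements, then initialize only 2 of them.
    let mut v: Vec&lt;i32&gt; = Vec::with_capacity(8);
    v.push(1);
    v.push(2);

    assert_eq!(v.len(), 2);
    assert!(v.capacity() &gt;= 8);
    // At least 6 slots behind the data pointer are uninitialized memory,
    // so a tool that blindly read `capacity` elements would read garbage.
    assert!(spare_slots(&amp;v) &gt;= 6);
}
</code></pre>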
<p>We actually encountered similar issues when thinking about how to
integrate tracing GCs (another topic that would make for a good blog
post!). The bottom line is that whatever scheme you create, people
will always want some way to apply their own custom logic (e.g.,
maybe the pointer isn&rsquo;t stored as a <code>*mut T</code>, it&rsquo;s actually a <code>usize</code>
and you can only extract it by doing an <code>xor</code> with some other
values). So it&rsquo;d really be best if we can avoid the need to
&ldquo;interpret&rdquo; an unsafe data structure in any way.</p>
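<p>Here is a rough sketch of the kind of custom logic I have in mind. The <code>Masked</code> type and the <code>KEY</code> constant are invented for this example. The point is only that nothing in the struct looks like a pointer, so a generic tool walking the fields would have no way to find the memory it owns.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">/// Hypothetical key used to disguise the pointer.
const KEY: usize = 0xdead_beef;

/// Hypothetical handle that stores its pointer as a `usize` XOR-ed with `KEY`.
struct Masked {
    bits: usize,
}

impl Masked {
    fn new(value: Box&lt;i32&gt;) -&gt; Masked {
        let raw: *mut i32 = Box::into_raw(value);
        Masked { bits: (raw as usize) ^ KEY }
    }

    /// Recover the real pointer; only the type itself knows how.
    fn decode(&amp;self) -&gt; *mut i32 {
        (self.bits ^ KEY) as *mut i32
    }

    fn get(&amp;self) -&gt; i32 {
        // Sound only because `new` leaked a valid Box and nothing frees it
        // until `drop` runs.
        unsafe { *self.decode() }
    }
}

impl Drop for Masked {
    fn drop(&amp;mut self) {
        // Rebuild the Box so the allocation is released exactly once.
        unsafe { drop(Box::from_raw(self.decode())) }
    }
}

fn main() {
    let m = Masked::new(Box::new(22));
    assert_eq!(m.get(), 22);
}
</code></pre>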
<h3 id="a-second-approach-cannot-observe-a-violation">A second approach: cannot observe a violation</h3>
<p>There is another way to think about the freezing guarantees. Instead
of <em>eagerly locking</em> all the memory that is reachable through a
reference, we might instead declare that the <em>compiler should not be
able to observe any writes</em>. Under this model, modifying the referent
of an <code>&amp;i32</code> is not &ndash; in and of itself &ndash; undefined behavior. It only
becomes undefined behavior when that reference is later loaded and
observed to have been written since its creation.</p>
<p>One way to express this is to imagine that there is a global counter
<code>WRITES</code> tracking the number of writes to memory. Every time we write
to a memory address <code>m</code>, the interpreter will increment <code>WRITES</code> and
store the new value to a global map <code>LAST_WRITE[m]</code> &ndash; this map
records, for each address, the last time it was written. When we
<em>create</em> a shared reference <code>r</code>, we can also read the current value of
<code>WRITES</code> and associate this value with the reference as
<code>TIME_STAMP[r]</code> (you can think of it as some extra metadata that gets
carried along somehow).</p>
<p>Now, when we <em>read</em> from a shared reference <code>r</code> that refers to the
memory address <code>m</code>, we can check that <code>LAST_WRITE[m] &lt;= TIME_STAMP[r]</code>, which tells us that the memory has not been written
since the reference <code>r</code> was created (this may actually be stricter
than we want, but let&rsquo;s start here).</p>
<p>So, coming back to our running example, the code might look like this,
with comments indicating the meta-operations that are happening:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">innocent</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// TIME_STAMP[i] = WRITES
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// assert(LAST_WRITE[i] &lt;= TIME_STAMP[i])
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">increment</span><span class="p">(</span><span class="o">&amp;</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// assert(LAST_WRITE[i] &lt;= TIME_STAMP[i])
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="o">*</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">increment</span><span class="p">(</span><span class="n">u</span>: <span class="kp">&amp;</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">u</span><span class="p">;</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">usize</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// WRITES += 1;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// LAST_WRITE[q] = WRITES;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">q</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we can clearly see that the second assertion in <code>innocent()</code> will
fail, since <code>LAST_WRITE[i]</code> is going to be equal to
<code>TIME_STAMP[i]+1</code>. This indicates that some form of undefined behavior
occurred.</p>
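<p>The same bookkeeping can be simulated directly. This is a minimal sketch under assumptions of my own: the <code>Interp</code> and <code>SharedRef</code> types, the <code>borrow_shared</code> method, and the use of plain <code>usize</code> addresses are all hypothetical; only the <code>WRITES</code>, <code>LAST_WRITE</code>, and <code>TIME_STAMP</code> ideas come from the description above.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::collections::HashMap;

/// Toy model of the timestamp bookkeeping. The struct itself is hypothetical.
#[derive(Default)]
struct Interp {
    writes: u64,                        // WRITES
    last_write: HashMap&lt;usize, u64&gt;,    // LAST_WRITE[m]
    memory: HashMap&lt;usize, usize&gt;,      // address -&gt; stored value
}

/// A shared reference together with its TIME_STAMP metadata.
#[derive(Clone, Copy)]
struct SharedRef {
    addr: usize,
    time_stamp: u64,
}

impl Interp {
    /// Every write bumps WRITES and records it as the last write to `addr`.
    fn write(&amp;mut self, addr: usize, value: usize) {
        self.writes += 1;
        self.last_write.insert(addr, self.writes);
        self.memory.insert(addr, value);
    }

    /// Creating a shared reference snapshots the current value of WRITES.
    fn borrow_shared(&amp;self, addr: usize) -&gt; SharedRef {
        SharedRef { addr, time_stamp: self.writes }
    }

    /// Reading checks LAST_WRITE[m] &lt;= TIME_STAMP[r] before loading.
    fn read(&amp;self, r: SharedRef) -&gt; Result&lt;usize, String&gt; {
        let last = self.last_write.get(&amp;r.addr).copied().unwrap_or(0);
        if last &gt; r.time_stamp {
            return Err(format!("{:#x} was written after the reference was created", r.addr));
        }
        self.memory
            .get(&amp;r.addr)
            .copied()
            .ok_or_else(|| "read of uninitialized memory".to_string())
    }
}

fn main() {
    let mut interp = Interp::default();
    let addr = 0x1000;
    interp.write(addr, 22);              // the temporary holding 22
    let i = interp.borrow_shared(addr);  // TIME_STAMP[i] = WRITES
    assert_eq!(interp.read(i), Ok(22));  // first println!: check passes
    interp.write(addr, 23);              // the write inside increment()
    assert!(interp.read(i).is_err());    // second println!: check fails
}
</code></pre>
<p>Unlike the read-lock sketch, the write itself always succeeds here. The failure is only reported at the later read through the stale reference.</p>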
<h3 id="unsafety-levels">Unsafety levels</h3>
<p>One of the premises of the <a href="https://smallcultfollowing.com/babysteps/blog/2016/05/27/the-tootsie-pop-model-for-unsafe-code/">Tootsie Pop model</a> is that we can
leverage the fact that Rust separates <em>safe</em> from <em>unsafe</em> code to
allow for more optimization without making it harder to reason about
unsafe code. Although many specific details of the TPM proposal were
flawed, I think this basic idea is still necessary if we are to
achieve the level of optimization that I would like to achieve while
avoiding the problem of unsafe code becoming very hard to reason
about. I plan to write more on this specific topic (&ldquo;safety levels&rdquo;)
in a follow-up post; for now, I want to take for granted that we have
some way to distinguish &ldquo;safe&rdquo; functions from &ldquo;unsafe&rdquo; functions, and
just talk about how we can reflect that designation using assertions,
and in turn use those assertions to drive optimization.</p>
<p>Consider this variant of the example from
<a href="https://smallcultfollowing.com/babysteps/blog/2016/09/12/thoughts-on-trusting-types-and-unsafe-code/">my previous post about trusting types</a>. Let&rsquo;s assume that the
function <code>patsy()</code> here is &ldquo;safe code&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">increment</span><span class="p">(</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In this code, the author has loaded <code>*i</code> <em>before</em> calling
<code>increment()</code>, but the result is not used until afterwards. The
question is, given that this is safe code, can we optimize this code
by deferring the load until later? This kind of optimization could be
useful in improving register allocation and stack size, for example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// &#34;optimized&#34;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">increment</span><span class="p">(</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">i</span><span class="p">;</span><span class="w"> </span><span class="c1">// this is moved here
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In general, my goal is that we can derive whether an optimization is
legal based purely on the assertions and things that we are using to
instrument the code when we check for undefined behavior. The idea is
then similar to how C optimization works: we can perform an
optimization if we can show that it only affects executions that would
have resulted in an assertion failure anyhow. So let&rsquo;s see what our
instrumented <code>patsy()</code> looks like so far:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// instrumented
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// TIME_STAMP[i] = WRITES
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// assert(TIME_STAMP[i] &lt;= LAST_WRITES[i])
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">increment</span><span class="p">(</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Based only on these assertions, there is no way to justify the
optimization I want to perform. After all, <code>increment()</code> is free to
update <code>LAST_WRITES[i]</code> because there is no assertion that states
otherwise.</p>
<p>What went wrong? The disconnect is actually strongly related to my
previous post on <a href="https://smallcultfollowing.com/babysteps/blog/2016/10/02/observational-equivalence-and-unsafe-code/">observational equivalence</a> &ndash; I would like to
optimize <code>patsy()</code> on the basis that <code>increment()</code>, being declared as
a safe function, will only do things that safe code could do (or,
rather, safe code augmented with the capabilities we define for unsafe
code). That&rsquo;s a pretty strong assumption &ndash; since it assumes we can
fully describe the possible things the code might do &ndash; but we can
weaken it by saying that, since <code>increment()</code> is declared safe, I
should get to assume that <strong>any code</strong> that its callers could write
that type-checks will not trigger undefined behavior. But that is
clearly false, as we saw in the previous section: if the caller simply
moves the <code>let v = *i</code> line down to <em>after</em> <code>increment()</code>, an
assertion failure occurs.</p>
<p>We can capture some of this intuition by saying that, in safe code, we
add <strong>additional assertions</strong> at function boundaries. The idea is that
when safe code calls a function (and, by definition, that function
must be safe, since calling an unsafe function requires an unsafe
block), it can rely on that function not to disturb the types that it
has access to. So imagine that after every function call in a safe
function, we assert that all our publicly accessible state is still
valid. In this case, since <code>i</code> is an in-scope reference whose lifetime
has not expired (in particular, even in a <a href="https://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/">NLL world</a>, its lifetime
would include the call to <code>increment()</code>), that means that the memory it
refers to must not have changed:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// instrumented
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">22</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// TIME_STAMP[i] = WRITES
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// assert(TIME_STAMP[i] &lt;= LAST_WRITES[i])
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">i</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">increment</span><span class="p">(</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// assert(TIME_STAMP[i] &lt;= LAST_WRITES[i])
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;i = </span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Running with these augmented semantics, we see that <code>increment()</code> will
yield an assertion failure once <code>patsy()</code> calls it, even though we
don&rsquo;t access <code>*i</code> again. This in turn justifies our compiler&rsquo;s
decision to move <code>let v = *i</code> below the call.</p>
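<p>To make the function-boundary assertion concrete, here is a minimal sketch of an instrumented machine. Everything in it is hypothetical: the <code>Machine</code> and <code>SharedRef</code> types, the helper names, and in particular the convention that a shared reference is valid as long as its referent has not been written since the time stamp was taken. The comparison direction depends on how the counters are defined, so treat this as one self-consistent reading rather than the scheme itself.</p>

```rust
use std::collections::HashMap;

/// Hypothetical instrumented machine. Field names mirror the post's
/// meta-variables; the exact bookkeeping is an assumption of this sketch.
struct Machine {
    writes: u64,                      // WRITES: number of writes so far
    memory: HashMap<usize, i32>,      // contents of each cell
    last_writes: HashMap<usize, u64>, // LAST_WRITES[addr]: WRITES at the latest write
}

/// A shared reference plus its metadata.
#[derive(Clone, Copy)]
struct SharedRef {
    addr: usize,
    time_stamp: u64, // TIME_STAMP[r]: WRITES when the reference was created
}

impl Machine {
    fn new() -> Self {
        Machine { writes: 0, memory: HashMap::new(), last_writes: HashMap::new() }
    }

    /// Every write bumps the global counter and stamps the cell.
    fn write(&mut self, addr: usize, value: i32) {
        self.writes += 1;
        self.memory.insert(addr, value);
        self.last_writes.insert(addr, self.writes);
    }

    fn borrow_shared(&self, addr: usize) -> SharedRef {
        SharedRef { addr, time_stamp: self.writes }
    }

    /// The assertion: the referent has not been written since `r` was created.
    fn is_valid(&self, r: SharedRef) -> bool {
        self.last_writes.get(&r.addr).copied().unwrap_or(0) <= r.time_stamp
    }
}

/// Instrumented `patsy()`; `increment` stands in for the safe callee.
fn patsy(m: &mut Machine, increment: fn(&mut Machine, SharedRef)) -> Result<i32, String> {
    m.write(0, 22);
    let i = m.borrow_shared(0);
    // Assertion before the memory access.
    if !m.is_valid(i) {
        return Err("read through an invalidated reference".to_string());
    }
    let v = m.memory[&i.addr];
    increment(m, i);
    // Extra assertion after the call, inserted only in safe code.
    if !m.is_valid(i) {
        return Err("callee wrote to memory behind a live shared reference".to_string());
    }
    Ok(v)
}

/// A callee that respects the shared reference.
fn well_behaved(_m: &mut Machine, _i: SharedRef) {}

/// A callee that mutates `*i` behind the caller's back.
fn rogue(m: &mut Machine, i: SharedRef) {
    let old = m.memory[&i.addr];
    m.write(i.addr, old + 1);
}
```

<p>Calling <code>patsy(&amp;mut Machine::new(), rogue)</code> returns an error from the post-call check even though <code>*i</code> is never read again, while the <code>well_behaved</code> callee lets the load be scheduled on either side of the call without changing the result.</p>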
<p>It&rsquo;s clear that, even with these stronger assertions, we are not able
to fully check that some bit of unsafe code is a valid safe
abstraction. In other words, we can show that it did not disturb the
local variables of its caller function in any immediate way, but it
may well have disturbed them in some way that will show up later (for
example, <code>increment</code> might not immediately mutate <code>*i</code>, but it might
make an alias of <code>i</code> that will be used later to perform an illegal
mutation). However, we can hopefully show that the abstraction is
<em>safe enough</em> for the compiler to do the optimizations we would like
to do.</p>
<h3 id="ginning-up-metadata-for-false-references">Ginning up metadata for false references</h3>
<p>Another question that you quickly run into in this approach &ndash; and
it&rsquo;s a question we have to answer no matter what! &ndash; is what to do
about references that are created in &ldquo;unorthodox&rdquo; ways. For example,
what happens if I make a reference by transmuting a <code>usize</code> (note: not
recommended):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Don&#39;t do this at home, kids.
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">wacked</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kp">&amp;</span><span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">i</span>: <span class="kt">usize</span> <span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">y</span>: <span class="kp">&amp;</span><span class="nc">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">transmute</span><span class="p">(</span><span class="n">i</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If the reference <code>x</code> has some &ldquo;identity&rdquo; as a reference, you can
imagine that the machine might preserve that identity when <code>x</code> is cast
to a <code>usize</code>, in which case <code>TIME_STAMP[y] == TIME_STAMP[x]</code>. Or
perhaps the time stamp is reset. This is all strongly related to C
memory models (e.g., <a href="https://github.com/nikomatsakis/rust-memory-model/issues/30">this one</a>), which also have to
define this sort of thing (related question: at what point does a
pointer gain a numeric address?).</p>
<p>In any case, I&rsquo;m not sure just what the right answer is here, but I
like how focusing on something executable makes the issue at hand very
concrete. It also seems likely that, as we thread this data through an
actual interpreter, these questions will naturally arise (i.e., &ldquo;hmm,
we have to create a <code>Reference</code> value here, what should we use for the
time-stamp?&rdquo;), which will help give us confidence that we have
covered the various corner cases.</p>
<h3 id="conclusion">Conclusion</h3>
<p>The aim of this post is not to make a specific proposal, not yet, but
to try and illustrate further the approach I have in mind for
specifying an executable form of unsafe code guidelines. The key
components are:</p>
<ul>
<li>An augmented interpreter that has meta-variables like <code>WRITES</code> and <code>LAST_WRITES</code>
that track the set of state.
<ul>
<li>This interpreter will also have additional metadata for Rust values, such as a
time-stamp for references.</li>
</ul>
</li>
<li>An augmented compilation that includes assertions that can employ these meta-variables
at well-defined points:
<ul>
<li>before memory accesses and after function calls seem like likely candidates</li>
<li>This compilation might take into account the &ldquo;safety level&rdquo; of a function as well</li>
</ul>
</li>
<li>Using these assertions both to check for undefined behavior and to
justify optimizations.</li>
</ul>
<p>There is certainly plenty of work to be done. For example, we have to
work out just how to handle &ldquo;reborrows&rdquo; (i.e., <code>&amp;*x</code> where <code>x: &amp;T</code>) &ndash;
it seems clear that the resulting reference should get the
&ldquo;time-stamp&rdquo; of the one from which it is borrowed.</p>
<p>Going further, the approach we outlined here isn&rsquo;t quite enough to
handle <code>&amp;mut T</code>, since there we have to reason about the path by which
memory was reached, and not just the state of the memory itself.  I
imagine though that we might be able to handle this by creating a
fresh id for each mutable borrow. When a memory cell is accessed, we
would track the id of those that did the access, and then when an
<code>&amp;mut</code> is used (or the validity of an <code>&amp;mut</code> is asserted, in safe
code) we would check that all publicly accessible memory is either
older than the reference or has the proper ID associated with it. Or
something like that, anyway.</p>
<p>(Also, there is a lot of related work in this area, much of which I am
not that familiar
with. <a href="https://github.com/nikomatsakis/rust-memory-model/issues/33">Robert Krebbers&rsquo;s thesis formalizing the C standard</a> is
certainly relevant (I&rsquo;m happy to say that when I spoke with him at
POPL, he seemed to agree with the overall approach I am advocating
here, though of course we didn&rsquo;t get down to much level of
detail). Projects like <a href="http://compcert.inria.fr">CompCert</a> also leap to mind.)</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/unsafe" term="unsafe" label="Unsafe"/></entry><entry><title type="html">Lowering Rust traits to logic</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/01/26/lowering-rust-traits-to-logic/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/01/26/lowering-rust-traits-to-logic/</id><published>2017-01-26T00:00:00+00:00</published><updated>2017-01-26T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Over the last year or two (man, it&rsquo;s scary how time flies), I&rsquo;ve been
doing quite a lot of thinking about Rust&rsquo;s trait system. I&rsquo;ve been
looking for a way to correct a number of flaws and shortcomings in the
current implementation, not the least of which is that its
performance is not that great. But also, I&rsquo;ve been wanting to get a
relatively clear, normative definition of how the trait system works,
so that we can better judge possible extensions. After a number of
false starts, I think I&rsquo;m finally getting somewhere, so I wanted to
start writing about it.</p>
<p>In this first post, I&rsquo;m just going to try and define a basic mapping
between Rust traits and an underlying logic. In follow-up posts, I&rsquo;ll
start talking about how to apply these ideas into an improved, more
capable trait implementation.</p>
<h3 id="rust-traits-and-logic">Rust traits and logic</h3>
<p>One of the first observations is that the Rust trait system is
basically a kind of logic. As such, we can map our struct, trait, and
impl declarations into logical inference rules. For the most part,
these are basically Horn clauses, though we&rsquo;ll see that to capture the
full richness of Rust &ndash; and in particular to support generic
programming &ndash; we have to go a bit further than standard Horn clauses.</p>
<p>If you&rsquo;ve never heard of Horn clauses, think Prolog. If you&rsquo;ve never
worked with Prolog, shame on you! Ok, I&rsquo;m just kidding, I&rsquo;ve just been
quite obsessed with Prolog lately so now I have to advocate studying
it to everyone (that and Smalltalk &ndash; well, and Rust of course
&#x1f609;). More seriously, if you&rsquo;ve never worked with Prolog, don&rsquo;t
worry, I&rsquo;ll try to explain some as we go. But you may want to keep the
wikipedia page loaded up. =)</p>
<p>Anyway, so, the mapping between traits and logic is pretty straightforward.
Imagine we declare a trait and a few impls, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nb">Clone</span> <span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We could map these declarations to some Horn clauses, written in a
Prolog-like notation, as follows:</p>
<pre tabindex="0"><code>Clone(usize).
Clone(Vec&lt;?T&gt;) :- Clone(?T).
</code></pre><p>In Prolog terms, we might say that <code>Clone(Foo)</code> &ndash; where <code>Foo</code> is some
Rust type &ndash; is a <em>predicate</em> that represents the idea that the type
<code>Foo</code> implements <code>Clone</code>. These rules are <em>program clauses</em> that state
the conditions under which that predicate can be proven (i.e.,
considered true). So the first rule just says &ldquo;Clone is implemented
for <code>usize</code>&rdquo;. The next rule says &ldquo;for any type <code>?T</code>, Clone is
implemented for <code>Vec&lt;?T&gt;</code> if Clone is implemented for <code>?T</code>&rdquo;. So
e.g. if we wanted to prove that <code>Clone(Vec&lt;Vec&lt;usize&gt;&gt;)</code>, we would do
so by applying the rules recursively:</p>
<ul>
<li><code>Clone(Vec&lt;Vec&lt;usize&gt;&gt;)</code> is provable if:
<ul>
<li><code>Clone(Vec&lt;usize&gt;)</code> is provable if:
<ul>
<li><code>Clone(usize)</code> is provable. (Which it is, so we&rsquo;re all good.)</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>But now suppose we tried to prove that <code>Clone(Vec&lt;Bar&gt;)</code>. This would
fail (after all, I didn&rsquo;t give an impl of <code>Clone</code> for <code>Bar</code>):</p>
<ul>
<li><code>Clone(Vec&lt;Bar&gt;)</code> is provable if:
<ul>
<li><code>Clone(Bar)</code> is provable. (But it is not, as there are no applicable rules.)</li>
</ul>
</li>
</ul>
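<p>The two proof attempts above can be mimicked with a few lines of Rust. This is only a sketch: the <code>Ty</code> enum and the <code>prove_clone</code> function are hypothetical names, and the search is hard-coded for the two <code>Clone</code> clauses rather than being a general Horn clause engine.</p>

```rust
/// A tiny, hypothetical model of the types used in the example.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Usize,
    Bar,          // a type with no `Clone` impl
    Vec(Box<Ty>), // Vec<T>
}

/// Tries to prove the predicate `Clone(ty)` from the two program clauses.
fn prove_clone(ty: &Ty) -> bool {
    match ty {
        // Clone(usize).
        Ty::Usize => true,
        // Clone(Vec<?T>) :- Clone(?T).
        Ty::Vec(elem) => prove_clone(elem),
        // No applicable clause, so the goal is not provable.
        Ty::Bar => false,
    }
}
```

<p>Each match arm corresponds to one program clause, and the recursive call plays the role of the subgoal on the right-hand side of <code>:-</code>.</p>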
<p>We can easily extend the example above to cover generic traits with
more than one input type. So imagine the <code>Eq&lt;T&gt;</code> trait, which declares
that <code>Self</code> is equatable with a value of type <code>T</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Eq</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">Eq</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Eq</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="nb">Eq</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>That could be mapped as follows:</p>
<pre tabindex="0"><code>Eq(usize, usize).
Eq(Vec&lt;?T&gt;, Vec&lt;?U&gt;) :- Eq(?T, ?U).
</code></pre><p>So far so good. However, as we&rsquo;ll see, things get a bit more
interesting when we start adding in notions like associated types,
higher-ranked trait bounds, struct/trait where clauses, coherence,
lifetimes, and so forth. =) I won&rsquo;t get to all of those items in this
post, but hopefully I&rsquo;ll cover them in follow-on posts.</p>
<h3 id="associated-types-and-type-equality">Associated types and type equality</h3>
<p>Let&rsquo;s start with associated types. Let&rsquo;s extend our example trait
to include an associated type or two:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">IntoIter</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">A</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Iterator</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Enumerate</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>::<span class="n">Item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We would map these into our pseudo-Prolog as follows:</p>
<pre tabindex="0"><code>// This is what we saw before:
Iterator(IntoIter&lt;?A&gt;).
Iterator(Enumerate&lt;?A&gt;) :- Iterator(?A).

// These clauses cover normalizing the associated type.
IteratorItem(IntoIter&lt;?A&gt;, ?A).
IteratorItem(Enumerate&lt;?T&gt;, (usize, &lt;?T as Iterator&gt;::Item)).
                                //  ^^^^^^^^^^^^^^^^^^^^^^
                                //  fully explicit reference to an associated type
</code></pre><p>You can see that we now have two kinds of clauses. <code>Iterator(T)</code> tells
us if <code>Iterator</code> is implemented for <code>T</code>. <code>IteratorItem(T, U)</code> tells us
that <code>T::Item</code> is equivalent to <code>U</code>.</p>
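<p>As a rough illustration of the two kinds of clauses, the sketch below encodes them as two Rust functions over a hypothetical <code>Ty</code> enum. Note one deliberate simplification, which is my assumption and not part of the mapping above: <code>iterator_item</code> eagerly resolves the inner <code>Item</code> instead of leaving it as an explicit reference to an associated type.</p>

```rust
/// Hypothetical model of the types in the example.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Usize,
    IntoIter(Box<Ty>),
    Enumerate(Box<Ty>),
    Tuple(Box<Ty>, Box<Ty>),
}

/// The predicate `Iterator(ty)`.
fn implements_iterator(ty: &Ty) -> bool {
    match ty {
        // Iterator(IntoIter<?A>).
        Ty::IntoIter(_) => true,
        // Iterator(Enumerate<?A>) :- Iterator(?A).
        Ty::Enumerate(inner) => implements_iterator(inner),
        _ => false,
    }
}

/// The predicate `IteratorItem(ty, ?U)`, returning the `?U` if one exists.
fn iterator_item(ty: &Ty) -> Option<Ty> {
    match ty {
        // IteratorItem(IntoIter<?A>, ?A).
        Ty::IntoIter(a) => Some((**a).clone()),
        // IteratorItem(Enumerate<?T>, (usize, <?T as Iterator>::Item)),
        // with the inner `Item` resolved right away in this sketch.
        Ty::Enumerate(t) => {
            let inner = iterator_item(t)?;
            Some(Ty::Tuple(Box::new(Ty::Usize), Box::new(inner)))
        }
        _ => None,
    }
}
```

<p>For <code>Enumerate&lt;IntoIter&lt;usize&gt;&gt;</code> this yields the tuple type <code>(usize, usize)</code>, and it yields nothing for a type such as <code>usize</code> that has no <code>Iterator</code> impl.</p>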
<p>And this brings us to an important point: we need to think about what
<em>equality</em> means in this logic. You can see that I&rsquo;ve been writing
Prolog-like notation but using Rust types; this might have seemed like
a notational convenience (and it is), but it actually masks something
deeper. The notion of equality for a Rust type is significantly richer
than Prolog&rsquo;s notion of equality, which is a very simple syntactic
unification.</p>
<p>In particular, imagine that I wanted to combine the <code>Clone</code> rules we
saw earlier with the <code>Iterator</code> definition we just saw, and I wanted
to prove something like <code>Clone(&lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item)</code>.
Intuitively, this should hold, because <code>&lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item</code> is defined to be <code>usize</code>, and we know that
<code>Clone(usize)</code> is provable. But if we were using a standard Prolog
engine, it wouldn&rsquo;t know anything about how to handle associated types
when it does proof search, and hence it could not use the clause
<code>Clone(usize)</code> to prove the goal <code>Clone(&lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item)</code>.</p>
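<p>The failure is easy to reproduce in a small model. In the sketch below (hypothetical names throughout), the derived <code>PartialEq</code> plays the role of Prolog&rsquo;s syntactic comparison on ground terms, and the associated type is kept as an unnormalized term.</p>

```rust
/// Hypothetical model of types, including an unnormalized projection.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Usize,
    IntoIter(Box<Ty>),
    /// `<T as Iterator>::Item`, kept as a plain term.
    Projection(Box<Ty>),
}

/// The clause `Clone(usize).`, matched purely syntactically: the goal
/// must be literally the term `usize`.
fn prove_clone_syntactic(goal: &Ty) -> bool {
    *goal == Ty::Usize
}
```

<p>Here <code>prove_clone_syntactic</code> accepts <code>usize</code> but rejects <code>&lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item</code>, because the two terms have different shapes even though Rust considers them the same type.</p>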
<h4 id="one-approach-rewrite-predicates-to-be-based-on-syntactic-equality">One approach: rewrite predicates to be based on syntactic equality</h4>
<p>One approach to solving this problem would be to define all of our
logic rules strictly in terms of syntactic equality. This approach is
sort of appealing because it means we could (in principle, anyway) run
the resulting rules on a standard Prolog engine. Ultimately, though, I
don&rsquo;t think it&rsquo;s the right way to think about things, but it is a
helpful building block for explaining the better way.</p>
<p>If we are using only a syntactic notion of equality, we can&rsquo;t just use
the same variable twice in order to equate types as we have been
doing. Instead, we have to systematically rewrite the rules we&rsquo;ve been
giving to use an auxiliary predicate <code>TypeEqual(T, U)</code>. This predicate
tells us when two Rust types are equal. This is what the rules that
result from the impl of <code>Iterator</code> for <code>IntoIter</code> might look like
written in this style:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="nb">Iterator</span><span class="p">(</span><span class="o">?</span><span class="n">A</span><span class="p">)</span><span class="w"> </span>:<span class="o">-</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">TypeEqual</span><span class="p">(</span><span class="o">?</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="n">IntoIter</span><span class="o">&lt;?</span><span class="n">B</span><span class="o">&gt;</span><span class="p">).</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">IteratorItem</span><span class="p">(</span><span class="o">?</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="o">?</span><span class="n">B</span><span class="p">)</span><span class="w"> </span>:<span class="o">-</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">TypeEqual</span><span class="p">(</span><span class="o">?</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="n">IntoIter</span><span class="o">&lt;?</span><span class="n">B</span><span class="o">&gt;</span><span class="p">).</span><span class="w">
</span></span></span></code></pre></div><p>Looking at the first rule, we say that <code>Iterator(?A)</code> is true for any
type <code>?A</code> that is equal to <code>IntoIter&lt;?B&gt;</code>. You can see that we avoided
directly equating <code>?A</code> and <code>IntoIter&lt;?B&gt;</code>.</p>
<p>The second rule is a bit more interesting: remember, intuitively, we
want to say that <code>IteratorItem(IntoIter&lt;?B&gt;, ?B)</code> &ndash; that is, we
want to &ldquo;pull out&rdquo; the type argument <code>?B</code> to <code>IntoIter</code> and repeat it.
But since we can&rsquo;t directly equate things, we accept any type <code>?A</code>
that can be found to be equal to <code>IntoIter&lt;?B&gt;</code>.</p>
<p>So let&rsquo;s look at how this <code>TypeEqual</code> thing would work. I&rsquo;ll just show
one way it could be defined, where you have a separate rule for each
kind of type:</p>
<pre tabindex="0"><code>// Rules for syntactic equality. If we JUST had these rules,
// then `TypeEqual` would be equivalent to standard
// Prolog unification.

TypeEqual(usize, usize).

TypeEqual(IntoIter&lt;?A&gt;, IntoIter&lt;?B&gt;) :-
    TypeEqual(?A, ?B).

TypeEqual(&lt;?A as Iterator&gt;::Item, &lt;?B as Iterator&gt;::Item) :-
    TypeEqual(?A, ?B).

// Normalization based rules. This is the rule that lets you
// rewrite an associated type to the type from the impl.

TypeEqual(&lt;?A as Iterator&gt;::Item, ?B) :-
    IteratorItem(?A, ?B).

TypeEqual(?B, &lt;?A as Iterator&gt;::Item) :-
    IteratorItem(?A, ?B).
</code></pre><p>The most interesting rules are the last two, which allow us to
normalize an associated type on either side. Now that we&rsquo;ve done this
rewriting, we can return to our original goal of proving
<code>Clone(&lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item)</code>, and we will find that it
is possible. The key difference is that the program clause <code>Clone(usize)</code>
would now be written <code>Clone(?A) :- TypeEqual(?A, usize)</code>. This means
that we are able to find a (rather convoluted) proof like so:</p>
<ul>
<li><code>Clone(&lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item)</code> is provable if:
<ul>
<li><code>TypeEqual(&lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item, usize)</code> is provable if:
<ul>
<li><code>IteratorItem(IntoIter&lt;usize&gt;, usize)</code> is provable if:
<ul>
<li><code>TypeEqual(IntoIter&lt;usize&gt;, IntoIter&lt;usize&gt;)</code> is provable if:
<ul>
<li><code>TypeEqual(usize, usize)</code> is provable. (Which it is, so we&rsquo;re all good.)</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
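<p>To make the rules above concrete, here is a minimal sketch of them in Rust, restricted to fully known types (no inference variables). The names <code>Ty</code>, <code>type_equal</code>, <code>iterator_item</code>, and <code>is_clone</code> are made up for this illustration and are not taken from chalk; each function simply mirrors one of the clauses shown earlier.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Toy model of the `TypeEqual` rules for ground types only.
// All names here are illustrative assumptions, not chalk's API.
#[derive(Clone, Debug)]
enum Ty {
    Usize,
    IntoIter(Box&lt;Ty&gt;),
    Item(Box&lt;Ty&gt;), // the projection `&lt;T as Iterator&gt;::Item`
}

fn type_equal(a: &amp;Ty, b: &amp;Ty) -&gt; bool {
    match (a, b) {
        // Rules for syntactic equality.
        (Ty::Usize, Ty::Usize) =&gt; true,
        (Ty::IntoIter(x), Ty::IntoIter(y)) =&gt; type_equal(x, y),
        // Two projections: try the syntactic rule, then normalize either side.
        (Ty::Item(x), Ty::Item(y)) =&gt; {
            type_equal(x, y) || iterator_item(x, b) || iterator_item(y, a)
        }
        // Normalization rules: a projection on exactly one side.
        (Ty::Item(x), other) | (other, Ty::Item(x)) =&gt; iterator_item(x, other),
        _ =&gt; false,
    }
}

// IteratorItem(?A, ?B) :- TypeEqual(?A, IntoIter&lt;?B&gt;).
fn iterator_item(a: &amp;Ty, b: &amp;Ty) -&gt; bool {
    type_equal(a, &amp;Ty::IntoIter(Box::new(b.clone())))
}

// Clone(?A) :- TypeEqual(?A, usize).
fn is_clone(a: &amp;Ty) -&gt; bool {
    type_equal(a, &amp;Ty::Usize)
}

fn main() {
    // &lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item
    let item = Ty::Item(Box::new(Ty::IntoIter(Box::new(Ty::Usize))));
    assert!(is_clone(&amp;item));
    // IntoIter&lt;usize&gt; itself is not equal to usize.
    assert!(!is_clone(&amp;Ty::IntoIter(Box::new(Ty::Usize))));
}
</code></pre><p>The first assertion follows the same chain of steps as the proof tree above. Because every type here is fully known, the sketch sidesteps the ambiguity that appears once inference variables are involved.</p>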
<p>So, we can see that this approach at least sort of works, but it has a
number of downsides. One problem is that we&rsquo;ve kind of inserted a
&ldquo;compilation&rdquo; step &ndash; the logic rules that we get from a trait/impl
now have to be transformed in this non-obvious way that makes them
look quite different from the source. One of the goals of this logic
translation is to help us understand and evaluate new additions to the
trait system; the further it strays from the Rust source, the less
helpful it will be for that purpose.</p>
<p>The other thing is that the whole reason to use syntactic equality
only was to get something a normal Prolog engine would understand, but
we don&rsquo;t really want to use a regular Prolog engine in the compiler
anyway, for a variety of reasons. And these rules in particular, at
least the way I wrote them here, cause a lot of problems for a regular
Prolog engine, because they introduce ambiguity into the proof
search. You could rewrite them in more complex ways, but then we&rsquo;re
straying even further from the simple logic we were looking for.</p>
<h4 id="another-approach-just-change-what-equality-means">Another approach: just change what equality means!</h4>
<p>Ultimately, a better approach is just to say that equality in our
logic includes a notion of normalization. That is, we can basically
take the same rules for type equality that we defined as <code>TypeEqual(A, B)</code> but move them into the <em>trait-solving engine itself</em> (or, depending
on your POV, into the metatheory of our logic). So now our trait
solver is defined in terms of the original, straightforward rules
that we&rsquo;ve been writing, but it&rsquo;s understood that when we equate
<code>usize</code> with <code>&lt;IntoIter&lt;usize&gt; as Iterator&gt;::Item</code>, that succeeds only
if we can recursively prove the predicate
<code>IteratorItem(IntoIter&lt;usize&gt;, usize)</code>. This ultimately is the
approach that I&rsquo;ve taken in my prototype: the trait solver itself has
a built-in notion of normalization and it always uses it when it is
doing unification. (The scheme I have implemented is what we have
sometimes called &ldquo;lazy normalization&rdquo; in the past.)</p>
<p>It may seem like this was always the obvious route to take. And I
suppose in a way it is. But part of why I resisted it for some time
was that I was searching out what is the <em>simplest</em> and most minimal
way to define the trait solver; so every notion that we can trivially
&ldquo;export&rdquo; into the logic rules is a win in that respect. But equality
is a bridge too far.</p>
<h4 id="an-aside-call-for-citations">An aside: call for citations</h4>
<p>As an aside, I&rsquo;d be curious to know if anyone has suggestions for
related work around this area of &ldquo;customizable equality&rdquo;. In
particular, I&rsquo;m not aware of logic languages that have to prove goals
to prove equality (though I got some leads at POPL last week that I
have yet to track down).</p>
<p>In a similar vein, I&rsquo;ve also been interested in strengthening the
notion of equality even further, so that we go beyond mere
normalization and include the ability to have arbitrary equality
constraints (e.g., <code>fn foo&lt;A&gt;() where A = i32</code>). The key to doing
this is solving a problem called &ldquo;congruence closure&rdquo; &ndash; and indeed
there exist good algorithms for doing that, and I&rsquo;ve implemented
<a href="http://www.alice.virginia.edu/~weimer/2011-6610/reading/nelson-oppen-congruence.pdf">one of them</a> in the <a href="http://github.com/nikomatsakis/ena">ena</a> crate that I&rsquo;m using to do
unification. However, combining this algorithm with the proof search
rules for trait solving, particularly with inference and higher-ranked
trait bounds, is non-trivial, and I haven&rsquo;t found a satisfying
solution to it yet. I would assume that more full-featured theorem
provers like Coq, Lean, Isabelle and so forth have some clever tricks
for tackling these sorts of problems, but I haven&rsquo;t graduated to
reading into those techniques yet, so citations here would be nice too
(though it may be some time before I follow up).</p>
<h3 id="type-checking-normal-functions">Type-checking normal functions</h3>
<p>OK, now that we have defined some logical rules that are able to
express when traits are implemented and to handle associated types,
let&rsquo;s turn our focus a bit towards <em>type-checking</em>. Type-checking is
interesting because it is what gives us the goals that we need to
prove. That is, everything we&rsquo;ve seen so far has been about how we
derive the rules by which we can prove goals from the traits and impls
in the program; but we are also interested in how to derive the goals
that we need to prove, and those come from type-checking.</p>
<p>Consider type-checking the function <code>foo()</code> here:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">bar</span>::<span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="o">&lt;</span><span class="n">U</span>: <span class="nb">Eq</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This function is very simple, of course: all it does is to call
<code>bar::&lt;usize&gt;()</code>. Now, looking at the definition of <code>bar()</code>, we can see
that it has one where-clause <code>U: Eq</code>. So, that means that <code>foo()</code> will
have to prove that <code>usize: Eq</code> in order to show that it can call <code>bar()</code>
with <code>usize</code> as the type argument.</p>
<p>If we wanted, we could write a Prolog predicate that defines the
conditions under which <code>bar()</code> can be called. We&rsquo;ll say that those
conditions are called being &ldquo;well-formed&rdquo;:</p>
<pre tabindex="0"><code>barWellFormed(?U) :- Eq(?U).
</code></pre><p>Then we can say that <code>foo()</code> type-checks if the reference to
<code>bar::&lt;usize&gt;</code> (that is, <code>bar()</code> applied to the type <code>usize</code>) is
well-formed:</p>
<pre tabindex="0"><code>fooTypeChecks :- barWellFormed(usize).
</code></pre><p>If we try to prove the goal <code>fooTypeChecks</code>, it will succeed:</p>
<ul>
<li><code>fooTypeChecks</code> is provable if:
<ul>
<li><code>barWellFormed(usize)</code>, which is provable if:
<ul>
<li><code>Eq(usize)</code>, which is provable because of an impl.</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>Ok, so far so good. Let&rsquo;s move on to type-checking a more complex function.</p>
<h3 id="type-checking-generic-functions-beyond-horn-clauses">Type-checking generic functions: beyond Horn clauses</h3>
<p>In the last section, we used standard Prolog Horn clauses (augmented with Rust&rsquo;s
notion of type equality) to type-check some simple Rust functions. But that only
works when we are type-checking non-generic functions. If we want to type-check
a generic function, it turns out we need a stronger notion of goal than Prolog
can provide. To see what I&rsquo;m talking about, let&rsquo;s revamp our previous
example to make <code>foo</code> generic:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Eq</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">bar</span>::<span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="o">&lt;</span><span class="n">U</span>: <span class="nb">Eq</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>To type-check the body of <code>foo</code>, we need to be able to hold the type
<code>T</code> &ldquo;abstract&rdquo;.  That is, we need to check that the body of <code>foo</code> is
type-safe <em>for all types <code>T</code></em>, not just for some specific type. We might express
this like so:</p>
<pre tabindex="0"><code>fooTypeChecks :-
  // for all types T...
  forall&lt;T&gt; {
    // ...if we assume that Eq(T) is provable...
    if (Eq(T)) {
      // ...then we can prove that `barWellFormed(T)` holds.
      barWellFormed(T)
    }
  }.
</code></pre><p>The notation I&rsquo;m using here is the one I&rsquo;ve been using in my
prototype implementation; it&rsquo;s similar to standard mathematical
notation but a bit Rustified. Anyway, the problem is that standard
Horn clauses don&rsquo;t allow universal quantification (<code>forall</code>) or
implication (<code>if</code>) in goals (though many Prolog engines do support
them, as an extension). For this reason, we need to accept something
called &ldquo;first-order hereditary Harrop&rdquo; (FOHH) clauses &ndash; this long
name basically means &ldquo;standard Horn clauses with <code>forall</code> and <code>if</code> in
the body&rdquo;. But it&rsquo;s nice to know the proper name, because there is a
lot of work describing how to efficiently handle FOHH clauses. I was
particularly influenced by Gopalan Nadathur&rsquo;s excellent
<a href="http://dl.acm.org/citation.cfm?id=868380">&ldquo;A Proof Procedure for the Logic of Hereditary Harrop Formulas&rdquo;</a>.</p>
<p>Anyway, I won&rsquo;t go into the details in this post, but suffice it to say
that supporting FOHH is not really all that hard. And once we are able
to do that, we can easily describe the type-checking rule for generic
functions like <code>foo</code> in our logic.</p>
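<p>As a rough illustration of what solving such goals can look like, here is a small sketch in Rust. It assumes a deliberately tiny logic: the only hypotheses are of the form <code>Eq(T)</code>, and the names <code>Goal</code>, <code>Solver</code>, and <code>Ty::Placeholder</code> are invented for this example rather than taken from chalk. A <code>forall</code> goal is proved by introducing a fresh placeholder type, and an <code>if</code> goal is proved by adding its hypothesis to the environment while the body is being proved.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Illustrative sketch only: a tiny solver for goals with `forall` and `if`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Ty {
    Usize,
    Placeholder(u32), // a fresh, abstract type introduced by `forall`
}

enum Goal {
    Eq(Ty),                                // Eq(T)
    BarWellFormed(Ty),                     // barWellFormed(T)
    IfEq(Ty, Box&lt;Goal&gt;),                   // if (Eq(T)) { goal }
    ForAll(Box&lt;dyn Fn(Ty) -&gt; Goal&gt;),       // forall&lt;T&gt; { goal }
}

struct Solver {
    hypotheses: Vec&lt;Ty&gt;, // types currently assumed to satisfy `Eq`
    next_placeholder: u32,
}

impl Solver {
    fn prove(&amp;mut self, goal: &amp;Goal) -&gt; bool {
        match goal {
            // `Eq(?U)` holds through the impl for `usize` or a hypothesis.
            Goal::Eq(ty) =&gt; *ty == Ty::Usize || self.hypotheses.contains(ty),
            // barWellFormed(?U) :- Eq(?U).
            Goal::BarWellFormed(ty) =&gt; self.prove(&amp;Goal::Eq(*ty)),
            // Assume the hypothesis only while proving the body.
            Goal::IfEq(hypothesis, body) =&gt; {
                self.hypotheses.push(*hypothesis);
                let result = self.prove(body);
                self.hypotheses.pop();
                result
            }
            // Prove the body for a type we know nothing else about.
            Goal::ForAll(make_body) =&gt; {
                let fresh = Ty::Placeholder(self.next_placeholder);
                self.next_placeholder += 1;
                self.prove(&amp;make_body(fresh))
            }
        }
    }
}

// fooTypeChecks :- forall&lt;T&gt; { if (Eq(T)) { barWellFormed(T) } }.
fn foo_type_checks() -&gt; Goal {
    Goal::ForAll(Box::new(|t: Ty| {
        Goal::IfEq(t, Box::new(Goal::BarWellFormed(t)))
    }))
}

fn main() {
    let mut solver = Solver { hypotheses: Vec::new(), next_placeholder: 0 };
    assert!(solver.prove(&amp;foo_type_checks()));
    // Without the `T: Eq` assumption, the call to `bar` is not well-formed.
    let unconstrained = Goal::ForAll(Box::new(|t: Ty| Goal::BarWellFormed(t)));
    assert!(!solver.prove(&amp;unconstrained));
}
</code></pre><p>The second goal in <code>main</code> corresponds to writing <code>fn foo&lt;T&gt;()</code> without the <code>T: Eq</code> bound, which is exactly the case the type-checker should reject.</p>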
<h3 id="conclusion-and-future-vision">Conclusion and future vision</h3>
<p>So, I&rsquo;m pretty excited about this work. I&rsquo;ll be posting plenty of
follow-up posts that dig into the details in the days to come, but I
want to take a moment in this post to lay out the long-term vision
that I&rsquo;m shooting for in a bit more depth.</p>
<p>Ultimately, what I am trying to develop is a kind of &ldquo;middle layer&rdquo;
for the Rust type system. That is, we can think of modeling Rust
semantics in three layers:</p>
<ul>
<li>Pure Rust syntax (the traits, impls, etc that you type)</li>
<li>Inference rules (the &ldquo;lowered&rdquo; form I&rsquo;ve been talking about in this
post)</li>
<li>Proof search engine (the trait solver in the compiler)</li>
</ul>
<p>Essentially, what makes the current compiler&rsquo;s trait solver complex is
that it omits the middle layer. This is exactly analogous to the way
that trans in the old compiler was complex because it tried to map
directly from the AST to LLVM&rsquo;s IR, instead of having an intermediate
step (what we now call MIR).</p>
<p>The goal of this work is then to puzzle out what piece of each
structure belongs at each layer such that each individual layer
remains quite simple, but the system still does what we expect. We saw
a bit of that in this post, where I sketched out why it is best to
include type equality in the layer of the &ldquo;proof search engine&rdquo; &ndash;
i.e., as part of how the inference rules are themselves interpreted &ndash;
rather than modeling it in the inference rules themselves. I think
I&rsquo;ve made a lot of progress here, as I&rsquo;ll try to lay out in follow-up
posts, but in some areas &ndash; particularly coherence! &ndash; I&rsquo;m not yet
sure of the right division.</p>
<p>For the moment, I&rsquo;ve been implementing things in Rust. You can view my
prototype solver in the <a href="http://github.com/nikomatsakis/chalk">chalk</a> repository. The code consists of a
<a href="https://github.com/nikomatsakis/chalk/blob/master/chalk-rust-parse/src/parser.lalrpop">parser</a> for a <a href="https://github.com/nikomatsakis/chalk/blob/master/chalk-rust-parse/src/ast.rs">subset of Rust syntax</a>. It then <a href="https://github.com/nikomatsakis/chalk/blob/master/chalk-rust/src/lower/mod.rs">lowers this syntax</a>
into an <a href="https://github.com/nikomatsakis/chalk/blob/master/chalk-rust/src/ir/mod.rs">internal IR</a> that maps fairly cleanly to the things I&rsquo;ve been
showing you here. What I would like is for chalk to become the
<strong>normative implementation</strong> of the trait system: that is, chalk
basically would describe how the trait system is <em>supposed</em> to
behave. To this end, we would prioritize clean and simple code over
efficiency.</p>
<p>Once we have a normative implementation, that means that we could
evaluate the complexity of RFCs that aim to extend the trait system by
implementing them in the normative codebase <em>first</em>, so that we can
uncover any complications. As a proof of concept of that approach,
I&rsquo;ve implemented withoutboats&rsquo;s
<a href="https://github.com/rust-lang/rfcs/pull/1598">associated type constructor RFC</a>, which I will describe in a
future post (preview: it&rsquo;s very easy to do and works out beautifully;
in fact, it doesn&rsquo;t add <em>anything at all</em> to the logic, once you
consider the fact that we already support generic methods).</p>
<p>Separately, I&rsquo;d like to rewrite the trait system in the compiler to
use the same overall strategy as chalk is pioneering, but with a
more optimized implementation. I will say more in follow-up posts, but
I think that this strategy has a good chance of significantly
improving compile-times: it is much more amenable to caching and does
far less redundant work than the current codebase. Moreover, this
approach just seems much cleaner and more capable overall, so I would
expect we would be able to close out a number of open bugs related
to normalization as well as completing various advanced features, like
specialization. Win-win overall.</p>
<p>Once we have two implementations, I would like to check them against
one another. Basically the compiler would have a special mode in which
it forwards every goal that it tries to prove over to the normative
implementation, as well as solving the goal itself using the efficient
implementation. These two should yield the same results: if they fail
to do so, that&rsquo;s a bug somewhere (probably the compiler, but you never
know).</p>
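<p>The cross-checking mode might be sketched roughly as follows. Everything here is hypothetical: the <code>TraitSolver</code> trait, the two stand-in solvers, and the string encoding of goals exist only to show the shape of the check, and the &ldquo;optimized&rdquo; solver is deliberately given a gap so that a disagreement shows up.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Hypothetical sketch of checking a fast solver against a reference one.
trait TraitSolver {
    fn prove(&amp;self, goal: &amp;str) -&gt; bool;
}

struct Normative; // stand-in for the simple, normative implementation
struct Optimized; // stand-in for the efficient compiler implementation

impl TraitSolver for Normative {
    fn prove(&amp;self, goal: &amp;str) -&gt; bool {
        matches!(goal, "Eq(usize)" | "Clone(usize)")
    }
}

impl TraitSolver for Optimized {
    fn prove(&amp;self, goal: &amp;str) -&gt; bool {
        // Deliberately forgets `Clone(usize)` to demonstrate a mismatch.
        goal == "Eq(usize)"
    }
}

// Solve with the fast solver, but report an error if the reference disagrees.
fn prove_checked(
    fast: &amp;dyn TraitSolver,
    reference: &amp;dyn TraitSolver,
    goal: &amp;str,
) -&gt; Result&lt;bool, String&gt; {
    let got = fast.prove(goal);
    let expected = reference.prove(goal);
    if got == expected {
        Ok(got)
    } else {
        Err(format!(
            "solvers disagree on `{}`: fast = {}, reference = {}",
            goal, got, expected
        ))
    }
}

fn main() {
    assert_eq!(prove_checked(&amp;Optimized, &amp;Normative, "Eq(usize)"), Ok(true));
    assert_eq!(prove_checked(&amp;Optimized, &amp;Normative, "Eq(Foo)"), Ok(false));
    assert!(prove_checked(&amp;Optimized, &amp;Normative, "Clone(usize)").is_err());
}
</code></pre>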
<p>Finally, I think it should be possible to extract a more formal
description of the trait system from chalk, along the lines of what
I&rsquo;ve been sketching here. This would allow us to prove various
properties about the trait system as well as our proof search
algorithm (e.g., it&rsquo;d be nice to prove that the proof search strategy
we are using is sound and complete &ndash; meaning that it always finds
valid proofs, and that if there is a proof to be found, it will find
it).</p>
<p>This is way too much work to do on my own of course. I intend to focus
my efforts primarily on the compiler implementation, because I would
love to know if indeed I am correct and this is a massive improvement
to compilation time. But along the way I also plan to write up as many
mentoring bugs as I can, both in chalk and in the compiler itself. I
think this would be a really fun way to get into rustc hacking, and we
can always use more people who know their way around the trait system!</p>
<h3 id="comments">Comments?</h3>
<p>I started a <a href="https://internals.rust-lang.org/t/blog-series-lowering-rust-traits-to-logic/4673">thread on internals</a> to discuss this post and other
(forthcoming) posts in the series.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/><category scheme="https://smallcultfollowing.com/babysteps/categories/pl" term="pl" label="PL"/><category scheme="https://smallcultfollowing.com/babysteps/categories/chalk" term="chalk" label="Chalk"/></entry><entry><title type="html">Assigning blame to unsafe code</title><link href="https://smallcultfollowing.com/babysteps/blog/2017/01/22/assigning-blame-to-unsafe-code/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2017/01/22/assigning-blame-to-unsafe-code/</id><published>2017-01-22T00:00:00+00:00</published><updated>2017-01-22T00:00:00+00:00</updated><content type="html"><![CDATA[<p>While I was at POPL the last few days, I was reminded of an idea
regarding how to bring more structure to the unsafe code guidelines
process that I&rsquo;ve been kicking around lately, but which I have yet to
write about publicly. The idea is fresh on my mind because while at
POPL I realized that there is an interesting opportunity to leverage
the &ldquo;blame&rdquo; calculation techniques from gradual typing research. But
before I get to blame, let me back up and give some context.</p>
<h3 id="the-guidelines-should-be-executable">The guidelines should be executable</h3>
<p>I&rsquo;ve been thinking for some time that, whatever guidelines we choose,
we need to adopt the principle that they should be <strong>automatically
testable</strong>. By this I mean that we should be able to compile your
program in a special mode (&ldquo;sanitizer mode&rdquo;) which adds in extra
assertions and checks. These checks would dynamically monitor what
your program does to see if it invokes undefined behavior: if they
detect UB, then they will abort your program with an error message.</p>
<p>Plenty of sanitizers or sanitizer-like things exist for C, of course.
My personal favorite is <a href="http://valgrind.org/">valgrind</a>, but there are a number of
<a href="https://github.com/google/sanitizers">other examples</a> (the
<a href="https://golang.org/doc/articles/race_detector.html">data-race detector for Go</a>
also falls in a similar category). However, as far as I know, none of
the C sanitizers is able to detect the full range of undefined
behavior. Partly this is because C UB includes untestable (and, in my
opinion, overly aggressive) rules like &ldquo;every loop should do I/O or
terminate&rdquo;. I think we should strive for a <strong>sound and complete</strong>
sanitizer, meaning that we guarantee that if there is undefined
behavior, we will find it, and that we have no false positives.  We&rsquo;ll
see if that&rsquo;s possible. =)</p>
<p>The really cool thing about having the rules be executable (and
hopefully <em>efficiently</em> executable) is that, in the (paraphrased)
words of <a href="http://www.cs.utah.edu/~regehr/">John Regehr</a>, it changes
the problem of verifying safety from a formal one into a matter of
test coverage, and the latter is much better understood. My ultimate
goal is that, if you are the developer of an unsafe library, all you
have to do is to run <code>cargo test --unsafe</code> (or some such thing), and
all of the normal tests of your library will run but in a special
sanitizer mode where any undefined behavior will be caught and flagged
for you.</p>
<p>But I think there is one other important side-effect. I have been (and
remain) very concerned about the problem of programmers not
understanding (or even being aware of) the rules regarding correct
unsafe code. This is why I originally wanted a system like the Tootsie
Pop rules, where programmers have to learn as few things as possible.
But having an easy and effective way of testing for violations changes
the calculus here dramatically: <strong>I think we can likely get away with
much more aggressive rules if we can test for violations</strong>. To play on
John Regehr&rsquo;s words, this changes the problem from being one of having
to learn a bunch of rules to having to interpret error messages. <strong>But
for this to work well, of course, the error messages have to be
good.</strong> And that&rsquo;s where this idea comes in.</p>
<h3 id="proof-of-concept-miri">Proof of concept: miri</h3>
<p>As it happens, there is an existing project that is already doing a
limited form of the kind of checks I have in mind: <a href="https://github.com/solson/miri">miri</a>, the MIR
interpreter created by <a href="https://github.com/solson/">Scott Olson</a> and now with
<a href="https://github.com/solson/miri/graphs/contributors">significant contributions</a> by <a href="https://github.com/oli-obk">Oliver Schneider</a>. If you haven&rsquo;t seen
or tried miri, I encourage you to do so. It is very cool and
surprisingly capable &ndash; in particular, miri can not only execute safe
Rust, but also <strong>unsafe</strong> Rust (e.g., it is able to interpret the
definition of <code>Vec</code>).</p>
<p>The way it does this is to simulate the machine at a reasonably
low-level. So, for example, when you allocate memory, it stores that
as a kind of blob of bytes of a certain size. But it doesn&rsquo;t <em>only</em>
store bytes; rather, it tracks additional metadata about what has been
stored into various spots. For example, it knows whether memory has
been initialized or not, and it knows which bits are pointers (which
are stored opaquely, not with an actual address). This allows it to
interpret a lot of unsafe code, but it also allows it to detect
various kinds of errors.</p>
<h3 id="an-example">An example</h3>
<p>Let&rsquo;s start with a simple example of some bogus unsafe code.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">b</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="mi">22</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">innocent_looking_fn</span><span class="p">(</span><span class="o">&amp;</span><span class="n">b</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">b</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">innocent_looking_fn</span><span class="p">(</span><span class="n">b</span>: <span class="kp">&amp;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// This wicked little bit of code will take a borrowed
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `Box` and free it.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;**</span><span class="n">b</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Box</span>::<span class="n">from_raw</span><span class="p">(</span><span class="n">p</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">usize</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem here is that this &ldquo;innocent looking function&rdquo; claims to
borrow the box <code>b</code> but it actually frees it. So now when <code>main()</code>
comes along to execute <code>*b += 1</code>, the box <code>b</code> has been freed. This
situation is often called a &ldquo;dangling pointer&rdquo; in C land. We might expect
then that when you execute this program, something dramatic will happen,
but that is not (necessarily) the case:</p>
<pre tabindex="0"><code>&gt; rustc tests/dealloc.rs
&gt; ./dealloc
</code></pre><p>As you can see, I got no error or any other indication that something
went awry. This is because, internally, freeing the box just throws
its address on a list for later re-use. Therefore when I later make
use of that address, it&rsquo;s entirely possible that the memory is still
sitting there, waiting for me to use it, even if I&rsquo;m not supposed to.
This is part of what makes tracking down a &ldquo;use after free&rdquo; bug
incredibly frustrating: oftentimes, nothing goes wrong! (Until it
does.) It&rsquo;s also why we need some kind of <strong>sanitizer</strong> mode that will
do additional checks beyond what really happens at runtime.</p>
<h3 id="detecting-errors-with-miri">Detecting errors with miri</h3>
<p>But what happens when I run this through miri?</p>
<pre tabindex="0"><code>&gt; cargo run tests/dealloc.rs
    Finished dev [unoptimized + debuginfo] target(s) in 0.2 secs
     Running `target/debug/miri tests/dealloc.rs`
error: dangling pointer was dereferenced
 --&gt; tests/dealloc.rs:8:5
  |
8 |     *b += 1;
  |     ^^^^^^^
  |
note: inside call to main
 --&gt; tests/dealloc.rs:5:1
  |
5 |   fn main() {
  |  _^ starting here...
6 | |     let mut b = Box::new(22);
7 | |     evil(&amp;b);
8 | |     *b += 1;
9 | | }
  | |_^ ...ending here

error: aborting due to previous error
</code></pre><p>(First, before going further, let&rsquo;s just take a minute to be impressed
by the fact that miri bothered to give us a nice stack trace here. I
had heard good things about miri, but before I started poking at it
for this blog post, I expected something a lot less polished. I&rsquo;m
impressed.)</p>
<p>You can see that, unlike the real computer, miri detected that <code>*b</code>
was freed when we tried to access it. It was able to do this because
when miri is interpreting your code, it does so with respect to a more
abstract model of how a computer works. In particular, when memory is
freed in miri, miri remembers that the address was freed, and if there
is a later attempt to access it, an error is thrown. (This is very
similar to what tools like <a href="http://valgrind.org/">valgrind</a> and <a href="http://elinux.org/Electric_Fence">electric fence</a> do as well.)</p>
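<p>The idea of remembering that an address was freed can be sketched with a toy abstract memory. This is only an illustration of the principle, not miri&rsquo;s actual data structures; <code>Memory</code>, <code>AllocId</code>, and <code>MemError</code> are names invented for the example. Allocations are never recycled: freeing one only sets a flag, and every later access checks that flag.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Toy abstract memory; an illustrative assumption, not miri's real design.
#[derive(Debug, PartialEq)]
enum MemError {
    DanglingPointer,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct AllocId(usize);

struct Allocation {
    value: usize,
    freed: bool,
}

#[derive(Default)]
struct Memory {
    allocations: Vec&lt;Allocation&gt;,
}

impl Memory {
    fn allocate(&amp;mut self, value: usize) -&gt; AllocId {
        self.allocations.push(Allocation { value, freed: false });
        AllocId(self.allocations.len() - 1)
    }

    // Every access goes through this check, so freed memory is never
    // silently reused.
    fn live_mut(&amp;mut self, id: AllocId) -&gt; Result&lt;&amp;mut Allocation, MemError&gt; {
        let alloc = &amp;mut self.allocations[id.0];
        if alloc.freed {
            Err(MemError::DanglingPointer)
        } else {
            Ok(alloc)
        }
    }

    fn free(&amp;mut self, id: AllocId) -&gt; Result&lt;(), MemError&gt; {
        self.live_mut(id)?.freed = true;
        Ok(())
    }

    fn add_assign(&amp;mut self, id: AllocId, amount: usize) -&gt; Result&lt;usize, MemError&gt; {
        let alloc = self.live_mut(id)?;
        alloc.value += amount;
        Ok(alloc.value)
    }
}

fn main() {
    let mut memory = Memory::default();
    let b = memory.allocate(22);          // let mut b = Box::new(22);
    memory.free(b).unwrap();              // innocent_looking_fn(&amp;b);
    let result = memory.add_assign(b, 1); // *b += 1;
    assert_eq!(result, Err(MemError::DanglingPointer));
}
</code></pre><p>On real hardware the freed slot might be handed out again and the write would appear to succeed; in this model the write always fails, because the record of the free is kept forever.</p>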
<p>So even just using miri out of the box, we see that we are starting to
get a certain amount of sanitizer rules. Whatever the unsafe code
guidelines turn out to be, one can be sure that they will declare it
illegal to access freed memory. As this example demonstrates, running
your code through miri could help you detect a violation.</p>
<h3 id="blame">Blame</h3>
<p>This example also illustrates another interesting point about a
sanitizer tool. The point where the error is <strong>detected</strong> is not
necessarily telling you which bit of code is <strong>at fault</strong>. In this
case, the error occurs in the safe code, but it seems clear that the
fault lies in the unsafe block in <code>innocent_looking_fn()</code>. That
function was supposed to present a safe interface, but it failed to do
so. Unfortunately, for us to figure that out, we have to trawl through
the code, executing backwards and trying to figure out how this freed
pointer got into the variable <code>b</code>. Speaking as someone who has spent
years of his life doing exactly that, I can tell you it is not fun.
Anything we can do to get a more precise notion of what code is at
fault would be tremendously helpful.</p>
<p>It turns out that there is a large body of academic work that I think
could be quite helpful here. For some time, people have been exploring
<a href="https://en.wikipedia.org/wiki/Gradual_typing"><strong>gradual typing</strong> systems</a>. This
is usually aimed at the software development process: people want to
be able to start out with a dynamically typed bit of software, and
then add types gradually. But it turns out when you do this, you have
a similar problem: your statically typed code is guaranteed to be
internally consistent, but the dynamically typed code might well feed
it values of the wrong types. To address this, <strong>blame systems</strong>
attempt to track where you crossed between the static and dynamic
typing worlds so that, when an error occurs, the system can tell you
which bit of code is at fault.</p>
<p><strong>UPDATE:</strong> It turns out that I got the history of blame wrong. While
blame is used in gradual typing work, it actually originates in the
more general setting of contract enforcement, specifically with
<a href="https://www.eecs.northwestern.edu/~robby/pubs/papers/behavioral-software-contracts.pdf">Robby Findler&rsquo;s thesis on Behavioral Software Contracts</a>. That&rsquo;s
what I get for writing on the plane without internet. =)</p>
<p>Traditionally this blame tracking has been done using proxies and
other dynamic mechanisms, particularly around closures. For example,
Jesse Tov&rsquo;s <a href="http://users.eecs.northwestern.edu/~jesse/pubs/dissertation/">Alms language</a> allocated stateful proxies to allow
for owned types to flow into a language that didn&rsquo;t understand
ownership (this is roughly analogous to dynamically wrapping a
value in a <code>RefCell</code>). Unfortunately, introducing proxies doesn&rsquo;t seem
like it would really work so well for a &ldquo;no runtime&rdquo; language like
Rust. We could probably get away with it in miri, but it would never
scale to running arbitrary C code.</p>
<p>Interestingly, at this year&rsquo;s POPL, I saw a paper that seemed to
present a solution to this problem. In
<a href="http://dl.acm.org/citation.cfm?id=3009849"><em>Big types in little runtime</em></a>, Michael Vitousek, Cameron
Swords (ex-Rust intern!), and Jeremy Siek describe a system for doing
gradual typing in Python that works even without modifying the Python
runtime &ndash; this rules out proxies, because the runtime would have to
know about them. Instead, the statically typed code keeps a log &ldquo;on
the side&rdquo; which tracks transitions to and from the unsafe code and
other important events. When a fault occurs, they can read this log
and reconstruct which bit of code is at fault. This seems eminently
applicable to this setting: we have control over the <em>safe Rust</em> code
(which we are compiling in a special mode), but we don&rsquo;t have to
modify the unsafe code (which might be in Rust, but might also be in
C). Exciting!</p>
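<p>To make the &ldquo;log on the side&rdquo; idea a bit more concrete, here is a minimal sketch in Rust. It is not the mechanism from the paper, and it is not something miri does; the names <code>Event</code>, <code>BlameLog</code>, and <code>blame</code> are hypothetical, and the blame rule (blame whichever unsafe region was active when the faulting address was freed) is an assumption made purely for illustration.</p>

```rust
/// Events that instrumented safe code appends to a side log.
/// All names here are hypothetical, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    /// Control crossed from safe code into the named unsafe region.
    EnterUnsafe(&'static str),
    /// Control returned from the named unsafe region to safe code.
    ExitUnsafe(&'static str),
    /// The allocation at this address was freed.
    Freed(usize),
}

#[derive(Default)]
struct BlameLog {
    events: Vec<Event>,
}

impl BlameLog {
    fn record(&mut self, event: Event) {
        self.events.push(event);
    }

    /// Given the address involved in a fault, replay the log and report
    /// the unsafe region that was active when that address was freed.
    /// Returns `None` if the address was never freed, or if it was freed
    /// outside of any unsafe region.
    fn blame(&self, faulting_addr: usize) -> Option<&'static str> {
        let mut active: Vec<&'static str> = Vec::new();
        let mut culprit = None;
        for event in &self.events {
            match event {
                Event::EnterUnsafe(name) => active.push(*name),
                Event::ExitUnsafe(_) => {
                    active.pop();
                }
                Event::Freed(addr) if *addr == faulting_addr => {
                    culprit = active.last().copied();
                }
                Event::Freed(_) => {}
            }
        }
        culprit
    }
}
```

<p>If the log records entering <code>innocent_looking_fn</code>, a free of some address, and then the exit, a later fault on that address gets attributed to <code>innocent_looking_fn</code> rather than to the safe code where the fault was observed.</p>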
<h3 id="conclusion">Conclusion</h3>
<p>This post has two purposes, in a way. First, I want to advocate for
the idea that we should define the unsafe code guidelines in an
<em>executable</em> way. Specifically, I think we should specify predicates
that must hold at various points in the execution. In this post we saw
a simple example: when you dereference a pointer, it must point to
memory that has been allocated and not yet freed. (Note that this
particular rule only applies to the moment at which the pointer is
dereferenced; at other times, the pointer can have any value you want,
though it may wind up being restricted by other rules.) It&rsquo;s much more
interesting to think about assertions that could be used to enforce
Rust&rsquo;s aliasing rules, but that&rsquo;s a good topic for another post.</p>
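<p>As a rough illustration of what an &ldquo;executable&rdquo; rule could look like, here is a toy model of the dereference predicate. It is only a sketch under heavy assumptions: <code>ToyMemory</code> and <code>Violation</code> are made-up names, addresses are plain integers, and nothing here reflects how miri or MIR actually represent memory.</p>

```rust
use std::collections::HashSet;

/// A toy model of memory that tracks only which addresses are
/// currently allocated. Hypothetical, for illustration only.
#[derive(Default)]
struct ToyMemory {
    live: HashSet<usize>,
    next_addr: usize,
}

#[derive(Debug, PartialEq)]
enum Violation {
    /// The address was never allocated, or has already been freed.
    DerefOfDeadMemory(usize),
}

impl ToyMemory {
    /// Allocate a fresh address and mark it live.
    fn alloc(&mut self) -> usize {
        self.next_addr += 1;
        self.live.insert(self.next_addr);
        self.next_addr
    }

    /// Free an address; it is no longer live afterwards.
    fn free(&mut self, addr: usize) {
        self.live.remove(&addr);
    }

    /// The predicate from the text: at the moment of dereference, the
    /// pointer must refer to memory that is allocated and not yet freed.
    fn check_deref(&self, addr: usize) -> Result<(), Violation> {
        if self.live.contains(&addr) {
            Ok(())
        } else {
            Err(Violation::DerefOfDeadMemory(addr))
        }
    }
}
```

<p>Note that merely holding a dangling address is not an error in this model; the check fires only when <code>check_deref</code> is called, which matches the point that the rule applies at the moment of dereference.</p>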
<p>Probably the best way for us to do this is to start out with a minimal
&ldquo;operational semantics&rdquo; for a representative subset of MIR (basically a
mathematical description of what MIR does) and then specify rules by
adding side-clauses and conditions into that semantics. I have been
talking to some people who might be interested in doing that, so I
hope to see progress here.</p>
<p>That said, it may be that we can instead do this exploratory work by
editing miri. The codebase seems pretty clean and capable, and a lot of the
base work is done.</p>
<p>In the long term, I expect we will want to instead target a platform
like <a href="http://valgrind.org/">valgrind</a>, which would allow us to apply these rules even
to unsafe C code. I&rsquo;m not sure if that&rsquo;s really feasible, but it seems
like the ideal.</p>
<p>The second purpose of the post is to note the connection with gradual
typing and the opportunity to apply blame research to the problem. I
am very excited about this, because I&rsquo;ve always felt that guidelines
based simply on undefined behavior were going to be difficult for
people to use, since errors are often detected in code that is
quite disconnected from the origin of the problem.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/unsafe" term="unsafe" label="Unsafe"/></entry><entry><title type="html">Parallel iterators, part 3: Consumers</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/11/14/parallel-iterators-part-3-consumers/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/11/14/parallel-iterators-part-3-consumers/</id><published>2016-11-14T00:00:00+00:00</published><updated>2016-11-14T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This post is the (long awaited, or at least long promised) third post
in my series on Rayon&rsquo;s parallel iterators. The previous two posts
were some time ago, but I&rsquo;ve been feeling inspired to push more on
Rayon lately, and I remembered that I had never finished this blog
post series.</p>
<p>Here is a list of the other posts in the series. If you haven&rsquo;t read
them, or don&rsquo;t remember them, you will want to do so before reading
this one:</p>
<ol>
<li>The first post, <a href="https://smallcultfollowing.com/babysteps/blog/2016/02/19/parallel-iterators-part-1-foundations/">&ldquo;Foundations&rdquo;</a>, explains how sequential
iterators work. It is also a nice introduction to some of the key
techniques for zero-cost abstraction.</li>
<li>The second post, <a href="https://smallcultfollowing.com/babysteps/blog/2016/02/25/parallel-iterators-part-2-producers/">&ldquo;Producers&rdquo;</a>, then shows how we can adapt
the sequential iterator approach to permit parallel iteration.  It
focuses on the concept of <strong>parallel producers</strong>: these are
basically splittable iterators. They give you the ability to say
&ldquo;break this producer into two producers, one of which produces the
left half, and one the right half&rdquo;. You can then process those two
halves in parallel. When the number of work items gets small
enough, you can convert a producer into a sequential iterator and
consume it sequentially.</li>
</ol>
<p>This third post will introduce <strong>parallel consumers</strong>. Parallel
consumers are the dual to a parallel producer: they abstract out the
parallel algorithm. We&rsquo;ll use this to extend beyond the <code>sum()</code> action
and cover how we can implement a <code>collect()</code> operation that
efficiently builds up a big vector of data.</p>
<p>(Note: originally, I had intended this third post to cover how
combinators like <code>filter()</code> and <code>flat_map()</code> work. These combinators
are special because they produce a variable number of
elements. However, in writing this post, it became clear that it would
be better to first introduce consumers, and then cover how to extend
them to support <code>filter()</code> and <code>flat_map()</code>.)</p>
<h3 id="motivating-example">Motivating example</h3>
<p>In this post, we&rsquo;ll cover two examples. The first will be the running
example from the previous two posts, a dot-product iterator chain:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">par_iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>After that, we&rsquo;ll look at a slight variation, where instead of summing
up the partial products, we collect them into a vector:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">vec1</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">par_iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">collect</span><span class="p">();</span><span class="w"> </span><span class="c1">// &lt;-- only thing different
</span></span></span></code></pre></div><h3 id="review-parallel-producers">Review: parallel producers</h3>
<p>In the <a href="https://smallcultfollowing.com/babysteps/blog/2016/02/25/parallel-iterators-part-2-producers/">second post</a>, I introduced the basics of how parallel
iterators work. The key idea was the <code>Producer</code> trait, which is a
variant on iterators that is amenable to &ldquo;divide-and-conquer&rdquo;
parallelization:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Producer</span>: <span class="nb">IntoIterator</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Divide into two producers, one of which produces data
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// with indices `0..index` and the other with indices `index..`.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Unlike normal iterators, which only support extracting one element at
a time, a parallel producer can be split into two &ndash; and this can
happen again and again. At some point, when you think you&rsquo;ve got small
enough pieces, you can convert it into an iterator (you see it extends
<code>IntoIterator</code>) and work sequentially.</p>
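<p>To make the trait a little less abstract, here is a minimal sketch of a producer over a borrowed slice. <code>SliceProducer</code> is a hypothetical type written for this example, not Rayon&rsquo;s actual slice producer, and the trait is repeated (with an explicit <code>Sized</code> bound) only so that the block compiles on its own.</p>

```rust
/// Simplified copy of the `Producer` trait shown above. The `Sized`
/// bound is added here so this standalone block compiles.
trait Producer: IntoIterator + Sized {
    /// Divide into two producers, one covering indices `0..index`
    /// and the other covering `index..`.
    fn split_at(self, index: usize) -> (Self, Self);
}

/// Hypothetical producer that yields the `i32` values of a slice.
struct SliceProducer<'a> {
    slice: &'a [i32],
}

impl<'a> IntoIterator for SliceProducer<'a> {
    type Item = i32;
    type IntoIter = std::iter::Copied<std::slice::Iter<'a, i32>>;

    /// Once the piece is small enough, fall back to sequential iteration.
    fn into_iter(self) -> Self::IntoIter {
        self.slice.iter().copied()
    }
}

impl<'a> Producer for SliceProducer<'a> {
    fn split_at(self, index: usize) -> (Self, Self) {
        // Splitting a slice is just pointer arithmetic; no data is copied.
        let (left, right) = self.slice.split_at(index);
        (SliceProducer { slice: left }, SliceProducer { slice: right })
    }
}
```

<p>Calling <code>SliceProducer { slice: &amp;data }.split_at(2)</code> yields two producers that can each be split again or turned into ordinary iterators with <code>into_iter()</code>.</p>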
<p>To see this in action, let&rsquo;s revisit the <code>sum_producer()</code> function
that <a href="https://smallcultfollowing.com/babysteps/blog/2016/02/25/parallel-iterators-part-2-producers/#implementing-sum-with-producers">I covered in my previous blog post</a>;
<code>sum_producer()</code> basically executes the <code>sum()</code> operation, but
draws its data from a producer. Later on in the post, we&rsquo;re going to
see how consumers abstract out the <em>sum</em> part of this code, leaving us
with a generic function that can be used to execute all sorts of
parallel iterator chains.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">sum_producer</span><span class="o">&lt;</span><span class="n">P</span><span class="o">&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">producer</span>: <span class="nc">P</span><span class="p">,</span><span class="w"> </span><span class="n">len</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span>
</span></span><span class="line"><span class="cl">    <span class="nc">where</span><span class="w"> </span><span class="n">P</span>: <span class="nc">Producer</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">if</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="no">THRESHOLD</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Input too large: divide it up
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">mid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="mi">2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left_producer</span><span class="p">,</span><span class="w"> </span><span class="n">right_producer</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">producer</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left_sum</span><span class="p">,</span><span class="w"> </span><span class="n">right_sum</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">rayon</span>::<span class="n">join</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="o">||</span><span class="w"> </span><span class="n">sum_producer</span><span class="p">(</span><span class="n">left_producer</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="o">||</span><span class="w"> </span><span class="n">sum_producer</span><span class="p">(</span><span class="n">right_producer</span><span class="p">,</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mid</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">left_sum</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">right_sum</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Input too small: sum sequentially
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">sum</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">producer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">sum</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">value</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">sum</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="enter-parallel-consumers">Enter parallel consumers</h3>
<p>What we would like to do in this post is to try and make an abstract
version of this <code>sum_producer()</code> function, one that can do all kinds
of parallel operations, rather than just summing up a list of numbers.
The way we do this is by introducing the notion of a <strong>parallel
consumer</strong>. Consumers represent the &ldquo;action&rdquo; at the end of the
iterator; they define what to do with each item that gets produced:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">           </span><span class="c1">// defines initial producer...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">par_iter</span><span class="p">())</span><span class="w"> </span><span class="c1">// ...wraps to make a new producer...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">  </span><span class="c1">// ...wraps again...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">                </span><span class="c1">// ...defines the consumer
</span></span></span></code></pre></div><p>The <code>Consumer</code> trait looks like this. You can see it has a few more
moving parts than producers.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// `Item` is the type of value that the producer will feed us.
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Consumer</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span>: <span class="nb">Send</span> <span class="o">+</span><span class="w"> </span><span class="nb">Sized</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Type of value that consumer produces at the end.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nb">Result</span>: <span class="nb">Send</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Splits the consumer into two consumers at `index`.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Also returns a *reducer* for combining their results afterwards.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nc">Reducer</span>: <span class="nc">Reducer</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="nb">Result</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Reducer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Convert the consumer into a *folder*, which can sequentially
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// process items one by one and produce a result.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nc">Folder</span>: <span class="nc">Folder</span><span class="o">&lt;</span><span class="n">Item</span><span class="p">,</span><span class="w"> </span><span class="nb">Result</span><span class="o">=</span><span class="bp">Self</span>::<span class="nb">Result</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">into_folder</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Folder</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The basic workflow for driving a producer/consumer pair is as follows:</p>
<ol>
<li>You start out with one producer/consumer pair; using <code>split_at()</code>,
these can be split into two pairs and then those pairs can be
processed in parallel. Splitting a consumer also returns something
called a <em>reducer</em>; we&rsquo;ll get to its role in a bit.</li>
<li>At some point, to process sequentially, you convert the producer
into an iterator using <code>into_iter()</code> and convert the consumer into
a <em>folder</em> using <code>into_folder()</code>. You then draw items from the
producer and feed them to the folder. At the end, the folder
produces a result (of type <code>C::Result</code>, where <code>C</code> is the consumer
type) and this is returned.</li>
<li>As we walk back up the stack, at each point where we had split the
consumer into two, we now have two results, which must be combined
using the <em>reducer</em> (also returned by <code>split_at()</code>).</li>
</ol>
<p>Let&rsquo;s take a closer look at the folder and reducer. Folders are
defined by <a href="https://github.com/nikomatsakis/rayon/blob/a0047facd2df584c771775bd8812c02f915e577c//src/par_iter/internal.rs#L60-L72">the <code>Folder</code> trait</a>, a simplified version of which
is shown below. They can be fed items one by one and, at the end,
produce some kind of result:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Folder</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nb">Result</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="sd">/// Consume next item and return new sequential state.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">consume</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">item</span>: <span class="nc">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="sd">/// Finish consuming items, produce final result.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">complete</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="nb">Result</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Of course, when we split, we will have two halves, both of which will
produce a result. Thus when a consumer splits, it also returns a
<em>reducer</em> that knows how to combine those results back
again. <a href="https://github.com/nikomatsakis/rayon/blob/a0047facd2df584c771775bd8812c02f915e577c//src/par_iter/internal.rs#L74-L78">The <code>Reducer</code> trait</a> is shown below. It just consists
of a single method <code>reduce()</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">Reducer</span><span class="o">&lt;</span><span class="nb">Result</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="sd">/// Reduce two final results into one; this is executed after a
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="sd">/// split.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">reduce</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">left</span>: <span class="nb">Result</span><span class="p">,</span><span class="w"> </span><span class="n">right</span>: <span class="nb">Result</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="generalizing-sum_producer">Generalizing <code>sum_producer()</code></h3>
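<p>Before generalizing, it may help to see the three traits instantiated for the sum case. The following is only a sketch: <code>SumConsumer</code>, <code>SumFolder</code>, <code>SumReducer</code>, and <code>sum_with_one_split</code> are hypothetical names, the traits are simplified copies of the ones shown above, and the two halves are folded one after the other rather than via <code>rayon::join</code> so that the block needs no external crate.</p>

```rust
// Simplified copies of the traits from the post, so the block stands alone.
trait Folder<Item> {
    type Result;
    fn consume(self, item: Item) -> Self;
    fn complete(self) -> Self::Result;
}

trait Reducer<Result> {
    fn reduce(self, left: Result, right: Result) -> Result;
}

trait Consumer<Item>: Send + Sized {
    type Result: Send;
    type Reducer: Reducer<Self::Result>;
    type Folder: Folder<Item, Result = Self::Result>;
    fn split_at(self, index: usize) -> (Self, Self, Self::Reducer);
    fn into_folder(self) -> Self::Folder;
}

// Hypothetical sum consumer, written for illustration only.
struct SumConsumer;
struct SumFolder {
    total: i32,
}
struct SumReducer;

impl Folder<i32> for SumFolder {
    type Result = i32;
    fn consume(self, item: i32) -> Self {
        SumFolder { total: self.total + item }
    }
    fn complete(self) -> i32 {
        self.total
    }
}

impl Reducer<i32> for SumReducer {
    fn reduce(self, left: i32, right: i32) -> i32 {
        left + right
    }
}

impl Consumer<i32> for SumConsumer {
    type Result = i32;
    type Reducer = SumReducer;
    type Folder = SumFolder;

    fn split_at(self, _index: usize) -> (Self, Self, SumReducer) {
        // Summing keeps no per-position state, so the index is ignored.
        (SumConsumer, SumConsumer, SumReducer)
    }

    fn into_folder(self) -> SumFolder {
        SumFolder { total: 0 }
    }
}

/// Drives a single split by hand: split, fold each half, then reduce.
fn sum_with_one_split(items: &[i32]) -> i32 {
    let mid = items.len() / 2;
    let (left_items, right_items) = items.split_at(mid);
    let (left_consumer, right_consumer, reducer) = SumConsumer.split_at(mid);
    let fold = |consumer: SumConsumer, items: &[i32]| {
        items
            .iter()
            .fold(consumer.into_folder(), |folder, &x| folder.consume(x))
            .complete()
    };
    // Sequential here; with Rayon these two folds would run in parallel.
    let left = fold(left_consumer, left_items);
    let right = fold(right_consumer, right_items);
    reducer.reduce(left, right)
}
```

<p>The point of the exercise is that nothing in <code>sum_with_one_split</code> other than the choice of <code>SumConsumer</code> is specific to summing; the split, fold, and reduce steps are the same for any consumer.</p>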
<p>In effect, the consumer abstracts out the &ldquo;parallel operation&rdquo; that
the iterator is going to perform. Armed with this consumer trait, we
can now revisit the <code>sum_producer()</code> method we saw before. That function
was specific to adding up a series of values, but we&rsquo;d like to produce
an abstract version that works for any consumer. In the Rayon source,
<a href="https://github.com/nikomatsakis/rayon/blob/a0047facd2df584c771775bd8812c02f915e577c//src/par_iter/internal.rs#L170-L197">this function is called <code>bridge_producer_consumer</code></a>. Here is a
simplified version. It is helpful to compare it to <code>sum_producer()</code>
from before; I&rsquo;ll include some &ldquo;footnote comments&rdquo; (like <code>[1]</code>, <code>[2]</code>)
to highlight those differences.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// `sum_producer` was specific to summing up a series of `i32`
</span></span></span><span class="line"><span class="cl"><span class="c1">// values, which produced another `i32` value. This version is generic
</span></span></span><span class="line"><span class="cl"><span class="c1">// over any producer/consumer. The consumer consumes `P::Item` (whatever
</span></span></span><span class="line"><span class="cl"><span class="c1">// the producer produces) and then the fn as a whole returns a
</span></span></span><span class="line"><span class="cl"><span class="c1">// `C::Result`.
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bridge_producer_consumer</span><span class="o">&lt;</span><span class="n">P</span><span class="p">,</span><span class="w"> </span><span class="n">C</span><span class="o">&gt;</span><span class="p">(</span><span class="n">len</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                  </span><span class="k">mut</span><span class="w"> </span><span class="n">producer</span>: <span class="nc">P</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                  </span><span class="k">mut</span><span class="w"> </span><span class="n">consumer</span>: <span class="nc">C</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                  </span>-&gt; <span class="nc">C</span>::<span class="nb">Result</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">P</span>: <span class="nc">Producer</span><span class="p">,</span><span class="w"> </span><span class="n">C</span>: <span class="nc">Consumer</span><span class="o">&lt;</span><span class="n">P</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">if</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="no">THRESHOLD</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Input too large: divide it up
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">mid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="mi">2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// As before, split the producer into two halves at the mid-point.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left_producer</span><span class="p">,</span><span class="w"> </span><span class="n">right_producer</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">producer</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Also divide the consumer into two consumers.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// This also gives us a *reducer* for later.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left_consumer</span><span class="p">,</span><span class="w"> </span><span class="n">right_consumer</span><span class="p">,</span><span class="w"> </span><span class="n">reducer</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">consumer</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Parallelize the processing of the left/right halves,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// producing two results.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left_result</span><span class="p">,</span><span class="w"> </span><span class="n">right_result</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">rayon</span>::<span class="n">join</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">||</span><span class="w"> </span><span class="n">bridge_producer_consumer</span><span class="p">(</span><span class="n">mid</span><span class="p">,</span><span class="w"> </span><span class="n">left_producer</span><span class="p">,</span><span class="w"> </span><span class="n">left_consumer</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">||</span><span class="w"> </span><span class="n">bridge_producer_consumer</span><span class="p">(</span><span class="n">len</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mid</span><span class="p">,</span><span class="w"> </span><span class="n">right_producer</span><span class="p">,</span><span class="w"> </span><span class="n">right_consumer</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Finally, reduce the two intermediate results.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// In `sum_producer`, this was `left_result + right_result`,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// but here we use the reducer.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">reducer</span><span class="p">.</span><span class="n">reduce</span><span class="p">(</span><span class="n">left_result</span><span class="p">,</span><span class="w"> </span><span class="n">right_result</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Input too small: process sequentially.
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Get a *folder* from the consumer.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// In `sum_producer`, this was `let mut sum = 0`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">folder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">consumer</span><span class="p">.</span><span class="n">into_folder</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Convert producer into sequential iterator.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Feed each item to the folder in turn.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// In `sum_producer`, this was `sum += item`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">producer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="n">folder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">folder</span><span class="p">.</span><span class="n">consume</span><span class="p">(</span><span class="n">item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Convert the folder into a result.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// In `sum_producer`, this was just `sum`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">folder</span><span class="p">.</span><span class="n">complete</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="implementing-the-consumer-for-sum">Implementing the consumer for <code>sum()</code></h3>
<p>Next, let&rsquo;s look at how one might implement the <code>sum</code> consumer, so
that we can use it with <code>bridge_producer_consumer()</code>. As before, we&rsquo;ll
just focus on a <code>sum</code> that works on <code>i32</code> values, to keep things
relatively simple. We&rsquo;ll start out by declaring a trio of types
(consumer, folder, and reducer).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">I32SumConsumer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// This type requires no state. This will be important
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// in the next post!
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">I32SumFolder</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Current sum thus far.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">sum</span>: <span class="kt">i32</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">I32SumReducer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// No state here either.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Next, let&rsquo;s implement the <code>Consumer</code> trait for <code>I32SumConsumer</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Consumer</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">I32SumConsumer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nc">Folder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I32SumFolder</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nc">Reducer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I32SumReducer</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nb">Result</span> <span class="o">=</span><span class="w"> </span><span class="kt">i32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Since we have no state, &#34;splitting&#34; just means making some
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// empty structs:
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">_index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Reducer</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">(</span><span class="n">I32SumConsumer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="n">I32SumConsumer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="n">I32SumReducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// Folder starts out with a sum of zero.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">into_folder</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Folder</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I32SumFolder</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">sum</span>: <span class="mi">0</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The folder is also very simple. It takes each value and
adds it to the current sum.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Folder</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">I32SumFolder</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nb">Result</span> <span class="o">=</span><span class="w"> </span><span class="kt">i32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">consume</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">item</span>: <span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// we take ownership of the current folder
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// at each step, and produce a new one
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// as the result:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I32SumFolder</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">sum</span>: <span class="nc">self</span><span class="p">.</span><span class="n">sum</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">complete</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">sum</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And, finally, the reducer just sums up two sums. The <code>self</code> goes
unused since our reducer doesn&rsquo;t have any state of its own.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Reducer</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">I32SumReducer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">reduce</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">left</span>: <span class="kt">i32</span><span class="p">,</span><span class="w"> </span><span class="n">right</span>: <span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">left</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">right</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="implementing-the-consumer-for-collect">Implementing the consumer for <code>collect()</code></h3>
<p>Now that we&rsquo;ve built up this generic framework for consumers, let&rsquo;s
put it to use by defining a second consumer. This time I want to
define how <code>collect()</code> works; just like in sequential iterators,
<code>collect()</code> allows users to accumulate the parallel items into a
collection. In this case, we&rsquo;re going to examine one particular
variant of <code>collect()</code>, which writes values into a vector:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">c</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">vec1</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">par_iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">collect</span><span class="p">();</span><span class="w"> </span><span class="c1">// &lt;-- only thing different
</span></span></span></code></pre></div><p>In fact, internally, Rayon&rsquo;s <code>collect()</code> for vectors is
<a href="https://github.com/nikomatsakis/rayon/blob/a0047facd2df584c771775bd8812c02f915e577c/src/par_iter/from_par_iter.rs#L39-L43">written in terms of a more efficient primitive</a>,
<code>collect_into()</code>. <code>collect_into()</code> takes a mutable reference to a
vector and stores the results in there: this allows you to re-use a
pre-existing vector and avoid allocation overheads. It&rsquo;s particularly
good for <a href="https://en.wikipedia.org/wiki/Double_buffering">double buffering</a> scenarios. To use <code>collect_into()</code>
explicitly, one would write something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">c</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">vec1</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">par_iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="p">.</span><span class="n">collect_into</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">c</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p><code>collect_into()</code> first ensures that the vector has enough capacity for
the items in the iterator and then creates a particular consumer that,
for each item, will store it into the appropriate place in the vector.</p>
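<p>To make the buffer-reuse idea concrete, here is a minimal <em>sequential</em> sketch that uses only the standard library. The helper name <code>collect_products_into</code> is hypothetical and is not part of Rayon; it only illustrates the two steps described above: size the destination vector first, then store each item into its slot.</p>

```rust
/// Hypothetical, sequential stand-in for the `collect_into()` idea:
/// reuse `out` as the destination instead of allocating a fresh vector.
fn collect_products_into(a: &[i32], b: &[i32], out: &mut Vec<i32>) {
    // Like `zip`, stop at the shorter of the two inputs.
    let len = a.len().min(b.len());

    // Step 1: make sure the destination has exactly `len` slots.
    // `clear()` keeps the existing allocation, so `resize()` only
    // allocates if the old capacity was too small.
    out.clear();
    out.resize(len, 0);

    // Step 2: store each item into the appropriate place.
    for (slot, (x, y)) in out.iter_mut().zip(a.iter().zip(b.iter())) {
        *slot = x * y;
    }
}
```

<p>Calling this repeatedly with the same <code>out</code> vector overwrites the previous contents in place, which is the property that makes this style attractive for double buffering.</p>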
<p>We&rsquo;re going to walk through a simplified version of the
<code>collect_into()</code> consumer. This version will be specialized to vectors
of <code>i32</code> values; moreover, it&rsquo;s going to avoid any use of unsafe code
and just assume that the vector is initialized to the right length
(perhaps with <code>0</code> values). The <a href="https://github.com/nikomatsakis/rayon/blob/a0047facd2df584c771775bd8812c02f915e577c/src/par_iter/collect/consumer.rs">real version</a> works
for arbitrary types and avoids initialization by using a dab of unsafe
code (just about the only unsafe code in the parallel iterators part
of Rayon, actually).</p>
<p>Let&rsquo;s start with the type definitions for the consumer, folder, and
reducer. They look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">I32VecCollectConsumer</span><span class="o">&lt;</span><span class="na">&#39;c</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="na">&#39;c</span> <span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">i32</span><span class="p">],</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">I32VecCollectFolder</span><span class="o">&lt;</span><span class="na">&#39;c</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="na">&#39;c</span> <span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">i32</span><span class="p">],</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">I32VecCollectReducer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>These type definitions suggest an outline for how this is
going to work. When the consumer starts, it has a mutable slice of
integers that it will eventually store into (the <code>&amp;'c mut [i32]</code>); the
lifetime <code>'c</code> here represents the span of time in which the collection
is happening. Remember that in Rust a mutable reference is also a
<em>unique</em> reference, which means that we don&rsquo;t have to worry about
other threads reading or messing with our array while we store into
it.</p>
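<p>The uniqueness guarantee can be seen in isolation with a small sketch that uses only the standard library (scoped threads stand in for Rayon here, and the function name <code>fill_halves</code> is made up for illustration). Each thread receives its own <code>&amp;mut [i32]</code> half from <code>split_at_mut()</code>, so the compiler can check that the two writers never touch the same element:</p>

```rust
use std::thread;

/// Hypothetical helper: writes `0, 1, 2, ...` into `data` using two
/// threads, each holding a unique reference to a disjoint half.
fn fill_halves(data: &mut [i32]) {
    let mid = data.len() / 2;

    // Two non-overlapping mutable slices; the original `data`
    // borrow is unusable until both halves go out of scope.
    let (left, right) = data.split_at_mut(mid);

    // `thread::scope` joins both threads before returning, so the
    // borrowed slices cannot outlive the data they point into.
    thread::scope(|s| {
        s.spawn(|| {
            for (i, slot) in left.iter_mut().enumerate() {
                *slot = i as i32;
            }
        });
        s.spawn(|| {
            for (i, slot) in right.iter_mut().enumerate() {
                *slot = (mid + i) as i32;
            }
        });
    });
}
```

<p>No locks or atomics are needed: there is nothing to synchronize, because neither thread can observe the other&rsquo;s half.</p>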
<p>When the time comes to switch to the folder, we still have a slice to
store into, but now we also have an index. That tracks how many items we
have stored thus far.</p>
<p>Finally, the reducer struct is empty, because once the values are
stored, there really isn&rsquo;t any data to reduce. For collect, the
reduction step will just be a no-op.</p>
<p>OK, let&rsquo;s see how the consumer trait is implemented. The idea here is
simple: each time the consumer is split at some index <code>N</code>, it splits
its mutable slice into two halves at <code>N</code>, and returns two consumers, one with
each half:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;c</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Consumer</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">I32VecCollectConsumer</span><span class="o">&lt;</span><span class="na">&#39;c</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nc">Folder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I32VecCollectFolder</span><span class="o">&lt;</span><span class="na">&#39;c</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nc">Reducer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">I32VecCollectReducer</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// The &#34;result&#34; of a `collect_into()` is just unit.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// We are executing this for its side effects.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nb">Result</span> <span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Reducer</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Divide the slice into two halves at `index`:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">split_at_mut</span><span class="p">(</span><span class="n">index</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Construct the new consumers:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">(</span><span class="n">I32VecCollectConsumer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span>: <span class="nc">left</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">     </span><span class="n">I32VecCollectConsumer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span>: <span class="nc">right</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">     </span><span class="n">I32VecCollectReducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// When we convert to a folder, give over the slice and start
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// the index at 0.
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">into_folder</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Folder</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I32VecCollectFolder</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span>: <span class="nc">self</span><span class="p">.</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="mi">0</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The folder trait is also pretty simple. Each time we consume a new
integer, we&rsquo;ll store it into the slice and increment <code>index</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;c</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Folder</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">I32VecCollectFolder</span><span class="o">&lt;</span><span class="na">&#39;c</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">type</span> <span class="nb">Result</span> <span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">consume</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">item</span>: <span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">[</span><span class="bp">self</span><span class="p">.</span><span class="n">index</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">I32VecCollectFolder</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span>: <span class="nc">self</span><span class="p">.</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="nc">self</span><span class="p">.</span><span class="n">index</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">complete</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Finally, since <code>collect_into()</code> has no result, the &ldquo;reduction&rdquo; step
is just a no-op:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Reducer</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">I32VecCollectReducer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">reduce</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">_left</span>: <span class="p">(),</span><span class="w"> </span><span class="n">_right</span>: <span class="p">())</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="conclusion">Conclusion</h3>
<p>This post continued our explanation of how Rayon&rsquo;s parallel iterators
work. Whereas the <a href="https://smallcultfollowing.com/babysteps/blog/2016/02/25/parallel-iterators-part-2-producers/">previous post</a> introduced parallel
producers, this post showed how we can abstract out <strong>parallel
consumers</strong> as well. Parallel consumers basically represent the
&ldquo;parallel actions&rdquo; at the end of a parallel iterator, like <code>sum()</code> or
<code>collect()</code>.</p>
<p>Using parallel consumers allows us to have one common routine,
<code>bridge_producer_consumer()</code>, that is used to draw items from a
producer and feed them to a consumer. This routine thus defines
precisely the parallel logic itself, independent from any particular
parallel iterator. In future posts, we&rsquo;ll discuss a bit how that same
routine can also use some adaptive techniques to try and moderate
splitting overhead automatically and dynamically.</p>
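<p>To make that shape concrete, here is a rough sketch under stated assumptions; it is not Rayon&rsquo;s actual code. The function name <code>bridge_collect</code> and the <code>threshold</code> parameter are made up for illustration, the producer is hard-coded as an input slice, the consumer is the collect-into-slice consumer from this post, and <code>std::thread::scope</code> stands in for <code>join()</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::thread;

// Hypothetical, simplified stand-in for `bridge_producer_consumer()`:
// the &#34;producer&#34; is the `input` slice, the &#34;consumer&#34; is the `output` slice.
fn bridge_collect(input: &amp;[i32], output: &amp;mut [i32], threshold: usize) {
    assert_eq!(input.len(), output.len());

    // Small enough: fold sequentially, storing each item at the next index.
    if input.len() &lt;= threshold.max(1) {
        for (slot, &amp;item) in output.iter_mut().zip(input) {
            *slot = item;
        }
        return;
    }

    // Otherwise split producer and consumer at the same index...
    let mid = input.len() / 2;
    let (in_left, in_right) = input.split_at(mid);
    let (out_left, out_right) = output.split_at_mut(mid);

    // ...and process both halves in parallel (scoped threads instead of `join()`).
    thread::scope(|s| {
        s.spawn(|| bridge_collect(in_left, out_left, threshold));
        bridge_collect(in_right, out_right, threshold);
    });

    // The &#34;reduce&#34; step is a no-op: the halves wrote to disjoint parts of `output`.
}
</code></pre></div>
<p>Nothing in the sketch needs <code>unsafe</code>: <code>split_at_mut()</code> is what lets the two halves be written concurrently.</p>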
<p>I want to emphasize something about this post and the previous one:
you may have noticed a general lack of unsafe code. <strong>One of the very
cool things about Rayon is that the vast majority of the unsafety is
confined to the <code>join()</code> implementation.</strong> For the most part, the
parallel iterators just build on this new abstraction.</p>
<p>It is hard to overstate the benefits of confining unsafe code in this
way. For one thing, I&rsquo;ve caught a lot of bugs in the iterator code I
was writing. But even better, <strong>it means that it is relatively easy to
unit test and review parallel iterator PRs</strong>. We don&rsquo;t have to worry
about crazy data-race bugs that only crop up if we test for hours and
hours. It&rsquo;s enough to just make sure we use a variant of
<code>bridge_producer_consumer()</code> that splits very deeply, so that we test
the split/recombine logic.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/rayon" term="rayon" label="Rayon"/></entry><entry><title type="html">Associated type constructors, part 4: Unifying ATC and HKT</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/11/09/associated-type-constructors-part-4-unifying-atc-and-hkt/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/11/09/associated-type-constructors-part-4-unifying-atc-and-hkt/</id><published>2016-11-09T00:00:00+00:00</published><updated>2016-11-09T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This post is a continuation of my posts discussing the topic of
associated type constructors (ATC) and higher-kinded types (HKT):</p>
<ol>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2016/11/02/associated-type-constructors-part-1-basic-concepts-and-introduction/">The first post</a> focused on introducing the basic idea of
ATC, as well as introducing some background material.</li>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/">The second post</a> showed how we can use ATC to model HKT,
via the &ldquo;family&rdquo; pattern.</li>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2016/11/04/associated-type-constructors-part-3-what-higher-kinded-types-might-look-like/">The third post</a> did some exploration into what it would
mean to support HKT directly in the language, instead of modeling
them via the family pattern.</li>
<li>This post considers what it might mean if we had both ATC <em>and</em> HKT
in the language: in particular, whether those two concepts can be
unified, and at what cost.</li>
</ol>
<!-- more -->
<h3 id="unifying-hkt-and-atc">Unifying HKT and ATC</h3>
<p>So far we have seen &ldquo;associated-type constructors&rdquo; and &ldquo;higher-kinded
types&rdquo; as two distinct concepts. The question is, would it make sense
to try and <em>unify</em> these two, and what would that even mean?</p>
<p>Consider this trait definition:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Iterable</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Iter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In the ATC world-view, this trait definition would mean that you can
now specify a type like the following</p>
<pre tabindex="0"><code>&lt;T as Iterable&gt;::Iter&lt;&#39;a&gt;
</code></pre><p>Depending on what the type <code>T</code> and lifetime <code>'a</code> are, this might get
&ldquo;normalized&rdquo;. Normalization basically means to expand an associated
type reference using the types given in the appropriate impl. For
example, we might have an impl like the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Iterable</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">A</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">A</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std</span>::<span class="n">slice</span>::<span class="n">Iter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="n">A</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Iter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">as_slice</span><span class="p">().</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In that case, <code>&lt;Vec&lt;Foo&gt; as Iterable&gt;::Iter&lt;'x&gt;</code> could be <em>normalized</em>
to <code>std::slice::Iter&lt;'x, Foo&gt;</code>. This is basically exactly the same way
that associated type normalization works now, except that we have
additional type/lifetime parameters that are placed on the associated
item itself, rather than having all the parameters come from the trait
reference.</p>
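<p>For readers who want something that compiles, here is a sketch of the same idea in today&rsquo;s Rust, where this feature is available as generic associated types. It departs from the trait above in two ways, both of which are my assumptions rather than part of the original design: the iterator bound is written as <code>&amp;'a Self::Item</code>, since a slice iterator yields references, and the associated type carries a <code>where Self: 'a</code> clause. The helper <code>normalized</code> is hypothetical and exists only to show the normalization.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">trait Iterable {
    type Item;

    // Assumption: items are yielded by reference, so the bound is
    // `&amp;&#39;a Self::Item`; `where Self: &#39;a` keeps that reference well-formed.
    type Iter&lt;&#39;a&gt;: Iterator&lt;Item = &amp;&#39;a Self::Item&gt;
    where
        Self: &#39;a;

    fn iter&lt;&#39;a&gt;(&amp;&#39;a self) -&gt; Self::Iter&lt;&#39;a&gt;;
}

impl&lt;A&gt; Iterable for Vec&lt;A&gt; {
    type Item = A;
    type Iter&lt;&#39;a&gt; = std::slice::Iter&lt;&#39;a, A&gt; where Self: &#39;a;

    fn iter&lt;&#39;a&gt;(&amp;&#39;a self) -&gt; Self::Iter&lt;&#39;a&gt; {
        // Go through the slice explicitly so we don&#39;t call ourselves.
        self.as_slice().iter()
    }
}

// This only type-checks because `&lt;Vec&lt;u32&gt; as Iterable&gt;::Iter&lt;&#39;x&gt;`
// normalizes to `std::slice::Iter&lt;&#39;x, u32&gt;`.
fn normalized&lt;&#39;x&gt;(v: &amp;&#39;x Vec&lt;u32&gt;) -&gt; std::slice::Iter&lt;&#39;x, u32&gt; {
    let it: &lt;Vec&lt;u32&gt; as Iterable&gt;::Iter&lt;&#39;x&gt; = Iterable::iter(v);
    it
}
</code></pre></div>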
<h4 id="associated-type-constructors-as-functions">Associated type constructors as functions</h4>
<p>Another way to view an ATC is as a kind of function, where the
normalization process plays the role of evaluating the function when
applied to various arguments. In that light, <code>&lt;Vec&lt;Foo&gt; as Iterable&gt;::Iter</code> could be viewed as a &ldquo;type function&rdquo; with a signature
like <code>lifetime -&gt; type</code>; that is, a function which, given a
lifetime, produces a type:</p>
<pre tabindex="0"><code>&lt;Vec&lt;Foo&gt; as Iterable&gt;::Iter&lt;&#39;x&gt;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^
function                     argument
</code></pre><p>When I write it this way, it&rsquo;s natural to ask how such a function is
related to a <em>higher-kinded type</em>. After all, <code>lifetime -&gt; type</code> could
also be a <em>kind</em>, right? So perhaps we should think of <code>&lt;Vec&lt;Foo&gt; as Iterable&gt;::Iter</code> as a type of kind <code>lifetime -&gt; type</code>? What would that mean?</p>
<h3 id="limitations-on-what-can-be-used-in-an-atc-declaration">Limitations on what can be used in an ATC declaration</h3>
<p>Well, in the last post, we saw that, in order to ensure that inference
is tractable, HKT in Haskell comes with pretty strict limitations on
the kinds of &ldquo;type functions&rdquo; we can support. Whatever we choose to
adopt in Rust, it would imply that we need similar limitations on ATC
values that can be treated as higher-kinded.</p>
<p>That wouldn&rsquo;t affect the impl of <code>Iterable</code> for <code>Vec&lt;A&gt;</code> that we saw
earlier. But imagine that we wanted <code>Range&lt;u32&gt;</code>, the type of a range
like <code>0..22</code> over <code>u32</code>, to act as an <code>Iterable</code>. Now, ranges like <code>0..22</code>
are <em>already</em> iterable &ndash; so the type of an iterator could just be
<code>Self</code>, and <code>iter()</code> can effectively just be <code>clone()</code>. So you might
think you could just write:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Iterable</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Range</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">u32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Range</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//              ^^^^ doesn&#39;t use `&#39;a` at all
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iter</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Range</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, this impl would be illegal, because <code>Range&lt;u32&gt;</code> doesn&rsquo;t use
the parameter <code>'a</code>. Presuming we adopted the rule I suggested in the
previous post, every value for <code>Iter&lt;'a&gt;</code> would have to use the <code>'a</code>
exactly once, as the first lifetime argument.  So <code>Foo&lt;'a, u32&gt;</code> would
be ok, as would <code>&amp;'a Bar</code>, but <code>Baz&lt;'static, 'a&gt;</code> would not.</p>
<h3 id="working-around-this-limitation-with-newtypes">Working around this limitation with newtypes</h3>
<p>You could work around this limitation by introducing a newtype.
Something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">RangeIter</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">range</span>: <span class="nc">Range</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">dummy</span>: <span class="nc">PhantomData</span><span class="o">&lt;&amp;</span><span class="na">&#39;a</span><span class="w"> </span><span class="p">()</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                  ^^ need to use `&#39;a` somewhere
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We can then implement <code>Iterator</code> for <code>RangeIter&lt;'a&gt;</code> and just proxy
<code>next()</code> on to <code>self.range.next()</code>. But this is kind of a drag.</p>
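<p>Spelled out, the proxying looks roughly like this. The struct is the one shown above; the constructor <code>range_iter</code> is a hypothetical helper added only so the sketch can be exercised:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::marker::PhantomData;
use std::ops::Range;

struct RangeIter&lt;&#39;a&gt; {
    range: Range&lt;u32&gt;,
    dummy: PhantomData&lt;&amp;&#39;a ()&gt;,
}

impl&lt;&#39;a&gt; Iterator for RangeIter&lt;&#39;a&gt; {
    type Item = u32;

    // Pure boilerplate: forward to the wrapped range.
    fn next(&amp;mut self) -&gt; Option&lt;u32&gt; {
        self.range.next()
    }
}

// Hypothetical helper: what `iter()` would do for `Range&lt;u32&gt;`.
fn range_iter&lt;&#39;a&gt;(range: &amp;&#39;a Range&lt;u32&gt;) -&gt; RangeIter&lt;&#39;a&gt; {
    RangeIter { range: range.clone(), dummy: PhantomData }
}
</code></pre></div>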
<h3 id="an-alternative-give-users-the-choice">An alternative: give users the choice</h3>
<p>For a long time, I had assumed that if we were going to introduce HKT,
we would do so by letting users define the kinds more explicitly. So,
for example, if we wanted the member <code>Iter</code> to be of kind <code>lifetime -&gt; type</code>, we might declare that explicitly.  Using the <code>&lt;_&gt;</code> and <code>&lt;'_&gt;</code>
notation I was using in earlier posts, that might look like this:</p>
<pre tabindex="0"><code>trait Iterable {
    type Iter&lt;&#39;_&gt;;
}
</code></pre><p>Now the trait has declared that impls must supply a valid, partially
applied struct/enum name as the value for <code>Iter</code>.</p>
<p>I&rsquo;ve somewhat soured on this idea, for a variety of reasons. One big
one is that we are forcing trait users to make this choice up front,
when it may not be obvious whether a HKT or an ATC is the better fit.
And of course it&rsquo;s a complexity cost: now there are two things to
understand.</p>
<p>Finally, now that I realize that HKT is going to require bounds, not
having names for things means it&rsquo;s hard to see how we&rsquo;re going to
declare those bounds. In fact, even the <code>Iterable</code> trait probably has
some bounds; you can&rsquo;t just use <strong>any old</strong> lifetime for the
iterator. So really the trait probably includes a condition that
<code>Self: 'iter</code>, meaning that the iterable thing must outlive the
duration of the iteration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Iterable</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="bp">Self</span>: <span class="na">&#39;iter</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- bound I was missing before
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="why-focus-on-associated-items">Why focus on associated items?</h3>
<p>You might wonder why I said that we should consider <code>&lt;T as Iterable&gt;::Iter</code> to have kind <code>lifetime -&gt; type</code> rather than saying
that <code>Iterable::Iter</code> would be something of kind <code>type -&gt; lifetime -&gt; type</code>. In other words, what about the input types to the trait itself?</p>
<p>It turns out that this idea doesn&rsquo;t really make sense. First off, it
would naturally affect existing associated types. So <code>Iterator::Item</code>,
for example, would be something of kind <code>type -&gt; type</code>, where the
argument is the type of the iterator. <code>&lt;Range&lt;u32&gt; as Iterator&gt;::Item</code>
would be the syntax for <em>applying</em> <code>Iterator::Item</code> to <code>Range&lt;u32&gt;</code>.
Since we can write generic functions with higher-kinded parameters
like <code>fn foo&lt;I&lt;_&gt;&gt;()</code>, that means that <code>I</code> here might be
<code>Iterator::Item</code>, and hence <code>I&lt;Range&lt;u32&gt;&gt;</code> would be equivalent to
<code>&lt;Range&lt;u32&gt; as Iterator&gt;::Item</code>.</p>
<p>But remember that, to make inference tractable, we want to know that
<code>?X&lt;Foo&gt; = ?Y&lt;Foo&gt;</code> if and only if <code>?X = ?Y</code>. That means that we could
not allow <code>&lt;Range&lt;u32&gt; as Iterator&gt;::Item</code> to normalize to the same
thing as <code>&lt;Range&lt;u32&gt; as SomeOtherTrait&gt;::Foo</code>. You can see that this
doesn&rsquo;t even remotely resemble associated types as we know them, which
are just plain one-way functions.</p>
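<p>You can see the one-way behavior in Rust as it exists today. In this small sketch (the alias and function names are invented for the example), two different associated types are applied to the same argument and normalize to the same result, which is exactly what the injectivity rule would forbid:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::ops::Range;

// Two different &#34;type functions&#34; applied to the same argument.
type ViaIterator = &lt;Range&lt;u32&gt; as Iterator&gt;::Item;
type ViaIntoIterator = &lt;Range&lt;u32&gt; as IntoIterator&gt;::Item;

// Compiles only because both projections normalize to `u32`.
fn same_type(x: ViaIterator) -&gt; ViaIntoIterator {
    x
}
</code></pre></div>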
<h3 id="conclusions">Conclusions</h3>
<p>This is kind of the &ldquo;capstone&rdquo; post for the series that I set out to
write.  I&rsquo;ve tried to give an overview of
<a href="https://smallcultfollowing.com/babysteps/blog/2016/11/02/associated-type-constructors-part-1-basic-concepts-and-introduction/">what associated type constructors are</a>; the
<a href="https://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/">ways that they can model higher-kinded patterns</a>;
<a href="https://smallcultfollowing.com/babysteps/blog/2016/11/04/associated-type-constructors-part-3-what-higher-kinded-types-might-look-like/">what higher-kinded types are</a>; and now what it might mean if
we tried to combine the two ideas.</p>
<p>I hope to continue this series a bit further, though, and in
particular to try and explore some case studies and further
thoughts. If you&rsquo;re interested in the topic, I strongly encourage you
to
<a href="https://internals.rust-lang.org/t/blog-post-series-alternative-type-constructors-and-hkt/4300/">hop over to the internals thread</a>
and take a look. There have been a lot of insightful comments there.</p>
<p>That said, currently my thinking is this:</p>
<ul>
<li>Associated type constructors are a natural extension to the
language. They &ldquo;fit right in&rdquo; syntactically with associated types.</li>
<li>Despite that, ATC would represent a huge step up in expressiveness,
and open the door to richer traits. This could be particularly important
for many libraries, such as futures.
<ul>
<li>I know that Rayon had to bend over backwards in some places because we lack
any way to express an &ldquo;iterable-like&rdquo; pattern.</li>
</ul>
</li>
<li>Higher-kinded types as expressed in Haskell are not very suitable for Rust:
<ul>
<li>they don&rsquo;t cover bounds, which we need;</li>
<li>the limitation to &ldquo;partially applied&rdquo; struct/enum names is not a natural fit,
even if we loosen it somewhat.</li>
</ul>
</li>
<li>Moreover, adding HKT to the language would be a big complexity jump:
<ul>
<li>to use Rust, you already have to understand associated types, and ATC is not much more;</li>
<li>but adding to that rules and associated syntax for HKT feels like
a lot to ask.</li>
</ul>
</li>
</ul>
<p><strong>So currently I lean towards accepting ATC with no restrictions and
modeling HKT using families.</strong> That said, I agree that this has the potential
to feel like a lot of &ldquo;boilerplate&rdquo;. I sort of suspect that, in
practice, HKT would require a fair amount of its own boilerplate (i.e.,
to abstract away bounds and so forth), and/or not be suitable for
Rust, but perhaps further exploration of example use-cases will be
instructive in this regard.</p>
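<p>For a sense of what that boilerplate amounts to, here is a minimal sketch of the family pattern written with generic associated types as they exist in Rust today. The trait and type names (<code>CollectionFamily</code>, <code>VecFamily</code>, <code>VecDequeFamily</code>, <code>lengths</code>) are invented for this example and do not come from the earlier posts:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::collections::VecDeque;

// Hypothetical family trait: each implementor names a type constructor
// through an associated type constructor.
trait CollectionFamily {
    type Member&lt;T&gt;: Default + Extend&lt;T&gt;;
}

struct VecFamily;
impl CollectionFamily for VecFamily {
    type Member&lt;T&gt; = Vec&lt;T&gt;;
}

struct VecDequeFamily;
impl CollectionFamily for VecDequeFamily {
    type Member&lt;T&gt; = VecDeque&lt;T&gt;;
}

// Generic over the family, i.e. over `Vec&lt;_&gt;` versus `VecDeque&lt;_&gt;`.
fn lengths&lt;F: CollectionFamily&gt;(words: &amp;[&amp;str]) -&gt; F::Member&lt;usize&gt; {
    let mut out: F::Member&lt;usize&gt; = Default::default();
    out.extend(words.iter().map(|w| w.len()));
    out
}
</code></pre></div>
<p>The cost is one marker struct and one impl per type constructor; in exchange, the bounds (<code>Default + Extend&lt;T&gt;</code> here) have an obvious place to live.</p>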
<h3 id="comments">Comments</h3>
<p>Please leave comments on <a href="https://internals.rust-lang.org/t/blog-post-series-alternative-type-constructors-and-hkt/4300/">this internals thread</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/atc" term="atc" label="ATC"/><category scheme="https://smallcultfollowing.com/babysteps/categories/hkt" term="hkt" label="HKT"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/></entry><entry><title type="html">Associated type constructors, part 3: What higher-kinded types might look like</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/11/04/associated-type-constructors-part-3-what-higher-kinded-types-might-look-like/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/11/04/associated-type-constructors-part-3-what-higher-kinded-types-might-look-like/</id><published>2016-11-04T00:00:00+00:00</published><updated>2016-11-04T00:00:00+00:00</updated><content type="html"><![CDATA[<p>This post is a continuation of my posts discussing the topic of
associated type constructors (ATC) and higher-kinded types (HKT):</p>
<ol>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2016/11/02/associated-type-constructors-part-1-basic-concepts-and-introduction/">The first post</a> focused on introducing the basic idea of
ATC, as well as introducing some background material.</li>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/">The second post</a> showed how we can use ATC to model HKT,
via the &ldquo;family&rdquo; pattern.</li>
<li>This post dives into what it would mean to support HKT directly
in the language, instead of modeling them via the family pattern.</li>
</ol>
<!-- more -->
<h3 id="the-story-thus-far-a-quick-recap">The story thus far (a quick recap)</h3>
<p>In the previous posts, we introduced a basic <code>Collection</code> trait
that used ATC to support an <code>iterate()</code> method:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">empty</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">Item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iterate</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span>: <span class="nc">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And then we were discussing this function <code>floatify</code>, which converts a
collection of integers to a collection of floats. We started with a
basic version using ATC:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify</span><span class="o">&lt;</span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">I</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">I</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">F</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>However, this version does not constrain the inputs and outputs to be
the same &ldquo;sort&rdquo; of collection. For example, it can be used to convert
a <code>Vec&lt;i32&gt;</code> to a <code>List&lt;f32&gt;</code>. Sometimes that is desirable, but maybe
not. To compensate, we augmented <code>Collection</code> with an associated
&ldquo;family&rdquo; trait, so that if we have (say) a <code>Foo&lt;i32&gt;</code>, we can convert
to a <code>Foo&lt;f32&gt;</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// as before
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Family</span>: <span class="nc">CollectionFamily</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">CollectionFamily</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Member</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This let us write a <code>floatify_family</code> like so, which does enforce that
the input and output collections belong to the same &ldquo;family&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_family</span><span class="o">&lt;</span><span class="n">C</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">C</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">C</span>::<span class="n">Family</span>::<span class="n">Member</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">C</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="c1">//    ^^^^^^^^^^^^^^^^^ another collection,
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">                            </span><span class="c1">//                      in same family
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>A common question in response to the previous post was whether the
<code>CollectionFamily</code> trait was actually <em>necessary</em>. The answer is that
it is not: one could also have augmented the <code>Collection</code> trait to
just have a <code>Sibling</code> member:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Sibling</span><span class="o">&lt;</span><span class="n">AnotherItem</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">AnotherItem</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And then we could write <code>floatify_sibling</code> as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_sibling</span><span class="o">&lt;</span><span class="n">C</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">C</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">C</span>::<span class="n">Sibling</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">C</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="c1">//     ^^^^^^^^^^^^^^^ another collection,
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">                            </span><span class="c1">//                     in same family
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>For some more thoughts on that, see <a href="https://internals.rust-lang.org/t/blog-post-series-alternative-type-constructors-and-hkt/4300/28?u=nikomatsakis">my comment on internals</a>.</p>
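<p>For reference, here is a minimal sketch of how the family pattern recapped above could be written with generic associated types, the feature that grew out of RFC 1598 and is available in stable Rust since 1.65. The names <code>VecFamily</code> and <code>for_each</code> are assumptions made for this sketch (<code>for_each</code> stands in for <code>iterate()</code> so that no lifetimes are needed); treat it as an illustration, not a definitive encoding:</p>

```rust
// Simplified stand-in for the `Collection` trait from the recap.
trait Collection<Item> {
    // The family that this collection type belongs to.
    type Family: CollectionFamily;

    fn empty() -> Self;
    fn add(&mut self, value: Item);
    // Assumption: replaces `iterate()`; calls `f` on each item by reference.
    fn for_each<F: FnMut(&Item)>(&self, f: F);
}

trait CollectionFamily: Sized {
    // Maps an item type to the member of this family that stores it.
    type Member<Item>: Collection<Item, Family = Self>;
}

// Hypothetical family type standing for "`Vec` without its `<T>`".
struct VecFamily;

impl CollectionFamily for VecFamily {
    type Member<Item> = Vec<Item>;
}

impl<Item> Collection<Item> for Vec<Item> {
    type Family = VecFamily;

    fn empty() -> Self {
        Vec::new()
    }

    fn add(&mut self, value: Item) {
        self.push(value);
    }

    fn for_each<F: FnMut(&Item)>(&self, mut f: F) {
        for item in self.iter() {
            f(item);
        }
    }
}

// The return type is "another collection in the same family as `C`".
fn floatify_family<C>(ints: &C) -> <C::Family as CollectionFamily>::Member<f32>
where
    C: Collection<i32>,
{
    let mut floats =
        <<C::Family as CollectionFamily>::Member<f32> as Collection<f32>>::empty();
    ints.for_each(|i: &i32| floats.add(*i as f32));
    floats
}
```

<p>With these definitions, calling <code>floatify_family</code> on a <code>Vec&lt;i32&gt;</code> can only produce a <code>Vec&lt;f32&gt;</code>, because the output type is computed from the input&rsquo;s family.</p>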
<p>In any case, where I want to go today is to start exploring what it
might mean to encode this family pattern directly into the language
itself. This is what people typically mean when they talk about
<em>higher-kinded types</em>.</p>
<h3 id="supporting-families-directly-in-the-language-via-hkt">Supporting families directly in the language via HKT</h3>
<p>The family trait idea is very powerful, but in a way it&rsquo;s a bit
indirect. Now for each collection type (e.g., <code>List&lt;T&gt;</code>), we wind up
adding another &ldquo;family type&rdquo; (<code>ListFamily</code>) that effectively
corresponds to the <code>List</code> part without the <code>&lt;T&gt;</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">type</span> <span class="nc">Family</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ListFamily</span><span class="p">;</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ListFamily</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">CollectionFamily</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ListFamily</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The idea of HKT is that we can make it possible to just refer to <code>List</code>
(without providing a <code>&lt;T&gt;</code>), instead of introducing a &ldquo;family type&rdquo;.
So for example we might write <code>floatify_hkt()</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_hkt</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">I</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">I</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//              ^^^^ the notation `I&lt;_&gt;` signals that `I` is
</span></span></span><span class="line"><span class="cl"><span class="c1">//                   not a complete type
</span></span></span></code></pre></div><p>Here you see that we declared a different kind of parameter <code>I</code> &ndash;
normally <code>I</code> would represent a complete type, like <code>List&lt;i32&gt;</code>. But
because we wrote <code>I&lt;_&gt;</code> (I&rsquo;m pilfering a bit from
<a href="http://blogs.atlassian.com/2013/09/scala-types-of-a-higher-kind/">Scala&rsquo;s syntax</a> here), we have declared that <code>I</code> represents a
<em>type constructor</em>, meaning something like <code>List</code>. To be a bit more
explicit, I&rsquo;m going to write <code>List&lt;_&gt;</code>, where the <code>_</code> indicates an
&ldquo;unspecified&rdquo; type parameter.</p>
<p>So this signature is effectively saying that it takes as input a
<code>I&lt;i32&gt;</code> (for some <code>I</code>) and returns an <code>I&lt;f32&gt;</code> &ndash; the intention is to
mean that it takes a collection of integers and returns the same sort
of collection, but applied to floats (so, e.g., <code>I</code> might be mapped to
<code>List&lt;_&gt;</code> or <code>Vec&lt;_&gt;</code>, yielding <code>List&lt;i32&gt;/List&lt;f32&gt;</code> or
<code>Vec&lt;i32&gt;/Vec&lt;f32&gt;</code> respectively). But is that what it really says?
It turns out that this question is a bit more subtle than you might
think; let&rsquo;s dig in.</p>
<h3 id="trait-bounds-higher-ranked-and-otherwise">Trait bounds, higher-ranked and otherwise</h3>
<p>The first thing to notice is that <code>floatify_hkt()</code> is missing some
where-clauses. In particular, nowhere do we declare that <code>I&lt;i32&gt;</code> is
supposed to be a collection. To do that, we would need something like
this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_hkt</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">I</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">I</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="k">for</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">I</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//    ^^^^^^^^^^^^^^^^^^^^^^^^^^ &#34;higher-ranked trait bound&#34;
</span></span></span></code></pre></div><p>Here I am using the &ldquo;higher-ranked trait bounds (HRTB) applied to types&rdquo;
introduced by <a href="https://github.com/rust-lang/rfcs/pull/1598">RFC 1598</a>, and discussed in
<a href="https://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/">the previous post</a>. Basically we are saying that <code>I&lt;T&gt;</code> is
always a <code>Collection</code>, regardless of what <code>T</code> is.</p>
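<p>The <code>I&lt;_&gt;</code> parameter and the <code>for&lt;T&gt;</code> bound are both hypothetical syntax. As a rough approximation that compiles in stable Rust, a bound on a generic associated type can express the same &ldquo;for every <code>T</code>&rdquo; claim. In the sketch below, <code>Constructor</code>, <code>Of</code>, <code>VecOf</code>, and <code>for_each</code> are names I made up for illustration:</p>

```rust
trait Collection<Item> {
    fn empty() -> Self;
    fn add(&mut self, value: Item);
    // Assumption: a simplified replacement for `iterate()`.
    fn for_each<F: FnMut(&Item)>(&self, f: F);
}

// Hypothetical stand-in for a type constructor such as `List<_>`:
// `I::Of<T>` plays the role of `I<T>`.
trait Constructor {
    // Stated once, but it must hold for every `T`. This is the job that
    // `for<T> I<T>: Collection<T>` does in the signature above.
    type Of<T>: Collection<T>;
}

// Stands in for `Vec<_>`.
struct VecOf;

impl Constructor for VecOf {
    type Of<T> = Vec<T>;
}

impl<T> Collection<T> for Vec<T> {
    fn empty() -> Self {
        Vec::new()
    }

    fn add(&mut self, value: T) {
        self.push(value);
    }

    fn for_each<F: FnMut(&T)>(&self, mut f: F) {
        for item in self.iter() {
            f(item);
        }
    }
}

// Takes an `I<i32>` and returns an `I<f32>` for the same constructor `I`.
fn floatify_hkt<I: Constructor>(ints: &I::Of<i32>) -> I::Of<f32> {
    let mut floats = <I::Of<f32> as Collection<f32>>::empty();
    ints.for_each(|i: &i32| floats.add(*i as f32));
    floats
}
```

<p>One visible cost of this encoding is that the constructor usually has to be named explicitly at the call site, as in <code>floatify_hkt::&lt;VecOf&gt;(&amp;ints)</code>.</p>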
<p>So we just saw that we need HRTB to declare that any type <code>I&lt;T&gt;</code> is a
collection (otherwise, we just know it is some type). But (as far as I
know) Haskell doesn&rsquo;t have anything like HRTB &ndash; in Haskell, trait
bounds cannot be higher-ranked, so you could only write a declaration
that uses explicit types, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_hkt</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">I</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">I</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">I</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="n">I</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span></code></pre></div><p>In this case, that&rsquo;s a perfectly adequate declaration. But in some
cases, being forced to write out explicit types like this can cause
you to expose information in your interface you might otherwise prefer
to keep secret. Consider this function <code>process()</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, which takes a
collection of inputs (of type <code>Input</code>) and returns a collection of
outputs (of type &ndash; wait for it &ndash; <code>Output</code>). The interesting thing
about this function is that, internally, it creates a temporary
collection of some intermediate type called <code>MyType</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">inputs</span>: <span class="kp">&amp;</span><span class="nc">I</span><span class="o">&lt;</span><span class="n">Input</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">I</span><span class="o">&lt;</span><span class="n">Output</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">I</span><span class="o">&lt;</span><span class="n">Input</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">Input</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">I</span><span class="o">&lt;</span><span class="n">Output</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">Output</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">struct</span> <span class="nc">MyType</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// create an intermediate collection for some reason or other
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">shapes</span>: <span class="nc">I</span><span class="o">&lt;</span><span class="n">MyType</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">inputs</span><span class="p">.</span><span class="n">iterate</span><span class="p">().</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">p</span><span class="o">|</span><span class="w"> </span><span class="o">..</span><span class="p">.).</span><span class="n">collect</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//              ^^^^^^^^ wait, how do I know I&lt;MyType&gt; is a collection?
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now you can see the problem! We know that <code>I&lt;Input&gt;</code> is a collection,
and we know that <code>I&lt;Output&gt;</code> is a collection, but without some form of
HRTB, we can&rsquo;t declare that <code>I&lt;MyType&gt;</code> is a collection without moving
<code>MyType</code> outside of the fn body<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. So being able to say something like
&ldquo;<code>I&lt;T&gt;</code> is a collection no matter what <code>T</code> is&rdquo; is actually crucial to
our ability to encapsulate the internal processing that we are doing.</p>
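<p>To make the encapsulation point concrete, here is a sketch in stable Rust where the &ldquo;for every <code>T</code>&rdquo; bound lives on a generic associated type. Everything besides <code>process</code>, <code>Input</code>, <code>Output</code>, and <code>MyType</code> is an assumption of mine (<code>Constructor</code>, <code>Of</code>, <code>VecOf</code>, <code>for_each</code>, and the arithmetic in the body). The point is only that <code>MyType</code> can stay private to the function:</p>

```rust
trait Collection<Item> {
    fn empty() -> Self;
    fn add(&mut self, value: Item);
    // Assumption: a simplified replacement for `iterate()`.
    fn for_each<F: FnMut(&Item)>(&self, f: F);
}

// Hypothetical stand-in for a type constructor `I<_>`.
trait Constructor {
    // `Of<T>` is a collection no matter what `T` is.
    type Of<T>: Collection<T>;
}

struct VecOf;

impl Constructor for VecOf {
    type Of<T> = Vec<T>;
}

impl<T> Collection<T> for Vec<T> {
    fn empty() -> Self {
        Vec::new()
    }

    fn add(&mut self, value: T) {
        self.push(value);
    }

    fn for_each<F: FnMut(&T)>(&self, mut f: F) {
        for item in self.iter() {
            f(item);
        }
    }
}

// Placeholder payloads; the post leaves these types unspecified.
struct Input(u32);

#[derive(Debug, PartialEq)]
struct Output(u32);

fn process<I: Constructor>(inputs: &I::Of<Input>) -> I::Of<Output> {
    // Private to the function body; it never appears in the signature.
    struct MyType {
        doubled: u32,
    }

    // No extra where-clause is needed to know `I::Of<MyType>` is a collection.
    let mut temp = <I::Of<MyType> as Collection<MyType>>::empty();
    inputs.for_each(|input: &Input| temp.add(MyType { doubled: input.0 * 2 }));

    let mut outputs = <I::Of<Output> as Collection<Output>>::empty();
    temp.for_each(|m: &MyType| outputs.add(Output(m.doubled + 1)));
    outputs
}
```

<p>If the signature instead had to list <code>I&lt;Input&gt;: Collection&lt;Input&gt;</code> and <code>I&lt;Output&gt;: Collection&lt;Output&gt;</code> one by one, the intermediate collection of <code>MyType</code> would need its own clause, and <code>MyType</code> would have to be nameable from outside.</p>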
<p>So, if Haskell lacks HRTB, how do they handle a case like this anyway?</p>
<h3 id="higher-kinded-self-types">Higher-kinded self types</h3>
<p>If you have higher-kinded types at your disposal, you can use them to
achieve something very similar to higher-ranked trait bounds, but we
would have to change how we defined our <code>Collection</code> trait. Currently,
we have a trait <code>Collection&lt;T&gt;</code> which is defined for some collection
type <code>C</code>; the type <code>C</code> is then considered a collection of items of
type <code>T</code>. So for example <code>C</code> might be <code>List&lt;Foo&gt;</code> (in which case <code>T</code>
would be <code>Foo</code>). The new idea would be to redefine <code>Collection</code> to be
defined over <em>collection type constructors</em> (like <code>List&lt;_&gt;</code>). So we
might write something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">HkCollection</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="bp">Self</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//    ^^               ^^^^^^^ declare that `Self` is a type constructor
</span></span></span><span class="line"><span class="cl"><span class="c1">//    stands for &#34;higher-kinded&#34;
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">empty</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="bp">Self</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//    ^^^ the `T` effectively moved from the trait to the methods
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now I might implement this not for <code>List&lt;T&gt;</code> but rather for <code>List&lt;_&gt;</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">HkCollection</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">empty</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">List</span>::<span class="n">new</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And, finally, instead of writing <code>where for&lt;T&gt; I&lt;T&gt;: Collection&lt;T&gt;</code>,
we can write <code>where I: HkCollection</code>. Note that here I bounded <code>I</code>,
not <code>I&lt;_&gt;</code>, since I am applying this trait not to any particular
<em>type</em>, but rather to the type <em>constructor</em>.</p>
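<p>Rust has no <code>Self&lt;_&gt;</code>, but the shape of this trait can be approximated with a marker type per constructor and a generic associated type. In this sketch, <code>Of</code>, <code>VecCtor</code>, <code>for_each</code>, and <code>floatify_hk</code> are hypothetical names; note how <code>T</code> appears on each method rather than on the trait:</p>

```rust
// Hypothetical encoding of `trait HkCollection for Self<_>`: the
// implementing type is a marker for the constructor, and `Self::Of<T>`
// plays the role of `Self<T>`.
trait HkCollection {
    type Of<T>;

    // `T` lives on the methods, not on the trait.
    fn empty<T>() -> Self::Of<T>;
    fn add<T>(this: &mut Self::Of<T>, value: T);
    // Assumption: added so the sketch can read a collection back.
    fn for_each<T, F: FnMut(&T)>(this: &Self::Of<T>, f: F);
}

// Marker standing in for `Vec<_>`.
struct VecCtor;

impl HkCollection for VecCtor {
    type Of<T> = Vec<T>;

    fn empty<T>() -> Vec<T> {
        Vec::new()
    }

    fn add<T>(this: &mut Vec<T>, value: T) {
        this.push(value);
    }

    fn for_each<T, F: FnMut(&T)>(this: &Vec<T>, mut f: F) {
        for item in this {
            f(item);
        }
    }
}

// The bound is on the constructor `I` itself, as in `where I: HkCollection`.
fn floatify_hk<I: HkCollection>(ints: &I::Of<i32>) -> I::Of<f32> {
    let mut floats = I::empty::<f32>();
    I::for_each::<i32, _>(ints, |i: &i32| I::add::<f32>(&mut floats, *i as f32));
    floats
}
```

<p>The explicit <code>::&lt;i32, _&gt;</code> and <code>::&lt;f32&gt;</code> annotations are there to keep inference simple in the sketch; a real design would presumably aim for something lighter.</p>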
<p>At first it may appear that these two setups are analogous, but <strong>it
turns out that the &ldquo;higher-kinded self types&rdquo; approach has some pretty
big limitations</strong>. Perhaps the most obvious is that it rules out
collections like <code>BitSet</code>, which can only store values of one particular
type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">HkCollection</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">BitSet</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                    ^^^^^^ not a type constructor
</span></span></span></code></pre></div><p>Note that with the older, non-higher-kinded collection trait, we could
easily do something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">BitSet</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The same problem also confronts collections like <code>HashSet</code> or
<code>BTreeSet</code> that require bounds &ndash; that is, even though these are
generic types, you can&rsquo;t actually make a <code>HashSet</code> of just any old
type <code>T</code>. It must be a <code>T: Hash</code>. In other words, when I write
something like <code>Self&lt;_&gt;</code>, I am actually leaving out some important
information about what kinds of types the <code>_</code> can be:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">HkCollection</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">HashSet</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                    ^^^^^^^ how can we restrict `_` to `Hash` types?
</span></span></span></code></pre></div><p>In Haskell, at least, if I have a HKT, that means I can apply this
type constructor to <strong>any</strong> type and get a result. But all collections
in Rust tend to apply <em>some</em> bounds on that. For example, <code>Vec&lt;T&gt;</code> and
<code>List&lt;T&gt;</code> both (implicitly) require that <code>T: Sized</code>. Or, if you have
<code>HashMap&lt;K,V&gt;</code>, you might consider it to be a collection of pairs <code>(K, V)</code>, except that it only works if <code>K: Hash + Eq + Sized</code> and <code>V: Sized</code>.</p>
<p>So, really, if we did want to support a syntax like <code>Foo&lt;_&gt;</code>, we would
actually need some way of constraining this <code>_</code>.</p>
<h3 id="spjs-type-classes-exploring-the-design-space">SPJ&rsquo;s &ldquo;Type Classes: Exploring the Design Space&rdquo;</h3>
<p>Naturally, Haskell has encountered all of these problems as well.  One
of my favorite papers is
<a href="http://research.microsoft.com/en-us/um/people/simonpj/Papers/type-class-design-space/">&ldquo;Type Classes: Exploring the Design Space&rdquo; by Jones et al.</a>,
published way back in 1997. They motivate &ldquo;multiparameter type
classes&rdquo; (which in Rust would be &ldquo;generic traits&rdquo; like
<code>Collection&lt;T&gt;</code>) by reviewing the various shortcomings of traits
defined with a higher-kinded <code>Self</code> type (like <code>HkCollection</code>):</p>
<ul>
<li>Section 2.1, &ldquo;Overloading with coupled parameters&rdquo; basically talks
about the idea that impls might not always apply to <em>all</em> types.  So
something like <code>impl Collection&lt;usize&gt; for BitSet</code> is a simple
example &ndash; if you choose the &ldquo;collection family&rdquo; to be <code>BitSet</code>, you
are then forced to pick <code>usize</code> as your element type.
<ul>
<li>In these situations, it is often (but not always) the case that
the &ldquo;second&rdquo; parameter could (and perhaps should) be an associated
type. For example, we might have changed <code>trait Collection&lt;Item&gt; { ... }</code> to <code>trait Collection { type Item; ... }</code>. This would have
meant that, for any given collection type, there is a fixed <code>Item</code>
type.
<ul>
<li>So, for example, the <code>BitSet</code> impl that applied to any integral
type would be illegal, because the type <code>BitSet</code> alone does not
define the item type <code>T</code>:
<ul>
<li><code>impl&lt;T: Integer&gt; Collection for BitSet { type Item = T; ... }</code></li>
</ul>
</li>
<li>I talked some about this tradeoff in the &ldquo;Things I learned&rdquo;
section from <a href="http://smallcultfollowing.com/babysteps/blog/2016/02/25/parallel-iterators-part-2-producers/">my post on Rayon</a>; the rule of thumb I
describe there seems to suggest <code>Collection&lt;T&gt;</code> would be better,
though I think you could argue it the other way. We&rsquo;ll have to
experiment.</li>
</ul>
</li>
</ul>
</li>
<li>Section 2.2, &ldquo;Overloading with constrained parameters&rdquo; covers the
problem of wanting constraints like <code>T: Sized</code> or <code>T: Hash</code>. In
Haskell, the <code>Sized</code> bound isn&rsquo;t necessary, but certainly things
like <code>HashSet&lt;T&gt;</code> wanting <code>T: Hash</code> still apply.</li>
</ul>
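<p>To make the tradeoff concrete, here is a minimal sketch in stable Rust. The traits <code>Collection&lt;Item&gt;</code> and <code>AssocCollection</code>, the toy <code>BitSet</code>, and the helpers <code>insert_bit</code> and <code>count</code> are all hypothetical simplifications made up for illustration; they are not the definitions from the earlier posts:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical, simplified traits for illustration; they are not the
// definitions used elsewhere in this series.

/// Generic-trait form: the element type is an input to the trait, so one
/// type may implement it for several element types.
trait Collection&lt;Item&gt; {
    fn add(&amp;mut self, item: Item);
}

/// Associated-type form: each implementing type fixes exactly one `Item`.
trait AssocCollection {
    type Item;
    fn add_item(&amp;mut self, item: Self::Item);
}

/// A toy bit set over the values 0..64.
#[derive(Default)]
struct BitSet {
    bits: u64,
}

impl BitSet {
    fn insert_bit(&amp;mut self, bit: usize) {
        assert!(bit &lt; 64, "toy BitSet only holds values below 64");
        self.bits |= 1u64 &lt;&lt; bit;
    }

    fn count(&amp;self) -&gt; u32 {
        self.bits.count_ones()
    }
}

// With the generic trait, `BitSet` can be a collection of more than one
// integral type.
impl Collection&lt;usize&gt; for BitSet {
    fn add(&amp;mut self, item: usize) {
        self.insert_bit(item);
    }
}

impl Collection&lt;u8&gt; for BitSet {
    fn add(&amp;mut self, item: u8) {
        self.insert_bit(usize::from(item));
    }
}

// With the associated type, we must commit to a single element type.
// A second `impl AssocCollection for BitSet` would be rejected as a
// conflicting implementation.
impl AssocCollection for BitSet {
    type Item = usize;
    fn add_item(&amp;mut self, item: usize) {
        self.insert_bit(item);
    }
}
</code></pre></div><p>With the generic trait, a call like <code>set.add(5u8)</code> picks the impl from the argument type, whereas the associated-type version has exactly one <code>Item</code> per collection. Which of the two is preferable is the judgment call discussed above.</p>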
<p>Obviously this paper is pretty old (1997!), and a lot of new things in
Haskell have been developed since then (e.g., I think the paper
predates associated types in Haskell). I think this core tradeoff is
still relevant, however. Let me know though if you think I&rsquo;m out of
date and I need to read up on feature X which tries to address this
trade-off. (For example, is there any treatment of higher-kinded types
that adds the ability to constrain parameters in some way?)</p>
<h3 id="time-to-get-a-bit-more-formal">Time to get a bit more formal</h3>
<p>OK, I want to get a bit more formal in terms of how I am talking about
HKT. In particular, I want to talk more about what a <em>kind</em> is and why
we could call a type constructor like <code>List&lt;_&gt;</code> <em>higher</em>-kinded. The
idea is that just like types tell us what sort of value we have (e.g.,
<code>i32</code> vs <code>f32</code>), kinds tell us what sort of <em>generic parameter</em> we have.</p>
<p>In fact, Rust already has two kinds: lifetimes and types. Consider the
item <code>ListIter</code> that we saw earlier:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here we see that there are two parameters, <code>'iter</code> and <code>T</code>, and the
first one represents a lifetime and the second a type. Let&rsquo;s say that
<code>'iter</code> has the kind <code>lifetime</code> and <code>T</code> has the kind <code>type</code> (in
Haskell, people would write <code>type</code> as <code>*</code>).</p>
<p>Now what is the kind of <code>ListIter&lt;'foo, i32&gt;</code>? This is also a <code>type</code>.</p>
<p>So what is the kind of a type constructor like <code>ListIter&lt;'foo, _&gt;</code>?
This is something which, if you give it a type, you get a type. That
sounds like a function, right? Well, the idea is to write that kind as
<code>type -&gt; type</code>.</p>
<p>And so higher-kinded type parameters <em>are</em> kind of like functions,
except that instead of calling them at runtime (<code>foo(22)</code>), you
<em>apply</em> them to types (<code>Foo&lt;i32&gt;</code>). In general, when we can talk about
something &ldquo;callable&rdquo;, we tend to call it &ldquo;higher-&rdquo;, so in this case we
say &ldquo;higher-kinded&rdquo;.</p>
<p>You can also imagine higher-kinded type parameters that abstract over
lifetimes. We might write this like <code>ListIter&lt;'_, i32&gt;</code>, which would
correspond to the kind <code>lifetime -&gt; type</code>. If you had a parameter
<code>I&lt;'_&gt;</code>, then you could apply it like <code>I&lt;'foo&gt;</code>, and &ndash; assuming <code>I = ListIter&lt;'_, i32&gt;</code> &ndash; you would get <code>ListIter&lt;'foo, i32&gt;</code>.</p>
<p>Speaking more generally, we can say that the <em>kind</em> <code>K</code> of a type
parameter can fit this grammar:</p>
<pre tabindex="0"><code>K = type | lifetime | K -&gt; K
</code></pre><p>Note that this supports all kinds of crazy kinds, like <code>I&lt;_&lt;_&gt;&gt;</code>,
which would be <code>(type -&gt; type) -&gt; type</code>. This is like a <code>Foo</code> that is
not parameterized by another type, but rather by a type <em>constructor</em>,
so one would not write <code>Foo&lt;i32&gt;</code>, but rather <code>Foo&lt;Vec&lt;_&gt;&gt;</code>. Wow,
meta.</p>
<p>Note that everything here assumes that if you have a type constructor
<code>I</code> of kind <code>type -&gt; type</code>, we can apply <code>I</code> to any type. There&rsquo;s no
way to say &ldquo;types that are hashable&rdquo;. In later posts, I hope to dig
into this a bit more, and show that HRTB (and traits) can provide us a
means to express things like that.</p>
<h3 id="decidability-and-inference">Decidability and inference</h3>
<p>So you may have noticed that, in the previous paragraph, I was making
all kinds of analogies to higher-kinded types being like
functions. And certainly you can imagine defining &ldquo;general type
lambdas&rdquo;, so that if you have a type parameter of kind <code>type -&gt; type</code>,
you could supply <em>any</em> kind of function which, given one type, yields
another. But it turns out this is likely not what we want, for a
couple of reasons:</p>
<ol>
<li>It doesn&rsquo;t actually express what we wanted.</li>
<li>It makes inference impossible.</li>
</ol>
<p>To get some intuition here, let&rsquo;s go back to our first example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_hkt</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">I</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">I</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>Here, <code>I</code> is declared a parameter of kind <code>type -&gt; type</code>. Now remember
that our intention was to say that these two parameters were the same
&ldquo;sort&rdquo; of collection (e.g., we take/return a <code>Vec&lt;i32&gt;/Vec&lt;f32&gt;</code> or a
<code>List&lt;i32&gt;/List&lt;f32&gt;</code>, but not a <code>Vec&lt;i32&gt;/List&lt;f32&gt;</code>). If however <code>I</code>
can be <em>any</em> &ldquo;type lambda&rdquo;, then <code>I</code> could be a lambda that returns
<code>Vec&lt;i32&gt;</code> if given an <code>i32</code>, and <code>List&lt;f32&gt;</code> if given an <code>f32</code>. We
might imagine pseudo-code that uses <code>if</code> and talks about types, like
this:</p>
<pre tabindex="0"><code>type I&lt;T&gt; = if T == i32 { Vec&lt;i32&gt; } else { List&lt;f32&gt; };
</code></pre><p>At this point, if you&rsquo;ve been carefully reading along, this should be
striking a memory. This sounds a lot like our first attempt at family
traits from the <a href="https://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/">previous post</a>! Let&rsquo;s go back in time to that
first take on <code>floatify_family()</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_family</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">F</span>::<span class="n">Collection</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span>::<span class="n">Collection</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">F</span>: <span class="nc">CollectionFamily</span><span class="w">
</span></span></span></code></pre></div><p>Basically here the <code>F</code> is playing <em>exactly</em> this &ldquo;type lambda&rdquo; role.
<code>F::Collection&lt;T&gt;</code> is the same as <code>I&lt;T&gt;</code>. Moreover, using impls and
traits, we can write arbitrary, <a href="https://www.reddit.com/r/rust/comments/2o6yp8/brainfck_in_rusts_type_system_aka_type_system_is/">Turing-complete</a> functions on types!</p>
<p>(Note: it sounds like being Turing-complete is hard; it&rsquo;s not. It&rsquo;s
actually hard to <em>avoid</em> once you start adding in any reasonably
expressive system. You essentially have to add some special-cases and
limitations to do it.)</p>
<p>This implies that if we permit higher-kinded type parameters like <code>I</code>
to be mapped to just any old kind of &ldquo;type lambda&rdquo;, our inference is
going to get stuck. So whenever you called <code>floatify_hkt()</code>, you would
need to explicitly annotate the &ldquo;type lambda&rdquo; <code>I</code>. Note that this is
worse than something like <code>collect()</code>, where all we need to know is
what the return type is, and we can figure everything out. Here, even
if we know the argument/return types, we can&rsquo;t figure out the
<em>function</em> that maps between them, at least not uniquely.</p>
<p>As an analogy, it&rsquo;d be like if I told you &ldquo;ok, so <code>f(1) = 2</code> and <code>f(2) = 3</code>, what is the function <code>f</code>?&rdquo;. Naturally there is no unique
answer. You might think that the answer is <code>f(x) = 1 + x</code>, and that
does fit the data, but of course that&rsquo;s not the only answer. It could
also be <code>f(x) = min(x + 1, 10)</code>, and so forth.</p>
<h3 id="limiting-higher-kinded-types-via-currying-like-haskell">Limiting higher-kinded types via currying, like Haskell</h3>
<p>The way that Haskell solves this problem is by <strong>limiting</strong>
higher-kinded types. In particular, they say that a higher-kinded type
has to be (the equivalent of) a <code>struct</code> or <code>enum</code> name with some
<em>suffix</em> of parameters left blank.</p>
<p>So that means that if you have a kind like <code>type -&gt; type</code>, it could be
satisfied with <code>Vec&lt;_&gt;</code> or <code>Result&lt;i32, _&gt;</code>, but not <code>Result&lt;_, i32&gt;</code>
and certainly not some more complex function. It also means that if
you have aliases (like <code>type PairVec&lt;T&gt; = Vec&lt;(T, T)&gt;</code> in Rust), you
can&rsquo;t make an HKT from <code>PairVec&lt;_&gt;</code>.</p>
<p>This scheme has a lot of advantages! In particular, let&rsquo;s go back to
our type inference problem. As you recall, the fundamental kind of
constraint we end up with is type equalities. In that case, we wind up
knowing the &ldquo;inputs&rdquo; to a HKT and the &ldquo;output&rdquo;. So I might have
something like:</p>
<pre><code>?1&lt;?2&gt; = Result&lt;i32, u32&gt;
</code></pre>
<p>Since <code>?1</code> can&rsquo;t be just any function, I can uniquely determine that
<code>?1 = Result&lt;i32, _&gt;</code> and <code>?2 = u32</code>. There is just nothing else it
could be!</p>
<p>(This scheme is called <em>currying</em> in Haskell and it&rsquo;s actually really
quite elegant, at least in terms of how it fits into the whole
abstract language. It&rsquo;s basically a universal principle in Haskell
that any sort of &ldquo;function&rdquo; can be converted into a lambda by leaving
off a suffix of its parameters. I won&rsquo;t say more because (a)
converting the examples into Rust syntax doesn&rsquo;t really give you as
good a feeling for its elegance and (b) this post is long enough
without explaining Haskell too!)</p>
<p>In fact, we can go even further. Imagine that we have an equality like
this, where we don&rsquo;t really know much at all about either side:</p>
<pre><code>?1&lt;?2&gt; = ?3&lt;?4&gt;
</code></pre>
<p>Even here, we can make progress, because we can infer that <code>?1 = ?3</code>
and <code>?2 = ?4</code>. This is a pretty strong and useful property.</p>
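<p>To see why the suffix rule makes these equations easy, here is a tiny sketch of the solving step. The <code>Applied</code> struct and the <code>solve_suffix</code> function are hypothetical names invented for this illustration, and types are represented as plain strings; a real type checker would of course work on a proper type representation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">/// A type constructor applied to some arguments, e.g. `Result&lt;i32, u32&gt;`.
/// `args` may be shorter than the constructor's full parameter list, in
/// which case the remaining (suffix) parameters are still unapplied.
#[derive(Debug, Clone, PartialEq)]
struct Applied {
    head: &amp;'static str,
    args: Vec&lt;&amp;'static str&gt;,
}

/// Solve `?F&lt;?A&gt; = ty` under the suffix rule: `?A` must be the last
/// argument and `?F` is whatever remains. Returns `None` if `ty` has no
/// arguments to peel off.
fn solve_suffix(ty: &amp;Applied) -&gt; Option&lt;(Applied, &amp;'static str)&gt; {
    let (last, rest) = ty.args.split_last()?;
    let constructor = Applied { head: ty.head, args: rest.to_vec() };
    Some((constructor, *last))
}
</code></pre></div><p>Given <code>Result&lt;i32, u32&gt;</code>, the only possible answer is the constructor <code>Result</code> applied to <code>i32</code> (that is, <code>Result&lt;i32, _&gt;</code>) together with the argument <code>u32</code>. There is never a choice to make, which is why inference stays deterministic.</p>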
<h3 id="problems-with-currying-for-rust">Problems with currying for Rust</h3>
<p>So there are a couple of reasons that a currying approach wouldn&rsquo;t
really be a good fit for Rust. For one thing, it wouldn&rsquo;t fit the <code>&amp;</code>
&ldquo;type constructor&rdquo; very well. If you think of types like <code>&amp;'a T</code>, you
effectively have a type of kind <code>lifetime -&gt; type -&gt; type</code> (well, not
exactly; the type <code>T</code> must outlive <code>'a</code>, giving rise to the same
matter of constrained types I raised earlier, but this problem is not
unique to <code>&amp;</code> and applies to most any generic Rust type). Essentially,
give me a lifetime (<code>'a</code>) and a type (<code>T</code>) and I will give you a new
combined type <code>&amp;'a T</code>. OK, so far so good, but if we follow a
currying-based approach, then this means that you can partially apply
<code>&amp;</code> to a particular lifetime (<code>'a</code>), yielding a HKT like <code>type -&gt; type</code>. This is good for those cases where you wish to treat <code>&amp;'a T</code>
interchangeably with other pointer-like types, such as <code>Rc&lt;T&gt;</code>.</p>
<p>But then there are times like <code>Iterable</code>, where you might like to be
able to take a base type like <code>&amp;'a T</code> and plug in other lifetimes to
get <code>&amp;'b T</code>. In other words, you might want <code>lifetime -&gt; type</code>. But
using a Haskell-like currying approach you basically have to pick one
or the other.</p>
<p>Another problem with currying is that you always have to leave a
<em>suffix</em> of type parameters unapplied, and that is just (in practice)
unlikely to be a good choice in Rust. Imagine we wanted to use a
map-like type parameter <code>M&lt;_,_&gt;</code>, so that (say) we could take in a
<code>M&lt;i32, T&gt;</code> and convert it to a map <code>M&lt;f32, T&gt;</code> of the same basic
kind. Now consider the definition of <code>HashMap</code>, which actually has
three parameters (one of which is defaulted):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="n">V</span><span class="p">,</span><span class="w"> </span><span class="n">S</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">RandomState</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>We would have wanted <code>M = HashMap&lt;_, _, S&gt;</code>, but we can&rsquo;t do that,
because that&rsquo;s a <em>prefix</em> of the types we need, not a <em>suffix</em>.</p>
<p>One strategy that might work ok in practice is to say, in Rust, you
can name a HKT by putting an <code>_</code> on some <em>prefix</em> of the parameters
<em>for any given kind</em>. So e.g. we can do the following:</p>
<ul>
<li><code>&amp;'a T</code> yielding <code>type</code></li>
<li><code>&amp;'_ T</code> yielding <code>lifetime -&gt; type</code></li>
<li><code>&amp;'a _</code> yielding <code>type -&gt; type</code></li>
<li><code>Ref&lt;'_, T&gt;</code> yielding <code>lifetime -&gt; type</code></li>
<li><code>Ref&lt;'a, _&gt;</code> yielding <code>type -&gt; type</code></li>
<li><code>HashMap&lt;_, i32, S&gt;</code> yielding <code>type -&gt; type</code> (where the first <code>type</code> is the key)</li>
<li><code>HashMap&lt;_, _, S&gt;</code> yielding <code>type -&gt; type -&gt; type</code></li>
<li><code>Result&lt;_, Err&gt;</code> yielding <code>type -&gt; type</code></li>
</ul>
<p>but we could not do any of these, because in each case the <code>_</code> is not a prefix:</p>
<ul>
<li><code>Foo&lt;'a, '_, i32&gt;</code></li>
<li><code>HashMap&lt;i32, _, S&gt;</code></li>
</ul>
<p>Obviously it&rsquo;s unfortunate that the <code>_</code> would have to be a prefix, but
that&rsquo;s basically a necessary limitation to support type inference. If
you permitted <code>_</code> to appear anywhere, then only the most basic
constraints become solvable &ndash; essentially in <strong>practice</strong> you wind
up with a scenario where <strong>all type parameters</strong> must become <code>_</code>, and
partial application never works. To see what I mean, consider some
examples:</p>
<ul>
<li><code>?T&lt;?U&gt; = Rc&lt;i32&gt;</code>, solvable:
<ul>
<li>could be <code>?T = Rc&lt;_&gt;, ?U = i32</code></li>
<li>but note that this only works because <em>all</em> type parameters of <code>Rc</code> were made into <code>_</code></li>
</ul>
</li>
<li><code>?T&lt;?U&gt; = Result&lt;i32, u32&gt;</code>, unsolvable:
<ul>
<li>could be <code>?T = Result&lt;_, u32&gt;, ?U = i32</code></li>
<li>could be <code>?T = Result&lt;i32, _&gt;, ?U = u32</code></li>
</ul>
</li>
<li><code>?T&lt;?U, ?V&gt; = Result&lt;i32, u32&gt;</code>, solvable if we assume that ordering must be respected:
<ul>
<li>could be <code>?T = Result&lt;_, _&gt;, ?U = i32, ?V = u32</code></li>
<li>again, this only works because <em>all</em> type parameters of <code>Result</code> were made into <code>_</code></li>
</ul>
</li>
</ul>
<p>So, essentially, choosing a prefix (or suffix) is actually <em>more</em>
expressive in practice than allowing <code>_</code> to go anywhere, since the
latter would cripple inference and require manual type annotation.</p>
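<p>The ambiguity in the examples above can be checked mechanically. The following is a rough sketch, with a hypothetical helper <code>solve_anywhere</code> that works on type names as strings, which lists every way to solve <code>?T&lt;?U&gt; = ...</code> when a single <code>_</code> is allowed in any position:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">/// Enumerate every solution of `?T&lt;?U&gt; = head&lt;args...&gt;` when a single `_`
/// may replace any one argument. Each solution is `(?T, ?U)`, with `?T`
/// rendered as a string such as `Result&lt;_, u32&gt;`.
fn solve_anywhere(head: &amp;str, args: &amp;[&amp;str]) -&gt; Vec&lt;(String, String)&gt; {
    (0..args.len())
        .map(|hole| {
            // Rebuild the argument list with `_` in the chosen position.
            let params: Vec&lt;&amp;str&gt; = args
                .iter()
                .enumerate()
                .map(|(i, arg)| if i == hole { "_" } else { *arg })
                .collect();
            (
                format!("{}&lt;{}&gt;", head, params.join(", ")),
                args[hole].to_string(),
            )
        })
        .collect()
}
</code></pre></div><p>For <code>Rc</code> with the single argument <code>i32</code> this yields one candidate, but for <code>Result</code> with <code>i32</code> and <code>u32</code> it yields two, and nothing in the equation tells us which one was meant.</p>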
<p>So, what do you do if you&rsquo;d like to be able to put the <code>_</code> anywhere?
Say, because you want the choice of <code>Result&lt;_, E&gt;</code> or <code>Result&lt;T, _&gt;</code>? The answer is that
you build &ldquo;wrapper types&rdquo;, like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Unresult</span><span class="o">&lt;</span><span class="n">E</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">result</span>: <span class="nb">Result</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">E</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>(Note that this won&rsquo;t work with a plain type alias, as you can&rsquo;t
partially apply a type alias.  The wrapper works precisely because <code>Unresult&lt;X, Y&gt;</code> is a <em>distinct type</em> from <code>Result&lt;Y, X&gt;</code>, which is not the case
with a type alias.)</p>
<p>I find this kind of interesting because it starts to resemble the
&ldquo;dummy&rdquo; types that we made for families. But this is not really a
&ldquo;win&rdquo; for ATC or family traits in particular. After all, you only need
said dummy types when the default order isn&rsquo;t working for you; and, if
you wanted to make one collection type (like <code>List&lt;T&gt;</code>) participate in
two different collection families, you&rsquo;d need a wrapper there too.</p>
<h3 id="side-note-alternatives-to-currying">Side note: Alternatives to currying</h3>
<p>I am not sure of the full space of alternatives here.</p>
<p>For example, it may be possible to permit higher-kinded types to be
assigned to more complex functions, but only if the user provides
explicit type hints in those cases. This would be perhaps analogous to
higher-ranked types in Haskell, which sometimes require a certain
amount of type annotation since they can&rsquo;t be fully inferred in
general.</p>
<p>Another fruitful area to explore is the branch of logic programming,
where this sort of inference is referred to as <strong>higher-order
unification</strong> &ndash; basically, solving unification problems where you
have variables that are functions. Unsurprisingly, unrestricted
higher-order unification is a pretty thorny problem, lacking most of
the nice properties of first-order unification. For example, there can
be an infinite number of solutions, none of which is more general than
the other; in fact, in general, it&rsquo;s not even decidable whether there
<em>is</em> a solution or not!</p>
<p>Now, none of this means that there don&rsquo;t exist algorithms for solving
higher-order unification. In particular, there is a core solution
called Huet&rsquo;s algorithm; it&rsquo;s just that it is not guaranteed to
terminate and may generate an infinite number of
solutions. Nonetheless, in some settings it can work quite well.</p>
<p>There is also a <em>subset</em> of higher-order unification called
<strong>higher-order pattern matching</strong>. In this subset, if I understand
correctly, we can solve unification constraints with higher-kinded
variables, but only if they look like this:</p>
<pre><code>for&lt;T&gt; (?1&lt;T&gt; = U&lt;T&gt;)
</code></pre>
<p>The idea here is that we are constraining <code>?1&lt;T&gt;</code> to be equal to
<code>U&lt;T&gt;</code> <strong>no matter what <code>T</code> is</strong>. In this case, clearly, <code>?1</code> must be
equal to <code>U</code>.  Apparently, this subset appears often in higher-order
logic programming languages like Lambda Prolog, but sadly it doesn&rsquo;t
seem that relevant to Rust.</p>
<h3 id="conclusions">Conclusions</h3>
<p>This concludes our first little tour of what HKT is, and what it might
mean for Rust. Here is a little summary of some of the highlights:</p>
<ul>
<li>Higher-kinded types let you use a <em>type constructor</em> as a parameter;
so you might have a parameter declared like <code>I&lt;_&gt;</code> whose value is <code>Vec&lt;_&gt;</code>;
that is, the <code>Vec</code> type without specifying a particular kind of element.</li>
<li>Higher-ranked trait bounds (which Haskell doesn&rsquo;t offer, but which
are part of the <a href="https://github.com/rust-lang/rfcs/pull/1598">ATC RFC</a>) permit functions to declare
something like &ldquo;<code>I&lt;T&gt;</code> is a collection of <code>T</code> elements, regardless
of what <code>T</code> is&rdquo;.
<ul>
<li>Otherwise, you have to have a series of constraints like <code>I&lt;i32&gt;: Collection&lt;i32&gt;</code>,
<code>I&lt;f32&gt;: Collection&lt;f32&gt;</code>.</li>
<li>This can reveal implementation details you might prefer to hide.</li>
</ul>
</li>
<li>What Haskell <em>does</em> offer as an alternative is traits whose <code>Self</code> type is higher-kinded.
<ul>
<li>However, because HKT in Haskell do not permit where-clauses or conditions,
such a trait would not be usable for collections that impose limitations on
their element types (i.e., basically all collections):
<ul>
<li><code>BitSet</code> might require that the element is a <code>usize</code>;</li>
<li><code>HashSet&lt;T&gt;</code> requires <code>T: Hash</code>;</li>
<li><code>BTreeSet&lt;T&gt;</code> requires <code>T: Ord</code>;</li>
<li>heck, even <code>Vec&lt;T&gt;</code> requires <code>T: Sized</code> in Rust!</li>
</ul>
</li>
<li>Thus, a tradeoff is born between multi-parameter type classes, which permit such
conditions, and type classes based around higher-kinded types.</li>
<li>To be usable in Rust, we would have to extend the concept of HKT to include where clauses,
since almost <strong>all</strong> Rust types include some condition, even if only <code>T: Sized</code>.</li>
<li>Note that <em>collection families</em> naturally permit one to apply where-clauses and side conditions.</li>
</ul>
</li>
<li>Higher-kinded types in Haskell are limited to a &ldquo;curried type declaration&rdquo;:
<ul>
<li>This makes type inference tractable and feels natural in Haskell.</li>
<li>Exporting this scheme to Rust feels awkward.</li>
<li>One thing that might work is that one can omit a <em>prefix</em> of the type parameters
of any given kind.</li>
</ul>
</li>
</ul>
<p>OK, that&rsquo;s enough for one post! In the next post, I plan to tackle the
following question:</p>
<ul>
<li>What is the difference between an &ldquo;associated type constructor&rdquo; and
an &ldquo;associated HKT&rdquo;?
<ul>
<li><strong>What might it mean to unify those two worlds?</strong></li>
</ul>
</li>
</ul>
<h3 id="comments">Comments</h3>
<p>Please leave comments on
<a href="https://internals.rust-lang.org/t/blog-post-series-alternative-type-constructors-and-hkt/4300">this internals thread</a>.</p>
<h4 id="footnotes">Footnotes</h4>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I was not able to find an even-mildly-convincing variant on <code>floatify</code> for this. =)&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>This problem is not specific to types declared in fn bodies;
one can easily construct similar examples where it would be
necessary to make private structs public.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/atc" term="atc" label="ATC"/><category scheme="https://smallcultfollowing.com/babysteps/categories/hkt" term="hkt" label="HKT"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/></entry><entry><title type="html">Associated type constructors, part 2: family traits</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/11/03/associated-type-constructors-part-2-family-traits/</id><published>2016-11-03T00:00:00+00:00</published><updated>2016-11-03T00:00:00+00:00</updated><content type="html"><![CDATA[<p>Hello. This post is a continuation of my posts discussing the topic of
associated type constructors (ATC) and higher-kinded types (HKT):</p>
<ol>
<li><a href="https://smallcultfollowing.com/babysteps/blog/2016/11/02/associated-type-constructors-part-1-basic-concepts-and-introduction/">The first post</a> focused on introducing the basic idea of
ATC, as well as introducing some background material.</li>
<li>This post talks about some apparent limitations of associated type
constructors, and shows how we can overcome them by making use of a
design pattern that I call &ldquo;family traits&rdquo;. Along the way, we
introduce the term <strong>higher-kinded type</strong> for the first time, and
show (informally) that family traits are equally general.</li>
</ol>
<!-- more -->
<h3 id="the-limits-of-associated-type-constructors">The limits of associated type constructors</h3>
<p>OK, so in the last post we saw how we can use ATC to define a
<code>Collection</code> trait, and how to implement that trait for our sample
collection <code>List&lt;T&gt;</code>.  In particular, ATC let us express the return
type of the <code>iterator()</code> method as <code>Self::Iter&lt;'iter&gt;</code>, so that we can
incorporate the lifetime <code>'iter</code> of each particular iterator.</p>
<p>What I&rsquo;d like to do now is to go one step further &ndash; what if I wanted
to write a function that converts a collection of integers into a
collection of floats? Something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify</span><span class="o">&lt;</span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">I</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">I</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">F</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">floats</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">F</span>::<span class="n">empty</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">f</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">ints</span><span class="p">.</span><span class="n">iterate</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">floats</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">f</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f32</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">floats</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This code would work just fine, but it has some interesting properties
that we may not have expected. In particular, <code>floatify()</code> can convert
any collection of integers into <em>any</em> collection of floats, but those
collections can be of <strong>totally different types</strong>. For example, I
could convert from a <code>List&lt;i32&gt;</code> to a <code>Vec&lt;f32&gt;</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">List</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">y</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">floatify</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//     ^^^^^^^^ notice the type annotation
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="p">.</span><span class="n">iterate</span><span class="p">().</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is more flexible, which is good, but also has some downsides.
For example, that same flexibility can make type inference harder. To
see what I mean, imagine that I wanted to remove the <code>Vec&lt;f32&gt;</code> type
annotation from the variable <code>y</code>, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">List</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">floatify</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//  ^ error: type not constrained!
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="p">.</span><span class="n">iterate</span><span class="p">().</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This would not compile, because we don&rsquo;t have enough information to
figure out the type of <code>y</code>! In particular, we know that <code>y</code> is a
&ldquo;collection of <code>f32</code> values&rdquo;, but we don&rsquo;t know what <em>kind</em> of
collection. Is it a <code>Vec&lt;f32&gt;</code> or <code>List&lt;f32&gt;</code>? Obviously it makes a
difference to the semantics of our code, since vectors add items onto
the end, and lists add things onto the beginning, so the order of
iteration is going to be different (and, since these are floats and
hence <code>+</code> is not actually associative, that implies the <code>sum</code> may well
be different). So the compiler doesn&rsquo;t want to just <em>guess</em>.</p>
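<p>To make the point about ordering concrete, here is a small sketch that sums the same three <code>f32</code> values front-to-back and back-to-front, mimicking a vector-like and a list-like iteration order. The function names are made up for illustration and are not part of the <code>Collection</code> trait; the values are chosen so that the small one is absorbed by rounding in one order but not in the other:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Sum in insertion order, the way a `Vec`-like collection would iterate.
fn sum_in_order(values: &amp;[f32]) -&gt; f32 {
    values.iter().sum()
}

// Sum in reverse order, the way a prepend-only list would iterate.
fn sum_reversed(values: &amp;[f32]) -&gt; f32 {
    values.iter().rev().sum()
}
</code></pre></div>
<p>With the input <code>[1.0, 1.0e8, -1.0e8]</code>, the first function returns <code>0.0</code> (the <code>1.0</code> is lost when it is added to <code>1.0e8</code>), while the second returns <code>1.0</code>.</p>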
<p>So maybe we&rsquo;d like to say that <code>floatify</code> takes and returns a
collection of the same type. It turns out we can&rsquo;t do that with just
the <code>Collection</code> trait we&rsquo;ve seen so far. Essentially, the signature
that we would <em>want</em> is maybe something like this (ignoring the where
clauses for now):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_hkt</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">I</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">I</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">//                        ^^^^^^ wait up, what is `I` here?
</span></span></span></code></pre></div><p>But woah, what is this <code>I</code> thing here? It&rsquo;s not a type parameter in
the normal sense, since it doesn&rsquo;t represent a type like <code>Vec&lt;i32&gt;</code> or
<code>List&lt;i32&gt;</code>. Instead it represents a kind of &ldquo;partial type&rdquo;, like
<code>Vec</code> or <code>List</code>, where the element type is not yet specified. Or,
as type theorists like to call it, a &ldquo;higher-kinded type&rdquo; (HKT). I&rsquo;ll
get into why it&rsquo;s called that, and more about how such a thing might
work, in the next post. For this post, I want to focus on an
alternative solution, one that doesn&rsquo;t require HKT at all.</p>
<h3 id="introducing-type-families">Introducing type families</h3>
<p>So let&rsquo;s assume that type parameters still just represent plain old
types &ndash; in that case, is it possible to write a version of
<code>floatify()</code> that returns a collection of the same &ldquo;sort&rdquo; as its
input?</p>
<p>It turns out you can do it, but you need an extra trait. We already
saw the <code>Collection</code> trait before; we&rsquo;d want to add a second trait,
let&rsquo;s call it <code>CollectionFamily</code>, that lets us go from a &ldquo;collection
family&rdquo; (e.g., <code>Vec</code>) to a specific collection (e.g., <code>Vec&lt;T&gt;</code>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">CollectionFamily</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Member</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>A &ldquo;collection family&rdquo; corresponds to a &lsquo;family&rsquo; of collections, like
<code>Vec</code> or <code>List</code>. We&rsquo;re also going to need then some dummy types to use for implementing
this trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">VecFamily</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">CollectionFamily</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">VecFamily</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Member</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ListFamily</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">CollectionFamily</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ListFamily</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Member</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><em>Note:</em> While writing this post I realized that Haskell also has a
feature called &ldquo;associated type families&rdquo;. Those are certainly related
to the things I am talking about here, but I am not trying to model
that Haskell feature, and my use of the term &ldquo;family&rdquo; is independent.</p>
<h3 id="families-and-inference">Families and inference</h3>
<p>OK, so now we have the idea of a &ldquo;collection family&rdquo;. You might think
then that we can now rewrite <code>floatify</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_family</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">F</span>::<span class="n">Member</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span>::<span class="n">Member</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">F</span>: <span class="nc">CollectionFamily</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">floats</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">F</span>::<span class="n">Member</span>::<span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span>::<span class="n">empty</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">f</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">ints</span><span class="p">.</span><span class="n">iterate</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">floats</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">f</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f32</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">floats</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Whereas before the type parameters represented specific collection
types, now we take a type parameter <code>F</code> that represents an entire
<em>family</em> of collection types. Then we can use <code>F::Member&lt;i32&gt;</code> to
name &ldquo;the collection in the family <code>F</code> whose item type is <code>i32</code>&rdquo;.</p>
<p>This type signature for <code>floatify_family()</code> works, but let&rsquo;s see what
happens now for our caller:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">List</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">floatify_family</span>::<span class="o">&lt;</span><span class="n">ListFamily</span><span class="o">&gt;</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                        ^^^^^^^^^^ wait, what?
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="p">.</span><span class="n">iterate</span><span class="p">().</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>It turns out that there is good and bad news. The good news is that,
once we know the family, we can indeed infer the type of <code>y</code>. The bad
news is that, at least with the setup we have so far, we can&rsquo;t
actually infer the type of the family! That is, the
<code>floatify_family::&lt;ListFamily&gt;</code> annotation turns out to be required!
To see why, let&rsquo;s look again at the signature of <code>floatify_family()</code></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_family</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">F</span>::<span class="n">Member</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">F</span>::<span class="n">Member</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">F</span>: <span class="nc">CollectionFamily</span><span class="w">
</span></span></span></code></pre></div><p>As before, to infer the type of <code>F</code>, we are going to replace <code>F</code> with an
inference variable <code>?F</code>, and then do some unification. So we can see
that the type of the <code>ints</code> argument will be something like this (here
I am using the fully qualified notation to make everything explicit):</p>
<pre><code>&lt;?F as CollectionFamily&gt;::Member&lt;i32&gt;
</code></pre>
<p>We have to unify this with <code>List&lt;i32&gt;</code>. But this presents a bit of a
problem! Knowing the <em>value</em> of an associated type (<code>?F::Member&lt;i32&gt;</code>)
doesn&rsquo;t really let us figure out what impl that associated type came
from (i.e., what <code>?F</code> is). After all, there could be other impls that
specify the same <code>Member</code>.</p>
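<p>As a rough sketch of the setup so far, here is a version that should build on a compiler with generic associated types (stable since Rust 1.65), which are close to the ATC design discussed here. Several things in it are assumptions made for brevity rather than part of the proposal: <code>Collection</code> is cut down and returns a boxed iterator to keep lifetimes out of the picture, <code>Vec</code> stands in for <code>List</code>, and <code>floatify_vec</code> is a made-up wrapper that shows the explicit family annotation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Simplified stand-in for the `Collection` trait (assumption: boxed iterator).
trait Collection&lt;T&gt; {
    fn empty() -&gt; Self;
    fn add(&amp;mut self, value: T);
    fn iterate(&amp;self) -&gt; Box&lt;dyn Iterator&lt;Item = &amp;T&gt; + '_&gt;;
}

impl&lt;T&gt; Collection&lt;T&gt; for Vec&lt;T&gt; {
    fn empty() -&gt; Self {
        Vec::new()
    }
    fn add(&amp;mut self, value: T) {
        self.push(value);
    }
    fn iterate(&amp;self) -&gt; Box&lt;dyn Iterator&lt;Item = &amp;T&gt; + '_&gt; {
        Box::new(self.iter())
    }
}

trait CollectionFamily {
    type Member&lt;T&gt;: Collection&lt;T&gt;;
}

struct VecFamily;

impl CollectionFamily for VecFamily {
    type Member&lt;T&gt; = Vec&lt;T&gt;;
}

fn floatify_family&lt;F&gt;(ints: &amp;F::Member&lt;i32&gt;) -&gt; F::Member&lt;f32&gt;
where
    F: CollectionFamily,
{
    // Fully qualified: take the `f32` member of family `F`, then call `empty()`.
    let mut floats = &lt;F::Member&lt;f32&gt; as Collection&lt;f32&gt;&gt;::empty();
    for &amp;i in ints.iterate() {
        floats.add(i as f32);
    }
    floats
}

// The family is named explicitly here; it is not inferred from `ints`.
fn floatify_vec(ints: &amp;Vec&lt;i32&gt;) -&gt; Vec&lt;f32&gt; {
    floatify_family::&lt;VecFamily&gt;(ints)
}
</code></pre></div>
<p>Calling <code>floatify_vec(&amp;vec![1, 2, 3])</code> yields a <code>Vec&lt;f32&gt;</code> holding <code>1.0</code>, <code>2.0</code> and <code>3.0</code>.</p>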
<h3 id="linking-collections-and-families">Linking collections and families</h3>
<p>To make inference work, then, we really need a &ldquo;backlink&rdquo; from
<code>Collection</code> to <code>CollectionFamily</code>. This lets us go from a specific
collection type to its family:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Backlink to `Family`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Family</span>: <span class="nc">CollectionFamily</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// as before:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">empty</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iterate</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Iter</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">CollectionFamily</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Member</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">Family</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we could rewrite <code>floatify_family</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">floatify_family</span><span class="o">&lt;</span><span class="n">C</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ints</span>: <span class="kp">&amp;</span><span class="nc">C</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">C</span>::<span class="n">Family</span>::<span class="n">Member</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">C</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="c1">//    ^^^^^^^^^^^^^^^^^ another collection, in same family
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>This change will mean that we can write the call without any type
annotations:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">List</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">f32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">floatify_family</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//      ^^^^^^^^^^^^^^^ look ma, no annotations
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span><span class="p">.</span><span class="n">iterate</span><span class="p">().</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What will happen is that, at the call site, the inferencer will create
two type variables, <code>?C</code> and <code>?F</code>. From the argument types, we can
deduce that <code>?C = List&lt;i32&gt;</code>. Next, solving the constraint <code>?C: Collection&lt;i32, Family=?F&gt;</code> will allow us to deduce that <code>?F = ListFamily</code>. And hence we are all set.</p>
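<p>Here is a hedged sketch of the backlink pattern, again assuming a compiler with generic associated types. As before, the cut-down <code>Collection</code> trait with a boxed iterator, the use of <code>Vec</code> in place of <code>List</code>, and the helper name <code>sum_as_floats</code> are illustrative choices rather than part of the design above:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">trait CollectionFamily {
    // Every member points back at this family.
    type Member&lt;T&gt;: Collection&lt;T, Family = Self&gt;;
}

trait Collection&lt;T&gt; {
    // Backlink from a concrete collection to its family.
    type Family: CollectionFamily;

    fn empty() -&gt; Self;
    fn add(&amp;mut self, value: T);
    fn iterate(&amp;self) -&gt; Box&lt;dyn Iterator&lt;Item = &amp;T&gt; + '_&gt;;
}

struct VecFamily;

impl CollectionFamily for VecFamily {
    type Member&lt;T&gt; = Vec&lt;T&gt;;
}

impl&lt;T&gt; Collection&lt;T&gt; for Vec&lt;T&gt; {
    type Family = VecFamily;

    fn empty() -&gt; Self {
        Vec::new()
    }
    fn add(&amp;mut self, value: T) {
        self.push(value);
    }
    fn iterate(&amp;self) -&gt; Box&lt;dyn Iterator&lt;Item = &amp;T&gt; + '_&gt; {
        Box::new(self.iter())
    }
}

// The return type is "the `f32` collection in the same family as `C`".
fn floatify_family&lt;C&gt;(ints: &amp;C) -&gt; &lt;C::Family as CollectionFamily&gt;::Member&lt;f32&gt;
where
    C: Collection&lt;i32&gt;,
{
    let mut floats =
        &lt;&lt;C::Family as CollectionFamily&gt;::Member&lt;f32&gt; as Collection&lt;f32&gt;&gt;::empty();
    for &amp;i in ints.iterate() {
        floats.add(i as f32);
    }
    floats
}

fn sum_as_floats(x: &amp;Vec&lt;i32&gt;) -&gt; f32 {
    // No family annotation: `C` comes from `x`, the family from `C`.
    let y = floatify_family(x);
    let total: f32 = y.iterate().sum();
    total
}
</code></pre></div>
<p>The fully qualified paths are noisier than the <code>C::Family::Member&lt;f32&gt;</code> shorthand used above, but they leave no doubt about which trait each associated type belongs to.</p>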
<h3 id="side-note-extending-higher-ranked-trait-bounds">Side-note: extending higher-ranked trait bounds</h3>
<p>There&rsquo;s one part of <a href="https://github.com/rust-lang/rfcs/pull/1598">RFC 1598</a> that I haven&rsquo;t covered so far. I just
want to mention it in passing; it&rsquo;ll become a bit more prominent in
later articles in this series. The RFC includes a generalization of
Rust&rsquo;s <em>higher-ranked trait bounds</em> to support generalization over
types. This actually occurs quite implicitly and naturally. To see
what I mean, consider the <code>CollectionFamily</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">CollectionFamily</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Member</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//              ^^^^^^^^^^^^^ what does this bound apply to?
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In particular, consider the bound <code>Collection&lt;T&gt;</code> &ndash; this bound
applies to the type <code>Self::Member&lt;T&gt;</code>, but what is <code>T</code> here? The answer
is that <code>T</code> is a stand-in for &ldquo;any type&rdquo; (or, almost).</p>
<p>Currently, we have a notation for writing trait bounds that apply to
<em>any</em> lifetime. For example, <code>for&lt;'a&gt; T: Foo&lt;'a&gt;</code> means &ldquo;for any
lifetime <code>'a</code>, <code>T</code> implements <code>Foo&lt;'a&gt;</code>&rdquo;; you could also write <code>T: for&lt;'a&gt; Foo&lt;'a&gt;</code>, which is equivalent. This <code>'a</code> lifetime can also
appear as part of the type, so one might write <code>for&lt;'a&gt; &amp;'a T: Foo&lt;'a&gt;</code> (in this case, you can&rsquo;t move the <code>for&lt;'a&gt;</code> around, since it
brings the <code>'a</code> into scope).</p>
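<p>For reference, here is the existing lifetime-only form in a complete function. The name <code>trimmed_len</code> is made up for illustration; the point is that the caller&rsquo;s function has to work for a borrow of a local that only exists inside the callee, which is what the <code>for&lt;'a&gt;</code> expresses:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// `f` must work for every lifetime `'a`, including the lifetime of a
// local variable that only exists inside this function.
fn trimmed_len&lt;F&gt;(f: F) -&gt; usize
where
    F: for&lt;'a&gt; Fn(&amp;'a str) -&gt; &amp;'a str,
{
    let local = String::from("  padded  ");
    f(local.as_str()).len()
}
</code></pre></div>
<p>Passing <code>str::trim</code> as <code>f</code> satisfies the bound, and <code>trimmed_len(str::trim)</code> returns <code>6</code>.</p>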
<p>(There are actually lots of interesting implementation questions
raised by HRTB, some of which we haven&rsquo;t fully worked through. I&rsquo;ve
got another series of blog posts on those, but I&rsquo;m going to leave that
aside for now.)</p>
<p>Anyway, this <code>for&lt;&gt;</code> notation is just what we need to handle our <code>Member&lt;T&gt;</code>
type, except that we need it to apply to types. Basically we want a
bound like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">for</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">Member</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>Meaning in English, &ldquo;for any type <code>T</code>, <code>Self::Member&lt;T&gt;</code> implements the
trait <code>Collection&lt;T&gt;</code>&rdquo;. Or, more naturally, &ldquo;<code>Member&lt;T&gt;</code> is always a collection,
no matter what <code>T</code> is&rdquo;.</p>
<p>(This is a simplification. Really, <code>T</code> must meet <em>some</em> requirements
&ndash; for example, it likely must be <code>Sized</code>. This is precisely the stuff
I want to get into in a later post, since our current implementation
doesn&rsquo;t handle these kinds of requirements as gracefully as it
should/could.)</p>
<h3 id="families-vs-hkt">Families vs HKT</h3>
<p>It should be clear that the &ldquo;collection families&rdquo; I introduced in the
last section basically correspond to higher-kinded types, but made
more explicit. This shows that associated type constructors are indeed
a quite general tool. I am pretty sure that one can convert any
program using HKT to use associated type constructors, but of course
one must follow this family pattern.</p>
<p>One could view this as a problem; one could also view it as a plus.
After all, associated type constructors are a tiny delta on the
language we have today, and yet we gain the full power of
HKT. Basically, teaching ATC isn&rsquo;t much harder than teaching Rust
today, and then we can just add the &ldquo;design pattern&rdquo; of families on
top &ndash; this may well be less intimidating than teaching &ldquo;HKT&rdquo; itself.
Maybe.</p>
<p>One nice part about avoiding &ldquo;true HKT&rdquo; is that we get to sidestep
some of the thorny questions that it raises, in particular the
challenges that full HKT poses for inference. We&rsquo;ll come back to
those: it turns out that they are highly related to the problems we
had in families that prompted us to add a <code>Family</code> member to
<code>Collection</code>.</p>
<p>One big question, I think, is how often we would want to define these
sorts of &ldquo;family&rdquo; traits, and how it would <em>really</em> feel to use them
&ldquo;at scale&rdquo;. I can think of several places that families might make
sense. Let me just give a few examples of possible families.</p>
<h4 id="parameterizing-over-smart-pointers-and-thread-safety">Parameterizing over smart pointers and thread safety</h4>
<p>One thing I think people want to do from time to time is to
parameterize over <code>Rc</code> vs <code>Arc</code>. You might imagine having a
family like this for choosing between them:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">RefCountedFamily</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Ptr</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nc">RefCounted</span><span class="o">&lt;</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">Family</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">Self</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">new</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Ptr</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">RefCounted</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nc">Deref</span><span class="o">&lt;</span><span class="n">Target</span><span class="o">=</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Family</span>: <span class="nc">RefCountedFamily</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>An example that could benefit from this is persistent collections like
<a href="https://github.com/michaelwoerister/hamt-rs">mw&rsquo;s hamt-rs</a> library,
which currently hard-codes <code>Arc</code>.</p>
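<p>To make the family idea a bit more concrete, here is a rough sketch of how the two families might be implemented. It assumes a compiler that accepts generic associated types (the ATC feature this series is about); the marker types <code>RcFamily</code> and <code>ArcFamily</code> and the helper <code>shared_pair</code> are names I made up for illustration.</p>

```rust
use std::ops::Deref;
use std::rc::Rc;
use std::sync::Arc;

// The two traits from above, unchanged.
trait RefCountedFamily {
    type Ptr<T>: RefCounted<T, Family = Self>;

    fn new<T>(value: T) -> Self::Ptr<T>;
}

trait RefCounted<T>: Deref<Target = T> + Clone {
    type Family: RefCountedFamily;
}

// Hypothetical marker types: each one names a family, not a pointer.
struct RcFamily;
struct ArcFamily;

impl RefCountedFamily for RcFamily {
    type Ptr<T> = Rc<T>;

    fn new<T>(value: T) -> Rc<T> {
        Rc::new(value)
    }
}

impl<T> RefCounted<T> for Rc<T> {
    type Family = RcFamily;
}

impl RefCountedFamily for ArcFamily {
    type Ptr<T> = Arc<T>;

    fn new<T>(value: T) -> Arc<T> {
        Arc::new(value)
    }
}

impl<T> RefCounted<T> for Arc<T> {
    type Family = ArcFamily;
}

// Hypothetical helper: code written once, generic over the family.
// It allocates one value and returns two handles to it.
fn shared_pair<F: RefCountedFamily>(value: i32) -> (F::Ptr<i32>, F::Ptr<i32>) {
    let first = F::new(value);
    let second = first.clone();
    (first, second)
}
```

<p>A caller then picks the family at the use site, e.g. <code>shared_pair::&lt;RcFamily&gt;(5)</code> for single-threaded code or <code>shared_pair::&lt;ArcFamily&gt;(5)</code> when the handles need to cross threads.</p>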
<p>More generally, you might want to be able to map between
types like <code>Rc&lt;Cell&lt;usize&gt;&gt;</code> or <code>Rc&lt;RefCell&lt;T&gt;&gt;</code> vs <code>Arc&lt;AtomicUsize&gt;</code>
or <code>Arc&lt;RwLock&lt;T&gt;&gt;</code>; these are mostly equivalent, except that the
latter are thread-safe but more expensive.</p>
<h4 id="parameterizing-over-mutability">Parameterizing over mutability</h4>
<p>Another common thing is the need to be parameterized over <code>&amp;'a T</code> vs
<code>&amp;'a mut T</code>. Interestingly, I don&rsquo;t think that associated type
constructors (<em>or</em> HKT) really give us that! The problem is that
borrow expressions operate on <em>paths</em>, and we have no way to reify
that distinction right now. Basically you can&rsquo;t make methods that
model the <code>&amp;</code> operator; interestingly, this problem is also a
limitation for modeling garbage collection in Rust. I&rsquo;ll try to get
into this in one of the later posts in the series, but it&rsquo;s an
interesting shortcoming I hadn&rsquo;t realized till trying to write out
this post.</p>
<h3 id="conclusions">Conclusions</h3>
<p>OK, in this post we covered a design pattern I call &ldquo;family traits&rdquo;,
that uses ATC to model HKT:</p>
<ul>
<li>Our original <code>Collection</code> trait let you iterate over existing collections,
but it didn&rsquo;t let you convert between types of collections;
<ul>
<li>in other words, if I have a type like <code>C: Collection&lt;i32&gt;</code>,
I couldn&rsquo;t get a type <code>D</code> where <code>D: Collection&lt;u32&gt;</code> that is guaranteed
to be the same &ldquo;sort&rdquo; of collection.</li>
</ul>
</li>
<li>&ldquo;Higher-kinded types&rdquo; are basically a way to make this notion more
formal, and refer to an &ldquo;unapplied generic&rdquo; like <code>Vec</code> or <code>List</code>.</li>
<li>We can model this relationship with ATC by defining a type like <code>VecFamily</code> or
<code>ListFamily</code> that is also unapplied, and then defining a trait <code>CollectionFamily</code>.
<ul>
<li>For type inference reasons, we also need to be able to go from a
specific <code>Collection</code> type like <code>C</code> to its family (<code>C::Family</code>).</li>
</ul>
</li>
</ul>
<p>The next post will dig deeper into what higher-kinded types might look
like in Rust, and in particular we want to see if there&rsquo;s a way to
make them &ldquo;play nice&rdquo; with the <code>Collection&lt;T&gt;</code> trait we&rsquo;ve been
looking at.</p>
<h3 id="comments">Comments</h3>
<p>Please leave comments on
<a href="https://internals.rust-lang.org/t/blog-post-series-alternative-type-constructors-and-hkt/4300">this internals thread</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/atc" term="atc" label="ATC"/><category scheme="https://smallcultfollowing.com/babysteps/categories/hkt" term="hkt" label="HKT"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/></entry><entry><title type="html">Associated type constructors, part 1: basic concepts and introduction</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/11/02/associated-type-constructors-part-1-basic-concepts-and-introduction/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/11/02/associated-type-constructors-part-1-basic-concepts-and-introduction/</id><published>2016-11-02T00:00:00+00:00</published><updated>2016-11-02T00:00:00+00:00</updated><content type="html"><![CDATA[<p>So for the end of last week, I was at Rust Belt Rust. This was
awesome.  And not only because the speakers and attendees at Rust Belt
Rust were awesome, though they were. But also because it gave aturon,
withoutboats, and me a chance to talk over a lot of stuff in person. We
covered a lot of territory and so I wanted to do a series of blog
posts trying to write down some of the things we were thinking so as
to get other people&rsquo;s input.</p>
<p>The first topic I&rsquo;m going to focus on is <a href="https://github.com/rust-lang/rfcs/pull/1598">RFC 1598</a>, which is a
proposal by withoutboats to add <strong>associated-type constructors</strong> (ATC)
to the language. ATC makes it possible to have &ldquo;generic&rdquo; associated
types, which in turn means we can support important patterns like
collection and iterable traits.</p>
<p>ATC also (as we will see) potentially subsumes the idea of
<strong>higher-kinded types</strong>. A big focus of our conversation was on
elaborating a potential alternative design based on HKT, and trying to
see whether choosing to add ATC would lock us into a suboptimal path.</p>
<p>This is quite a big topic, so I&rsquo;m going to spread it out over many
posts. <strong>This first post will introduce the basic idea of associated
type constructors. It also gives various bits of background
information on Rust&rsquo;s trait system and how type inference works. A
certain familiarity with Rust is expected, but expertise should not be
necessary.</strong></p>
<p><strong>Aside:</strong> Now higher-kinded types especially are one of those PL
topics that <strong>sound</strong> forebodingly complex and kind of abstract (like
monads). But once you learn what it is, you realize it&rsquo;s actually
relevant to your life (unlike monads). So I hope to break it down in a
relatively simple way.</p>
<p>(Oh, and I&rsquo;m just trolling about monads. Sorry, couldn&rsquo;t resist. Don&rsquo;t
hate me. We&rsquo;ll actually be talking about monads &ndash; well, more about
functors &ndash; in a few posts down the line.)</p>
<!-- more -->
<h3 id="background-traits-and-associated-types">Background: traits and associated types</h3>
<p>Before I can get to <a href="https://github.com/rust-lang/rfcs/pull/1598">RFC 1598</a>, let me lay out a bit of
background. This post is going to be talking a lot about
traits. Traits are Rust&rsquo;s version of a generic interface. Naturally,
these traits can define a bunch of methods that are part of the
interface, but they can also define <strong>types</strong> that are part of the
interface. We call these <strong>associated types</strong>. So, for example,
consider the <code>Iterator</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This means that every implementation of <code>Iterator</code> must specify both a
<code>next()</code> method, which defines how we iterate, as well as the type
<code>Item</code>, which defines what kind of values this iterator produces.  The
two items are linked, since the return value of <code>next()</code> is
<code>Self::Item</code>.</p>
<p>The notation <code>Self::Item</code> means &ldquo;the <code>Item</code> defined in the impl for
the type <code>Self</code>&rdquo; &ndash; in other words, the <code>Item</code> type defined for this
iterator. This notation is actually shorthand for something more
explicit that spells out all the parts: <code>&lt;Self as Iterator&gt;::Item</code> &ndash;
here we are saying &ldquo;the <code>Item</code> type defined in the implementation of
<code>Iterator</code> for the type <code>Self</code>&rdquo;. (I prefer to call such paths &ldquo;fully
qualified&rdquo;, but in the past they have sometimes been called &ldquo;UFCS&rdquo; in
the Rust community; this stands for &ldquo;universal function call
syntax&rdquo;, which is a term borrowed from D, where it unfortunately means
something totally different.)</p>
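<p>As a small illustration of the two spellings, here is a sketch with two functions that differ only in how they name the associated type. The function names are hypothetical; in generic code the type parameter <code>I</code> plays the role that <code>Self</code> plays inside the trait.</p>

```rust
// Shorthand path: `I::Item`.
fn first_short<I: Iterator>(mut iter: I) -> Option<I::Item> {
    iter.next()
}

// Fully qualified path: `<I as Iterator>::Item`. This names exactly
// the same type as above, but spells out which trait `Item` comes from.
fn first_qualified<I: Iterator>(mut iter: I) -> Option<<I as Iterator>::Item> {
    iter.next()
}
```

<p>The longer form mostly earns its keep when a type implements several traits that each define an associated type with the same name, so that the shorthand would be ambiguous.</p>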
<p>So now we can use the iterator trait to write generic code. For
example, we could write a generic routine <code>position</code> that returns the
position of the first item equal to a given value, if any.  I&rsquo;m going to write this code using a
<a href="https://doc.rust-lang.org/book/if-let.html#while-let"><code>while let</code></a>
loop instead of a <code>for</code> loop, so as to make the iterator protocol more
explicit:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">position</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">iterator</span>: <span class="nc">I</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">I</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">I</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w"> </span><span class="n">I</span>::<span class="n">Item</span>: <span class="nb">Eq</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iterator</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">return</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">index</span><span class="p">);</span><span class="w"> </span><span class="c1">// found it!
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">index</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">None</span><span class="w"> </span><span class="c1">// did not find it
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Take a look at the types in the signature there. The first argument,
<code>iterator</code>, is of type <code>I</code>, which is a generic type parameter; the
where clause also declares that <code>I: Iterator</code>. So basically we just
know that <code>iterator</code>&rsquo;s type is &ldquo;some kind of iterator&rdquo;. The second
argument, <code>value</code>, has the type <code>I::Item</code> &ndash; this is also a kind of
generic type. We&rsquo;re saying that <code>value</code> is &ldquo;whatever kind of item
<code>I</code> produces&rdquo;. We could also write that in a slightly different
way, using two generic parameters:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">position</span><span class="o">&lt;</span><span class="n">I</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">iterator</span>: <span class="nc">I</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">I</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>: <span class="nb">Eq</span>
</span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here the <code>where</code> clause states that <code>I: Iterator&lt;Item=T&gt;</code>. This
means &ldquo;<code>I</code> is some sort of iterator producing values of type
<code>T</code>&rdquo;.</p>
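<p>For completeness, here is the two-parameter version written out in full so that it can be called. The body is simply the loop from the first version; nothing else is assumed.</p>

```rust
fn position<I, T>(mut iterator: I, value: T) -> Option<usize>
    where I: Iterator<Item = T>, T: Eq
{
    let mut index = 0;
    // Drive the iterator by hand, one `next()` call per step.
    while let Some(v) = iterator.next() {
        if value == v {
            return Some(index); // found it!
        }
        index += 1;
    }
    None // did not find it
}
```

<p>Calling <code>position(vec![10, 20, 30].into_iter(), 20)</code> yields <code>Some(1)</code>, and searching the same vector for <code>99</code> yields <code>None</code>. Note that the compiler infers both <code>I</code> and <code>T</code> from the arguments.</p>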
<h3 id="running-example-linked-list-and-iterator">Running example: linked list and iterator</h3>
<p>OK, let&rsquo;s work through an example that I can use throughout the post.
We&rsquo;ll start by defining a simple collection type, <code>List&lt;T&gt;</code>, that is a
kind of linked list:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="sd">/// Very simple linked list. If `cell` is `None`,
</span></span></span><span class="line"><span class="cl"><span class="sd">/// the list is empty.
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">cell</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="n">ListCell</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="sd">/// A single cell in a non-empty list. Stores one
</span></span></span><span class="line"><span class="cl"><span class="sd">/// value and then another list.
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ListCell</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="nc">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next</span>: <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We can define some customary methods on this list:</p>
<ul>
<li><code>new()</code> &ndash; returns an empty list;</li>
<li><code>prepend()</code> &ndash; insert a value on the front of the list, which is
usually best when working with singly linked lists with no
tail pointer;</li>
<li><code>iter()</code> &ndash; creates an iterator that yields up
<a href="http://intorust.com/tutorial/shared-borrows/">shared references</a> to
the items in the list.</li>
</ul>
<p>Here are some example implementations of those methods:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">new</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">List</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">cell</span>: <span class="nb">None</span> <span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">prepend</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// get ahold of the current head of the list, if any
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">old_head</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">cell</span><span class="p">.</span><span class="n">take</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Create a new cell to serve as the new head of the list,
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// and then store it in `self.cell`.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">cell</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ListCell</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">value</span>: <span class="nc">value</span><span class="p">,</span><span class="w"> </span><span class="n">next</span>: <span class="nc">List</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">cell</span>: <span class="nc">old_head</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">cell</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="nb">Box</span>::<span class="n">new</span><span class="p">(</span><span class="n">cell</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">ListIter</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">cursor</span>: <span class="nc">self</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Let&rsquo;s look more at this last method, and in particular let&rsquo;s look at
how we can define the iterator type <code>ListIter</code> (by the way, if you&rsquo;d
like to read up more on iterators and how they work, you might enjoy
<a href="https://smallcultfollowing.com/babysteps/blog/2016/02/19/parallel-iterators-part-1-foundations/">this old blog post of mine</a>, which walks through several
different kinds of iterators in more detail). The <code>ListIter</code> iterator
will basically hold a reference to a <code>List&lt;T&gt;</code>. At each step, if the
list is non-empty, it will return a reference to the <code>value</code> field and
then update the cursor to the next cell. That struct might look
something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="sd">/// Iterator over linked lists.
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">cursor</span>: <span class="kp">&amp;</span><span class="na">&#39;iter</span> <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>'iter</code> lifetime here is the lifetime of the reference to our
list.  I called it <code>'iter</code> because the idea is that it lives as long
as the iteration is still ongoing (after that, we don&rsquo;t need it
anymore). Anyway, then we can implement the iterator <em>trait</em> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// If the list is non-empty, borrow a reference
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// to the cell (`cell`).
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="k">ref</span><span class="w"> </span><span class="n">cell</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">cursor</span><span class="p">.</span><span class="n">cell</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// Point the cursor at the next cell.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">cursor</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="n">cell</span><span class="p">.</span><span class="n">next</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// Return reference to the value in the
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// current cell.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="o">&amp;</span><span class="n">cell</span><span class="p">.</span><span class="n">value</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// List is empty, return `None`.
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here you see that the impl specifies the type <code>Item</code> to be <code>&amp;'iter T</code>. This is sort of interesting, because, in a sense, it&rsquo;s not really
telling us what the type is, since we don&rsquo;t yet know what lifetime
<code>'iter</code> is nor what type <code>T</code> is (it&rsquo;ll depend on what type of values
are in the list, of course). But there is a key point here &ndash; even
though the impl is generic, we know that given any particular type
<code>ListIter&lt;'a, Foo&gt;</code>, there is exactly one associated <code>Item</code> type (in
this case, <code>&amp;'a Foo</code>).</p>
<h3 id="background-the-role-of-type-inference">Background: The role of type inference</h3>
<p>Now that we&rsquo;ve seen the <code>List</code> example, I want to briefly go over the
role of type inference in doing trait matching. This will be very
important when we talk later about higher-kinded types. Imagine that I
have some code that uses a list like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">list</span><span class="p">(</span><span class="n">list</span>: <span class="kp">&amp;</span><span class="nc">List</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">iter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">list</span><span class="p">.</span><span class="n">iter</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iter</span><span class="p">.</span><span class="n">next</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So how does the compiler infer the type of the variable <code>value</code>? The
way that this works is by searching the declared impls. In particular,
in the call <code>iter.next()</code>, we know that the type of <code>iter</code> is
<code>ListIter&lt;'foo, u32&gt;</code> (for some lifetime <code>'foo</code>). We also know that
the method <code>next()</code> is part of the trait <code>Iterator</code> (actually,
figuring this out is a big job in and of itself, but I&rsquo;m going to
ignore that part of it for this post and just assume it is given). So
that tells us that we have to go searching for the <code>Iterator</code> impl
that applies to <code>ListIter</code>.</p>
<p>We do this, basically, by iterating over all the impls that we see and
trying to match up the types with the one we are looking for.  Eventually
we will come to the <code>ListIter</code> impl we saw earlier; it looks like
this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So how do we relate these generic impl parameters (<code>'iter</code>, <code>T</code>) to
the type we have at hand <code>ListIter&lt;'foo, u32&gt;</code>? We do this by
replacing those parameters with &ldquo;inference variables&rdquo;, which I will
denote with a leading <code>?</code> &ndash; lifetime variables will be lower-case,
type variables upper-case. So that means that the impl type looks
something like <code>ListIter&lt;?iter, ?T&gt;</code>. We then try to figure out what
values of those variables will make the two types the same. In this
case, <code>?iter</code> will map to <code>'foo</code> and <code>?T</code> will map to <code>u32</code>.</p>
<p>Once we know how to map <code>?iter</code> and <code>?T</code>, we can look at the actual
signature of <code>next()</code> as declared in the impl and apply that same mapping:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Signature as declared, written in a more explicit style:
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="bp">self</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Signature with mapping applied
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="bp">self</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">ListIter</span><span class="o">&lt;</span><span class="na">&#39;foo</span><span class="p">,</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="na">&#39;foo</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>Now we can see that the type of <code>value</code> is the (mapped) return type of
this signature, and hence that it must be <code>Option&lt;&amp;'foo u32&gt;</code>. Very
good.</p>
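<p>To make this concrete, here is a minimal, self-contained sketch of the list and its iterator, with the inferred type of <code>value</code> spelled out as an annotation. The field layout (<code>cell</code>, <code>next</code>, <code>value</code>) and the helper function <code>first</code> are assumptions made for illustration; only the shape of the <code>Iterator</code> impl comes from the text above.</p>

```rust
// Illustrative sketch: the struct layout below is an assumption, chosen so
// that the `next()` method from the post type-checks as written.
pub struct List<T> {
    cell: Option<Box<ListCell<T>>>,
}

struct ListCell<T> {
    value: T,
    next: List<T>,
}

pub struct ListIter<'iter, T> {
    cursor: &'iter List<T>,
}

impl<T> List<T> {
    pub fn new() -> List<T> {
        List { cell: None }
    }

    /// Push `value` onto the front of the list.
    pub fn prepend(&mut self, value: T) {
        let old = std::mem::replace(self, List::new());
        self.cell = Some(Box::new(ListCell { value, next: old }));
    }

    pub fn iter<'iter>(&'iter self) -> ListIter<'iter, T> {
        ListIter { cursor: self }
    }
}

impl<'iter, T> Iterator for ListIter<'iter, T> {
    type Item = &'iter T;

    fn next(&mut self) -> Option<&'iter T> {
        if let Some(ref cell) = self.cursor.cell {
            // Advance the cursor, then hand out the current value.
            self.cursor = &cell.next;
            Some(&cell.value)
        } else {
            None
        }
    }
}

/// Hypothetical helper: returns a copy of the first element, if any.
pub fn first(list: &List<u32>) -> Option<u32> {
    let mut iter = list.iter();
    // The annotation is optional; it only spells out what inference
    // derives from the impl: `Option<&'foo u32>` for some lifetime `'foo`.
    let value: Option<&u32> = iter.next();
    value.copied()
}
```

<p>If the annotation on <code>value</code> is changed to, say, <code>Option&lt;u32&gt;</code>, the compiler reports a type mismatch, which is one way to check what inference came up with.</p>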
<p>Some key points here:</p>
<ul>
<li>When doing trait selection, we replace the generic parameters
on the impl (e.g., <code>T</code>, <code>'iter</code>) with variables (<code>?T</code>, <code>?iter</code>).</li>
<li>We use unification to then figure out what those variables must be.</li>
</ul>
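<p>The matching step can be sketched as a toy unifier. This is not how the compiler is implemented; the <code>Ty</code> enum, the <code>unify</code> function, and the helpers <code>var</code> and <code>ctor</code> are hypothetical names invented for this illustration, and lifetimes are simply modeled as constructors with no arguments.</p>

```rust
use std::collections::HashMap;

/// A toy type grammar (hypothetical; far simpler than the real compiler's).
#[derive(Clone, Debug, PartialEq)]
pub enum Ty {
    /// An inference variable such as `?T` or `?iter`.
    Var(String),
    /// A named constructor applied to arguments, e.g. `ListIter<'foo, u32>`.
    /// Leaf types and lifetimes are constructors with no arguments.
    Ctor(String, Vec<Ty>),
}

pub fn var(name: &str) -> Ty {
    Ty::Var(name.to_string())
}

pub fn ctor(name: &str, args: Vec<Ty>) -> Ty {
    Ty::Ctor(name.to_string(), args)
}

/// Try to make `pattern` (which may contain variables) equal to `target`
/// (assumed to contain none), recording the bindings in `subst`.
pub fn unify(pattern: &Ty, target: &Ty, subst: &mut HashMap<String, Ty>) -> bool {
    match (pattern, target) {
        (Ty::Var(name), _) => match subst.get(name) {
            // Already bound: the new use must agree with the old binding.
            Some(bound) => bound == target,
            None => {
                subst.insert(name.clone(), target.clone());
                true
            }
        },
        (Ty::Ctor(n1, a1), Ty::Ctor(n2, a2)) => {
            // Same constructor, same arity, and the arguments unify pairwise.
            n1 == n2
                && a1.len() == a2.len()
                && a1.iter().zip(a2).all(|(p, t)| unify(p, t, subst))
        }
        // The target is assumed variable-free, so this case never matches.
        (Ty::Ctor(..), Ty::Var(_)) => false,
    }
}
```

<p>Unifying <code>ListIter&lt;?iter, ?T&gt;</code> against <code>ListIter&lt;'foo, u32&gt;</code> with this sketch binds <code>?iter</code> to <code>'foo</code> and <code>?T</code> to <code>u32</code>, while unifying it against an unrelated type such as <code>Vec&lt;u32&gt;</code> fails, so that impl would be skipped.</p>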
<h3 id="associated-type-constructors-the-iterable-trait">Associated type constructors: the iterable trait</h3>
<p>OK, so far we&rsquo;ve seen that we can define an <code>Iterator</code> trait that lets
us operate generically over iterators like <code>ListIter&lt;'iter, T&gt;</code>. That&rsquo;s very useful, but you might be wondering if it&rsquo;s possible
to define a <code>Collection</code> trait that lets us operate generically over
collections, like <code>List&lt;T&gt;</code>. Perhaps something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Collection trait, take 1.
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// create an empty collection of this type:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">empty</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// add `value` to this collection in some way:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// iterate over this collection:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iterate</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Iter</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// the type of an iterator for this collection (e.g., `ListIter`)
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we try to write an impl of this collection for <code>List&lt;T&gt;</code>, we will
find that it <em>almost</em> works, but not quite. Let&rsquo;s give it a try!</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">empty</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">List</span>::<span class="n">new</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">prepend</span><span class="p">(</span><span class="n">value</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iterate</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                   ^^^^^ oh, wait, this is not in scope!
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Everything seems to be going great until we get to the last item, the
associated type <code>Iter</code>. Then we see that we can&rsquo;t actually write out
the full type &ndash; that&rsquo;s because the full type needs to talk about the
lifetime <code>'iter</code> of the iteration, and that is not in scope at this
point. Remember that each call to <code>iterate()</code> will require a distinct
lifetime <code>'iter</code>.</p>
<p>This shows that in fact modeling <em>collections</em> is actually harder than
modeling <em>iterators</em>. Recall that, with iterators, we said that once
we know the type of an iterator, we know everything we need to know to
figure out the type of items that iterator produces. But with
<em>collections</em>, knowing the collection type (<code>List&lt;T&gt;</code>) does <strong>not</strong>
tell us everything we need to know to get the type of an iterator
(<code>ListIter&lt;'iter, T&gt;</code>).</p>
<p><a href="https://github.com/rust-lang/rfcs/pull/1598">RFC 1598</a> proposes to solve this problem by making it possible to
have not only <em>associated types</em> but associated type <strong>constructors</strong>.
Basically, associated types can themselves have generic type
parameters:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Collection trait, take 2, using RFC 1598.
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// as before
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">empty</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Here, we use associated type constructors:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iterate</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, writing the impl of <code>Collection</code> for <code>List</code> becomes fairly
straightforward. In fact, the only difference is the definition of the
type <code>Iter</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">List</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// same as above
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ListIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//        ^^^^^ brings `&#39;iter` into scope
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We could also imagine writing impls for other types, like <code>Vec&lt;T&gt;</code>
in the standard library:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">slice</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Collection</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">empty</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[]</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">value</span>: <span class="nc">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">value</span><span class="p">);</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iterate</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">slice</span>::<span class="n">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">slice</span>::<span class="n">Iter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="writing-code-that-is-generic-over-collections">Writing code that is generic over collections</h3>
<p>Now that we have a collection trait, we can write code that works
generically over collections. That&rsquo;s pretty nifty. For example, this
function takes in a collection of floating point numbers and returns
to you another collection with the same numbers, but rounded down to
the nearest integer:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">round_all</span><span class="o">&lt;</span><span class="n">C</span><span class="o">&gt;</span><span class="p">(</span><span class="n">collection</span>: <span class="kp">&amp;</span><span class="nc">C</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">C</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">C</span>: <span class="nc">Collection</span><span class="o">&lt;</span><span class="kt">f32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">rounded</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">C</span>::<span class="n">empty</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">f</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">collection</span><span class="p">.</span><span class="n">iterate</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">rounded</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">floor</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">rounded</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="conclusion">Conclusion</h3>
<p>That&rsquo;s it for today. Let&rsquo;s review what we covered thus far:</p>
<ul>
<li>Traits today can define <strong>associated types</strong>;
<ul>
<li>but, this type cannot make use of any types or lifetimes that aren&rsquo;t
part of the implementing type</li>
</ul>
</li>
<li>Whenever you have something with <strong>generic parameters</strong>, like an <code>impl</code>, <code>fn</code>, or <code>struct</code>,
inference is used to determine the value of those parameters;
<ul>
<li>this means that if you try to extend the sorts of things that a generic parameter can
be used to represent (such as permitting things that are generic over constants), you
have to think about how it will interact with inference.</li>
</ul>
</li>
<li>If you have a collection type like <code>List&lt;T&gt;</code>, the iterator usually
includes a lifetime <code>'iter</code> that is not part of the original type (<code>ListIter&lt;'iter, T&gt;</code>);
<ul>
<li>therefore, you cannot model a <code>Collection</code> trait today in Rust, at least not
in a nice way.
<ul>
<li>There are some tricks I didn&rsquo;t cover. =)</li>
</ul>
</li>
</ul>
</li>
<li><strong>Associated type constructors</strong> are basically just &ldquo;generic&rdquo; associated types;
<ul>
<li>this is great for modeling <code>Collection</code>.</li>
</ul>
</li>
</ul>
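<p>For readers who want to try this out: the feature proposed in RFC 1598 was later stabilized as generic associated types in Rust 1.65, so a version of the <code>Collection</code> trait can be compiled today. The sketch below is one possible spelling, not the RFC&rsquo;s exact text. It assumes the iterator yields <code>&amp;'iter T</code>, and it adds the <code>where Self: 'iter, T: 'iter</code> clauses so that the compiler can see that the borrowed data outlives the iteration. Only the <code>Vec&lt;T&gt;</code> impl is shown.</p>

```rust
// A sketch of the `Collection` trait using generic associated types.
// The `where` clauses are an addition relative to the post.
pub trait Collection<T> {
    // Create an empty collection of this type.
    fn empty() -> Self;

    // Add `value` to this collection in some way.
    fn add(&mut self, value: T);

    // Iterate over this collection by reference.
    fn iterate<'iter>(&'iter self) -> Self::Iter<'iter>
    where
        T: 'iter;

    // The iterator type, parameterized by the lifetime of the borrow.
    type Iter<'iter>: Iterator<Item = &'iter T>
    where
        Self: 'iter,
        T: 'iter;
}

impl<T> Collection<T> for Vec<T> {
    fn empty() -> Self {
        Vec::new()
    }

    fn add(&mut self, value: T) {
        self.push(value);
    }

    fn iterate<'iter>(&'iter self) -> Self::Iter<'iter>
    where
        T: 'iter,
    {
        self.iter()
    }

    type Iter<'iter> = std::slice::Iter<'iter, T>
    where
        Self: 'iter,
        T: 'iter;
}

// Generic over any collection of `f32`: rounds every element down.
pub fn round_all<C>(collection: &C) -> C
where
    C: Collection<f32>,
{
    let mut rounded = C::empty();
    for &f in collection.iterate() {
        rounded.add(f.floor());
    }
    rounded
}
```

<p>Calling <code>round_all(&amp;vec![1.5, 2.25, -0.5])</code> with this sketch produces a new <code>Vec&lt;f32&gt;</code> containing <code>1.0</code>, <code>2.0</code>, and <code>-1.0</code>.</p>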
<h3 id="comments">Comments</h3>
<p>Please leave comments on
<a href="https://internals.rust-lang.org/t/blog-post-series-alternative-type-constructors-and-hkt/4300">this internals thread</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/atc" term="atc" label="ATC"/><category scheme="https://smallcultfollowing.com/babysteps/categories/hkt" term="hkt" label="HKT"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/></entry><entry><title type="html">Switching to Jekyll</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/11/01/switching-to-jekyll/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/11/01/switching-to-jekyll/</id><published>2016-11-01T00:00:00+00:00</published><updated>2016-11-01T12:27:00-04:00</updated><content type="html"><![CDATA[<p>If you visit the site today, you&rsquo;ll notice it looks quite a bit
different.  I&rsquo;ve decided to switch from my old antiquated Octopress to
a plain Jekyll-based one. The most immediate effect of this is that Rust code
highlighting looks much better, and I get access to modern
Github-flavored markdown. =) Since I understand plain Jekyll a bit
more, I&rsquo;ll hopefully also be able to customize the appearance somewhat
&ndash; but for now I&rsquo;m just going with the basic theme.</p>
<p>If I&rsquo;ve done everything right &ndash; unlikely &ndash; then all the old links
will still work. Let me know if you notice anything amiss!</p>
]]></content></entry><entry><title type="html">Supporting blanket impls in specialization</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/10/24/supporting-blanket-impls-in-specialization/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/10/24/supporting-blanket-impls-in-specialization/</id><published>2016-10-24T00:00:00+00:00</published><updated>2016-10-24T13:42:24-04:00</updated><content type="html"><![CDATA[<p>In my <a href="https://smallcultfollowing.com/babysteps/blog/2016/09/29/distinguishing-reuse-from-override/">previous post</a>, I talked about how we can separate out
specialization into two distinct concepts: <strong>reuse</strong> and <strong>override</strong>.
Doing so makes sense because the conditions that make reuse possible are
more stringent than those that make override possible. <strong>In this post,
I want to extend this idea to talk about a new rule for specialization
that allows overriding in more cases.</strong> These rules are a big enabler
for specialization, allowing it to accommodate many use cases that we
couldn&rsquo;t handle before. In particular, they enable us to add blanket
impls like <code>impl&lt;T: Copy&gt; Clone for T</code> in a backwards compatible
fashion, though only under certain conditions.</p>
<!-- more -->
<h3 id="revised-algorithm">Revised algorithm</h3>
<p>The key idea in this blog post is to change the rules for when some
impl I specializes another impl J. Instead of basing the rules on
&ldquo;subsets of types&rdquo;, I propose a two-tiered rule. Let me outline it
first and then I will go into more detail afterwards.</p>
<ol>
<li>First, <strong>impls with more specific types specialize other impls</strong>
(ignoring where clauses altogether).</li>
</ol>
<ul>
<li>So, for example, if impl I is <code>impl&lt;T: Clone&gt; Clone for Option&lt;T&gt;</code>, and impl J is <code>impl&lt;U: Copy&gt; Clone for U</code>, then I will
be used in preference to J, at least for those types where they
intersect (e.g., <code>Option&lt;i32&gt;</code>). This is because <code>Option&lt;T&gt;</code> is
more specific than <code>U</code>.
<ul>
<li>For types where they do not intersect (e.g., <code>i32</code> or <code>Option&lt;String&gt;</code>),
then only one impl is used.</li>
<li>Note that the where clauses like <code>T: Clone</code> and <code>U: Copy</code> don&rsquo;t matter
at all for this test.</li>
</ul>
</li>
</ul>
<ol start="2">
<li>However, <strong>reuse is only allowed if the full subset conditions are
met</strong>.</li>
</ol>
<ul>
<li>So, in our example, impl I is not a full subset of impl J, because
of types like <code>Option&lt;String&gt;</code>. This means that impl I could not
reuse items from impl J (and hence that all items in impl J must
be declared default).</li>
</ul>
<ol start="3">
<li>If the impl types are equally generic, then <strong>impls with more specific where clauses
specialize other impls</strong>.</li>
</ol>
<ul>
<li>So, for example, if impl I is <code>impl&lt;T: Debug&gt; Parse for T</code> and
impl J is <code>impl&lt;T&gt; Parse for T</code>, then impl I is used in preference
to impl J where possible. In particular, types that implement
<code>Debug</code> will prefer impl I.</li>
</ul>
<p>Another way to express the rule is to say that impls can specialize one
another in two ways:</p>
<ul>
<li>if the <strong>types matched by one impl are a subset of those matched by the other</strong>,
ignoring where clauses altogether;</li>
<li>otherwise, if the types matched by the two impls are the same and
the <strong>where clauses of one impl are more selective</strong>.</li>
</ul>
<p>Interestingly, and I&rsquo;ll go into this a bit more later, this rule is
not necessarily an <em>alternative</em> to the intersection impls I discussed
at first. In fact, the two can be used together, and complement each
other quite well.</p>
<h3 id="some-examples">Some examples</h3>
<p>Let&rsquo;s revisit some of the examples we&rsquo;ve been working through and see
how the rule would apply. The first three examples illustrate the
first three clauses. Then I&rsquo;ll show some other interesting examples
that highlight various other facets and interactions of the rules.</p>
<h4 id="blanket-impl-of-clone-for-copy-types">Blanket impl of Clone for Copy types</h4>
<p>First, we started out considering the case of trying to add a blanket
impl of <code>Clone</code> for all <code>Copy</code> types:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="bp">self</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We were concerned before that there are existing impls of <code>Clone</code> that
will partially overlap with this new blanket impl, but which will not
be full subsets of it, and which would therefore not be considered
specializations. For example, an impl for the <code>Option</code> type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="bp">self</span><span class="p">.</span><span class="n">as_ref</span><span class="p">().</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">c</span><span class="o">|</span><span class="w"> </span><span class="n">c</span><span class="p">.</span><span class="n">clone</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under these rules, this is no problem: the <code>Option</code> impl will take
precedence over the blanket impl, because its types are more specific.</p>
<p><strong>Note the interesting tie-in with the orphan rules here.</strong> When we add blanket
impls, we have to worry about backwards compatibility in one of two ways:</p>
<ul>
<li>existing impls will now fail coherence checks that used to pass;</li>
<li>some code that used to use an existing impl will silently change to
using the blanket impl instead.</li>
</ul>
<p>Naturally, the biggest concern is about impls in other crates, since
those impls are not visible to us. Interestingly, the orphan rules
require that those impls in other crates must be using <strong>some local
type</strong> in their signature. <strong>Thus I believe the orphan rules ensure
that existing impls in other crates will take precedence over our new
blanket impl</strong> &ndash; that is, we are guaranteed that they are considered
legal specializations, and hence will pass coherence, and moreover
that the existing impl is used in preference over the blanket one.</p>
<h4 id="dump-trait-reuse-requires-full-subset">Dump trait: Reuse requires full subset</h4>
<p>In my <a href="https://smallcultfollowing.com/babysteps/blog/2016/09/29/distinguishing-reuse-from-override/">previous blog post</a>, I gave an example of a <code>Dump</code> trait that
had a blanket impl for <code>Debug</code> things:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="c1">// impl A
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Debug</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">debug</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{:?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The idea was that some other crate might want to specialize <code>Dump</code>
just to change how <code>display</code> works, perhaps trying something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Debug</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// impl B (note that it is defined for all `T`, not `T: Debug`):
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, impl B only defines the <code>display()</code> item from the trait because
it intends to reuse the existing <code>debug()</code> method from impl A.
However, this poses a problem: impl A only applies when <code>Widget&lt;T&gt;: Debug</code>, which <em>may</em> be true but is not always true. In particular,
impl B is defined for any <code>Widget&lt;T&gt;</code>.</p>
<p>Under the rules I gave, this is an error. Here we have a scenario
where impl B <strong>does</strong> specialize impl A (because its types are more
specific), but <strong>impl B is not a full subset of impl A, and therefore
it cannot reuse items from impl A</strong>. It must provide a full definition
for all items in the trait (this also implies that every item in impl
A must be declared as <code>default</code>, as is the case here).</p>
<p>Note that either of these two alternatives for impl B would be fine:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Alternative impl B.1: provides all items
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Alternative impl B.2: full subset
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Debug</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>There is some intersection with backwards compatibility here. If the
impl of <code>Dump</code> for <code>Widget</code> were added <strong>before</strong> impl A, then it
necessarily would have defined all items (as in impl B.1), and hence
there would be no error when impl A is added later.</p>
<h4 id="using-where-clauses-to-detect-debug">Using where clauses to detect <code>Debug</code></h4>
<p>You may have noticed that if you do an index into a map and the key is
not found,
<a href="https://is.gd/ARxIyV">the error message is kind of lackluster</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">collections</span>::<span class="n">HashMap</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">HashMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="s">&#34;a&#34;</span><span class="p">,</span><span class="w"> </span><span class="s">&#34;b&#34;</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span><span class="p">[</span><span class="o">&amp;</span><span class="s">&#34;c&#34;</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Error: thread &#39;main&#39; panicked at &#39;no entry found for key&#39;, ../src/libcore/option.rs:700
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In particular, it doesn&rsquo;t tell you what key you were looking for! I
would have liked to see &lsquo;no entry found for &ldquo;c&rdquo;&rsquo;. Well, the reason for
this is that the map code doesn&rsquo;t require that the key type <code>K</code> have a
<code>Debug</code> impl.  That&rsquo;s good, but it&rsquo;d be nice if we could get a better
error if a debug impl <strong>happens to exist</strong>.</p>
<p>We might do so by using specialization. Let&rsquo;s imagine defining a trait
that can be used to panic when a key is not found. Thus when a map fails
to find a key, it invokes <code>key.not_found()</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">KeyNotFound</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">not_found</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="o">!</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">KeyNotFound</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// impl A
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">not_found</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">panic!</span><span class="p">(</span><span class="s">&#34;no entry found for key&#34;</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>Now we could provide a specialized impl that kicks in when <code>Debug</code> is available:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Debug</span><span class="o">&gt;</span><span class="w"> </span><span class="n">KeyNotFound</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// impl B
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">not_found</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="o">!</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">panic!</span><span class="p">(</span><span class="s">&#34;no entry found for key `</span><span class="si">{:?}</span><span class="s">`&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>Note that the types for impl B are not &ldquo;more specific&rdquo; than impl A,
unless you consider the where clauses. That is, they are both defined
for any type T. It is only when we consider the <em>where clauses</em> that
we see that impl B can in fact be judged more specific than A. This is
the third clause in my rules (it also works with specialization
today).</p>
<h4 id="fourth-example-asref">Fourth example: AsRef</h4>
<p>One longstanding ergonomic problem in the standard library has been
that we could not add all of the impls of
<a href="https://doc.rust-lang.org/std/convert/trait.AsRef.html">the <code>AsRef</code> trait</a>
that we wanted. <code>T: AsRef&lt;U&gt;</code> is a trait that says &ldquo;an <code>&amp;T</code> reference
can be converted into an <code>&amp;U</code> reference&rdquo;. It is particularly useful
for types that support slicing, like <code>String: AsRef&lt;str&gt;</code> &ndash; this
states that an <code>&amp;String</code> can be sliced into an <code>&amp;str</code> reference.</p>
<p>There are a number of blanket impls for <code>AsRef</code> that one might expect:</p>
<ul>
<li>Naturally one might expect that <code>T: AsRef&lt;T&gt;</code> would always hold.
That just says that an <code>&amp;T</code> reference can be converted into another
<code>&amp;T</code> reference (duh) &ndash; which is sometimes called being <em>reflexive</em>.</li>
<li>One might also expect that <code>AsRef</code> would be compatible with deref
coercions. That is, if I can convert an <code>&amp;U</code> reference to an <code>&amp;V</code>
reference, then I can also convert an <code>&amp;&amp;U</code> reference to an <code>&amp;V</code>
reference.</li>
</ul>
<p>Unfortunately, if you try to combine those two cases, the current
coherence rules reject it (I&rsquo;m going to ignore lifetime parameters here
for simplicity):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl A
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">U</span><span class="p">,</span><span class="w"> </span><span class="n">V</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">V</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">U</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">U</span>: <span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">V</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">  </span><span class="c1">// impl B
</span></span></span></code></pre></div><p>It&rsquo;s clear that these two impls, at least potentially, overlap.  In
particular, a trait reference like <code>&amp;Foo: AsRef&lt;&amp;Foo&gt;</code> could be
satisfied by either one (assuming that <code>Foo: AsRef&lt;&amp;Foo&gt;</code>, which is
probably not true in practice, but could be implemented by some type
<code>Foo</code> in theory).</p>
<p>At the same time, it&rsquo;s clear that neither represents a subset of the
other, even if we ignore where clauses. Just consider these examples:</p>
<ul>
<li><code>String: AsRef&lt;String&gt;</code> (matches impl A, but not impl B)</li>
<li><code>&amp;String: AsRef&lt;String&gt;</code> (matches impl B, but not impl A)</li>
</ul>
<p>However, we&rsquo;ll see that we can satisfy this example if we incorporate
intersection impls; we&rsquo;ll cover this later.</p>
<h3 id="detailed-explanation-drilling-into-subset-of-types">Detailed explanation: drilling into subset of types</h3>
<p>OK, that was the high-level summary, let&rsquo;s start getting a bit more
into the details. In this section, I want to discuss how to implement
this new rule. I&rsquo;m going to assume you&rsquo;ve read and understood the
<a href="https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md#algorithmic-formulation">&ldquo;Algorithmic formulation&rdquo; section of the specialization RFC</a>,
which describes how to implement the subset check (if not, go ahead
and do so, it&rsquo;s quite readable &ndash; nice job aturon!).</p>
<p>Implementing the rules today basically consists of two distinct tests,
applied in succession. RFC 1210 describes how, given two impls I and
J, we can define an ordering <em>Subset(I, J)</em> that indicates I
matches a subset of the types of J (the RFC calls it <code>I &lt;= J</code>). The
current rules then say that I <em>specializes</em> J if <em>Subset(I, J)</em>
holds but <em>Subset(J, I)</em> does not.</p>
<p>To decide if <em>Subset(I, J)</em> holds, we apply two tests (both of which
must pass):</p>
<ul>
<li><strong>Type(I, J):</strong> For any way of instantiating <code>I.vars</code>,
there is some way of instantiating <code>J.vars</code> such that the <code>Self</code>
type and trait type parameters match up.
<ul>
<li>Here <code>I.vars</code> refers to &ldquo;the generic parameters of impl I&rdquo;</li>
<li>The actual technique here is to <a href="https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md#skolemization-asking-forallthere-exists-questions">skolemize <code>I.vars</code></a> and
then <a href="https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md#unification-solving-equations-on-types">attempt unification</a>. If unification succeeds, then
<code>Type(I, J)</code> holds.</li>
</ul>
</li>
<li><strong>WhereClause(I, J):</strong> For the instantiation of <code>I.vars</code> used in
<em>Type(I, J)</em>, if you assume <code>I.wc</code> holds, you can prove <code>J.wc</code>.
<ul>
<li>Here <code>I.wc</code> refers to &ldquo;the where clauses of impl I&rdquo;.</li>
<li>The actual technique here is to consider <code>I.wc</code> as true,
and attempt to prove <code>J.wc</code> using the standard trait machinery.</li>
</ul>
</li>
</ul>
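<p>To make the <em>Type(I, J)</em> test a bit more concrete, here is a toy sketch of it as one-way matching. This is not how rustc does it; the <code>Ty</code> enum, <code>match_ty</code>, and <code>type_test</code> are names I made up for illustration, and the model ignores lifetimes, associated types, and where clauses entirely. The parameters of impl I are treated as opaque (skolemized) constants, while the parameters of impl J can be bound, as long as each one is bound consistently.</p>

```rust
use std::collections::HashMap;

/// A toy type term (hypothetical model, not rustc's representation):
/// either a generic parameter or a constructor applied to arguments.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Param(&'static str),
    Ctor(&'static str, Vec<Ty>),
}

fn param(name: &'static str) -> Ty {
    Ty::Param(name)
}

fn ctor(name: &'static str, args: Vec<Ty>) -> Ty {
    Ty::Ctor(name, args)
}

/// Match a term `i` from impl I against a pattern `j` from impl J.
/// Parameters in `i` are skolemized: they only ever equal themselves.
/// Parameters in `j` may be bound, but each must be bound consistently.
fn match_ty(i: &Ty, j: &Ty, bindings: &mut HashMap<&'static str, Ty>) -> bool {
    match (i, j) {
        // A parameter of J can stand for any term of I.
        (_, Ty::Param(name)) => match bindings.get(name) {
            Some(bound) => bound == i,
            None => {
                bindings.insert(*name, i.clone());
                true
            }
        },
        // Constructors must agree, and so must their arguments.
        (Ty::Ctor(ci, ai), Ty::Ctor(cj, aj)) => {
            ci == cj
                && ai.len() == aj.len()
                && ai.iter().zip(aj).all(|(a, b)| match_ty(a, b, bindings))
        }
        // A skolemized parameter of I is never a specific constructor.
        (Ty::Param(_), Ty::Ctor(..)) => false,
    }
}

/// Type(I, J): a header is `[Self type, trait type parameters...]`.
fn type_test(i: &[Ty], j: &[Ty]) -> bool {
    let mut bindings = HashMap::new();
    i.len() == j.len() && i.iter().zip(j).all(|(a, b)| match_ty(a, b, &mut bindings))
}

fn main() {
    // impl<T: Clone> Clone for Option<T>  vs.  impl<U: Copy> Clone for U
    let option_impl = [ctor("Option", vec![param("T")])];
    let blanket_impl = [param("U")];
    assert!(type_test(&option_impl, &blanket_impl));
    assert!(!type_test(&blanket_impl, &option_impl));

    // impl<T> AsRef<T> for T  vs.  impl<U, V> AsRef<V> for &U
    let reflexive = [param("T"), param("T")];
    let deref = [ctor("Ref", vec![param("U")]), param("V")];
    assert!(!type_test(&reflexive, &deref));
    assert!(!type_test(&deref, &reflexive));
}
```

<p>The first pair shows why <code>Option&lt;T&gt;</code> counts as more specific than <code>U</code>: the test passes in one direction only. The second pair is the <code>AsRef</code> case from earlier, where the test fails in both directions, so neither impl is a subset of the other.</p>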
<p>The algorithm to test whether an impl I can specialize an impl J is this:</p>
<ul>
<li><em>Specializes(I, J)</em>:
<ul>
<li>If <em>Type(I, J)</em> holds:
<ul>
<li>If <em>Type(J, I)</em> does not hold:
<ul>
<li>true</li>
</ul>
</li>
<li>Otherwise, if <em>WhereClause(I, J)</em> holds:
<ul>
<li>If <em>WhereClause(J, I)</em> does not hold:
<ul>
<li>true</li>
</ul>
</li>
<li>else:
<ul>
<li>false</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>false</li>
</ul>
</li>
</ul>
<p>You could also write this as <em>Specializes(I, J)</em> is:</p>
<pre tabindex="0"><code>Type(I, J) &amp;&amp; (!Type(J, I) || WhereClause(I, J) &amp;&amp; !WhereClause(J, I))
</code></pre><p>Unlike before, we also need a separate test to check whether <em>reuse</em>
is legal. Reuse is legal if <em>Subset(I, J)</em> holds.</p>
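<p>As a sanity check, the predicate can be transcribed into Rust. This is only a sketch: the <code>Tests</code> struct and the function names are made up for illustration, and the four booleans stand in for the results of the real <em>Type</em> and <em>WhereClause</em> tests, which are not modeled here. The current <em>Subset</em>-based rule is included for comparison.</p>

```rust
/// Results of the four underlying tests for a pair of impls (I, J).
/// Hypothetical struct: the real tests involve unification and trait solving.
#[derive(Clone, Copy, Debug)]
struct Tests {
    type_ij: bool, // Type(I, J)
    type_ji: bool, // Type(J, I)
    wc_ij: bool,   // WhereClause(I, J)
    wc_ji: bool,   // WhereClause(J, I)
}

/// Subset(X, Y): both the type test and the where-clause test must pass.
fn subset(type_xy: bool, wc_xy: bool) -> bool {
    type_xy && wc_xy
}

/// Current rule: Subset(I, J) holds but Subset(J, I) does not.
fn specializes_current(t: Tests) -> bool {
    subset(t.type_ij, t.wc_ij) && !subset(t.type_ji, t.wc_ji)
}

/// Proposed rule, written as the one-line boolean formula.
fn specializes_proposed(t: Tests) -> bool {
    t.type_ij && (!t.type_ji || (t.wc_ij && !t.wc_ji))
}

/// Proposed rule, written as the nested decision procedure.
fn specializes_nested(t: Tests) -> bool {
    if t.type_ij {
        if !t.type_ji {
            return true; // strictly more specific types win outright
        } else if t.wc_ij {
            return !t.wc_ji; // equally generic types: compare where clauses
        }
    }
    false
}
```

<p>Under these assumptions, a pair where <em>Type(I, J)</em> holds, <em>Type(J, I)</em> does not, and <em>WhereClause(I, J)</em> fails is accepted by <code>specializes_proposed</code> but rejected by <code>specializes_current</code>, while every pair the current rule accepts is still accepted by the proposed one.</p>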
<p>You can view the <em>Specializes(I, J)</em> test as being based on a partial
order, where the <code>&lt;=</code> predicate is the lexicographic combination of
two other partial orders, <em>Type(I, J)</em> and <em>WhereClause(I, J)</em>. This
implies that it is transitive.</p>
<h3 id="combining-with-intersection-impls">Combining with intersection impls</h3>
<p>It&rsquo;s interesting to note that this rule can also be combined with the
rule for intersection impls. The idea of intersection impls is really
somewhat orthogonal to what exact test is being used to decide which
impl specializes another. Essentially, whereas without intersection
impls we say: &ldquo;two impls can overlap so long as one of them
specializes the other&rdquo;, we would now add the additional possibility
that &ldquo;two impls can overlap so long as some other impl specializes
both of them&rdquo;.</p>
<p>This is helpful for realizing some other patterns that we wanted to
get out of specialization but which, until now, we could not.</p>
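<p>To make the extended overlap rule concrete, here is a toy model in Rust. It is a sketch under strong assumptions: each impl is represented simply by the set of type names it matches, and &ldquo;specializes&rdquo; is approximated as &ldquo;matches a strict subset&rdquo;, which ignores the distinction between types and where clauses. The names <code>Matches</code>, <code>specializes</code>, and <code>overlap_allowed</code> are hypothetical.</p>

```rust
use std::collections::BTreeSet;

/// Toy representation of an impl: the set of (named) types it applies to.
type Matches = BTreeSet<&'static str>;

/// Helper to build a `Matches` set from a slice of type names.
fn set(items: &[&'static str]) -> Matches {
    items.iter().copied().collect()
}

/// Toy stand-in for the specialization test: I matches a strict subset of J.
fn specializes(i: &Matches, j: &Matches) -> bool {
    i.is_subset(j) && i != j
}

/// Two impls may coexist if they do not overlap, if one specializes the
/// other, or (the intersection-impl rule) if some other impl specializes both.
fn overlap_allowed(a: &Matches, b: &Matches, others: &[Matches]) -> bool {
    a.is_disjoint(b)
        || specializes(a, b)
        || specializes(b, a)
        || others.iter().any(|c| specializes(c, a) && specializes(c, b))
}
```

<p>For instance, with <code>a = set(&amp;["OnlyDebug", "Both"])</code> and <code>b = set(&amp;["OnlyDisplay", "Both"])</code>, the pair is rejected on its own but accepted once <code>set(&amp;["Both"])</code> is supplied as a third impl.</p>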
<h4 id="example-asref">Example: AsRef</h4>
<p>We saw earlier that this new rule doesn&rsquo;t allow us to add the
reflexive <code>AsRef</code> impl that we wanted to add. However, using an
<strong>intersection impl</strong>, we can make progress. We can basically add a
third impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl A
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">U</span><span class="p">,</span><span class="w"> </span><span class="n">V</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">V</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">U</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">U</span>: <span class="nb">AsRef</span><span class="o">&lt;</span><span class="n">V</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">  </span><span class="c1">// impl B
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">W</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">AsRef</span><span class="o">&lt;&amp;</span><span class="n">W</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="n">W</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl C
</span></span></span></code></pre></div><p>Impl C is a specialization of both of the others, since every type it
can match can also be matched by the others. So this would be
accepted, since impl A and B overlap but have a common specializer.</p>
<p>(As an aside, you might also expect a generic transitivity impl, like
<code>impl&lt;T,U,V&gt; AsRef&lt;V&gt; for T where T: AsRef&lt;U&gt;</code>. I haven&rsquo;t thought much
about whether such an impl would work with the specialization rules;
I&rsquo;m pretty sure, though, that we&rsquo;d have to improve the trait matcher
implementation in any case to make it work, as I think right now it
would quickly overflow.)</p>
<h4 id="example-overlapping-blanket-impls-for-dump">Example: Overlapping blanket impls for Dump</h4>
<p>Let&rsquo;s see another, more conventional example where an intersection
impl might be useful. We&rsquo;ll return to our <code>Dump</code> trait.  If you
recall, it had a blanket impl that implemented <code>Dump</code> for any type <code>T</code>
where <code>T: Debug</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="c1">// impl A
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Debug</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">debug</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{:?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But we might also want another blanket impl for types where <code>T: Display</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="c1">// impl B
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Display</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">display</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we have a problem. Impl A and B clearly potentially overlap, but
(a) neither is more specific in terms of its types (both apply to any
type <code>T</code>, so <em>Type(A, B)</em> and <em>Type(B, A)</em> will both hold) and (b)
neither is more specific in terms of its where-clauses: one applies to
types that implement <code>Debug</code>, and one applies to types that implement
<code>Display</code>, but clearly types can implement both.</p>
<p>With intersection impls we could resolve this error by providing
a third impl for types <code>T</code> where <code>T: Debug + Display</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="c1">// impl C
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Debug</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Display</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{:?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h4 id="orphan-rules-blanket-impls-and-negative-reasoning">Orphan rules, blanket impls, and negative reasoning</h4>
<p>Traditionally, we have said that it is considered backwards
compatible (in terms of semver) to add impls for traits, with the
exception of &ldquo;blanket impls&rdquo; that apply to all <code>T</code>, even if <code>T</code> is
guarded by some traits (like the impls we saw for <code>Dump</code> in the
previous section). This is because if I add an impl like <code>impl&lt;T: Debug&gt; Dump for T</code> where none existed before, some other crate may
already have an impl like <code>impl Dump for MyType</code>, and then if <code>MyType: Debug</code>, we would have an overlap conflict, and hence that downstream
crate will not compile (see <a href="https://github.com/rust-lang/rfcs/blob/master/text/1023-rebalancing-coherence.md">RFC 1023</a> for more information on
these rules).</p>
<p>This new proposed specialization rule has the potential to change that
balance. In fact, at first you might think that adding a blanket impl
would <strong>always</strong> be legal, as long as all of its members are declared
<code>default</code>. After all, any pre-existing impl from another crate must,
because of <a href="http://smallcultfollowing.com/babysteps/blog/2015/01/14/little-orphan-impls/">the orphan rules</a>, have more specific types, and
will thus take precedence over the default impl (moreover, since there
was nothing for this impl to inherit from before, it must still
inherit). So something like <code>impl Dump for MyType</code> would still be
legal, right?</p>
<p>But there is actually still a risk from blanket impls around
<strong>negative reasoning</strong>. To see what I mean, let&rsquo;s continue with a
simplified variant of the <code>Dump</code> example from the previous section
which doesn&rsquo;t use intersection impls. So imagine that we have the
<code>Dump</code> trait and the following impls:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// crate `dump`
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Display</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Debug</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Display</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So, these are pre-existing impls. Now, imagine that in the standard
library, we decided to add a kind of &ldquo;fallback&rdquo; impl of <code>Debug</code> that
says &ldquo;any type which implements <code>Display</code>, automatically implements
<code>Debug</code>&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Display</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">fmt</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">fmt</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Formatter</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Result</span><span class="o">&lt;</span><span class="p">(),</span><span class="w"> </span><span class="n">Error</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Display</span>::<span class="n">fmt</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">fmt</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Interestingly, this impl creates a problem for the crate <code>dump</code>!
Before, its two impls were well-ordered; one applied to types that
implement <code>Display</code>, and one applied to types that implement both
<code>Debug</code> and <code>Display</code>. But with this new impl, <em>all</em> types that
implement <code>Display</code> also implement <code>Debug</code>, so this distinction is
meaningless.</p>
<p>But wait, you cry! That impl looks awfully familiar to our motivating
example from the very first post! Remember that this all started because
we wanted to implement <code>Clone</code> for all <code>Copy</code> types:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So is that actually illegal?</p>
<p>It turns out that there is a crucial difference between these two. It
does not lie in the <em>impls</em>, but rather in the <em>traits</em>. In
particular, the <code>Copy</code> trait is a <em>subtrait</em> of <code>Clone</code> &ndash; that is,
anything which is copyable must also be cloneable. But <code>Display</code> and
<code>Debug</code> have no relationship; in fact, the blanket impl
interconverting between them is effectively <em>imposing</em> an
<strong>undeclared</strong> subtrait relationship <code>Display: Debug</code>. After all, if
some type <code>T</code> implements <code>Display</code>, we are now guaranteed that it also
implements <code>Debug</code>.</p>
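<p>The difference between the two trait pairs is visible in ordinary generic code today. The following sketch (function names are hypothetical) relies only on the declared supertrait <code>Copy: Clone</code>; for <code>Display</code> and <code>Debug</code>, no such relationship exists, so both bounds have to be written out.</p>

```rust
use std::fmt::{Debug, Display};

// Compiles because `Copy: Clone` is a declared supertrait relationship:
// a `T: Copy` bound is already enough to call `clone`.
fn clone_via_copy_bound<T: Copy>(value: &T) -> T {
    value.clone()
}

// `Display` does not imply `Debug`, so a function that uses both
// formatting traits must name both bounds explicitly.
fn render_both<T: Display + Debug>(value: &T) -> (String, String) {
    (format!("{}", value), format!("{:?}", value))
}

// By contrast, this version is rejected by the compiler, because nothing
// declares that a `Display` type is also `Debug`:
//
// fn debug_via_display<T: Display>(value: &T) -> String {
//     format!("{:?}", value) // error: `T` doesn't implement `Debug`
// }
```

<p>A blanket <code>impl&lt;T: Display&gt; Debug for T</code> would make the commented-out function reasonable after the fact, which is exactly the kind of undeclared relationship that downstream crates could not have anticipated.</p>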
<p><strong>So this suggests that the new rule for semver compatibility is that
one can add blanket impls after the fact, but only if a subtrait
relationship already existed.</strong></p>
<p>As an aside, this &ndash; along with the
<a href="https://github.com/rust-lang/rfcs/pull/1658#issuecomment-249453099">similar example raised by withoutboats and reddit user oconnor663</a>
&ndash; strongly suggests to me that traits need to &ldquo;predeclare&rdquo; strong
relationships, like subtraits but also mutual exclusion if we ever
support that, at the point when they are created. I know withoutboats
has some interesting thoughts in this direction. =)</p>
<p>However, another possibility that aturon raised is to use a more
<em>syntactic</em> criterion for when something is more specialized &ndash; in that
case, <code>Debug+Display</code> would be considered more specialized than
<code>Display</code>, even if in reality they are equivalent. This may wind up
being easier to understand &ndash; and more flexible &ndash; even if it is less
smart.</p>
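<p>One way to picture such a syntactic test is to compare the bounds purely as written, without asking the trait system what they imply. The sketch below is my own guess at the flavor of the idea, not a worked-out proposal: the function name is hypothetical and bounds are modeled as plain strings.</p>

```rust
use std::collections::BTreeSet;

/// Hypothetical, purely syntactic comparison: the bounds `i` count as more
/// specialized than the bounds `j` when every bound written in `j` is also
/// written in `i`, and `i` writes at least one additional bound.
fn syntactically_more_specialized(i: &[&str], j: &[&str]) -> bool {
    let i: BTreeSet<&str> = i.iter().copied().collect();
    let j: BTreeSet<&str> = j.iter().copied().collect();
    j.is_subset(&i) && i != j
}
```

<p>Under this model, <code>syntactically_more_specialized(&amp;["Debug", "Display"], &amp;["Display"])</code> holds regardless of which blanket impls exist elsewhere, because no trait solving is involved.</p>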
<h3 id="conclusion">Conclusion</h3>
<p>This post lays out an alternative specialization predicate that I
believe helps to overcome a lot of the shortcomings of the current
<em>subset</em> rule. The rule is fairly simple to describe: <strong>impls with
more specific types get precedence</strong>. If the types of two impls are
equally generic, then the impl with <strong>more specific where-clauses gets
precedence</strong>. I claim this rule is intuitive in practice; perhaps more
intuitive than the current rule.</p>
<p>This predicate allows for a number of scenarios that the current
specialization rule excludes, but which we wanted initially.  The ones
I have considered mostly fall into the category of adding an impl of a
supertrait in terms of a subtrait backwards compatibly:</p>
<ul>
<li><code>impl&lt;T: Copy&gt; Clone for T { ... }</code></li>
<li><code>impl&lt;T: Eq&gt; PartialEq for T { ... }</code></li>
<li><code>impl&lt;T: Ord&gt; PartialOrd for T { ... }</code></li>
</ul>
<p>If we combine with intersection impls, we can also accommodate the
<code>AsRef</code> impl, and also get better support for having overlapping
blanket impls. I&rsquo;d be interested to hear about other cases where the
coherence rules were limiting that may be affected by specialization,
so we can see how they fare.</p>
<p><strong>One sour note has to do with negative reasoning.</strong> Specialization
based on where clauses (orthogonally from the changes proposed in this
post, in fact) introduces a kind of negative reasoning that is not
currently subject to the rules in <a href="https://github.com/rust-lang/rfcs/blob/master/text/1023-rebalancing-coherence.md">RFC 1023</a>. This implies that
crates cannot add blanket impls with impunity. In particular,
introducing subtrait relationships can still cause problems, which
affects a number of suggested &ldquo;bridge&rdquo; cases:</p>
<ul>
<li><code>impl&lt;R, T: Add&lt;R&gt; + Clone&gt; AddAssign&lt;R&gt; for T</code>
<ul>
<li>anything that has <code>Add</code> and <code>Clone</code> is now <code>AddAssign</code></li>
</ul>
</li>
<li><code>impl&lt;T: Display&gt; Debug for T</code>
<ul>
<li>anything that is <code>Display</code> is now <code>Debug</code></li>
</ul>
</li>
</ul>
<p>There may be some room to revise the specialization rules to address
this, by tweaking the <em>WhereClause(I, J)</em> test to be more
conservative, or to be more syntactical in nature. This will require
some further experimentation and tinkering.</p>
<h3 id="comments">Comments</h3>
<p>Please leave comments in
<a href="https://internals.rust-lang.org/t/blog-post-supporting-blanket-impls-in-specialization/4264">this internals thread</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/specialization" term="specialization" label="Specialization"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/></entry><entry><title type="html">Observational equivalence and unsafe code</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/10/02/observational-equivalence-and-unsafe-code/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/10/02/observational-equivalence-and-unsafe-code/</id><published>2016-10-02T00:00:00+00:00</published><updated>2016-10-02T07:06:23-04:00</updated><content type="html"><![CDATA[<p>I spent a really interesting day last week at Northeastern University.
First, I saw a fun talk by Philipp Haller covering <a href="http://2016.splashcon.org/event/splash-2016-oopsla-lacasa-lightweight-affinity-and-object-capabilities-in-scala">LaCasa</a>, which is a
set of extensions to Scala that enable it to track ownership. Many of
the techniques reminded me very much of Rust (e.g., the use of
&ldquo;spores&rdquo;, which are closures that can limit the types of things they
close over); if I have time, I&rsquo;ll try to write up a more detailed
comparison in some later post.</p>
<p>Next, I met with <a href="http://www.ccs.neu.edu/home/amal/">Amal Ahmed</a> and her group to discuss the process of
crafting unsafe code guidelines for Rust. This is one very impressive
group. It&rsquo;s this last meeting that I wanted to write about now. The
conversation helped me quite a bit to more cleanly separate two
distinct concepts in my mind.</p>
<p>The TL;DR of this post is that I think we can limit the capabilities
of unsafe code to be &ldquo;things you could have written using the safe
code plus a core set of unsafe abstractions&rdquo; (ignoring the fact that
the safe implementation would be unusably slow or consume ridiculous
amounts of memory). This is a helpful and important thing to be able
to nail down.</p>
<!-- more -->
<h3 id="background-observational-equivalence">Background: observational equivalence</h3>
<p>One of the things that we talked about was <strong>observational
equivalence</strong> and how it relates to the unsafe code guidelines. The
notion of observational equivalence is really pretty simple: basically
it means &ldquo;two bits of code do the same thing, as far as you can tell&rdquo;.
I think it&rsquo;s easiest to think of it in terms of an API. So, for
example, consider the <code>HashMap</code> and <code>BTreeMap</code> types in the Rust
standard library. Imagine I have some code using a <code>HashMap&lt;i32, T&gt;</code>
that only invokes the basic map operations &ndash; e.g., <code>new</code>, <code>get</code>, and
<code>insert</code>. I would expect to be able to change that code to use a
<code>BTreeMap&lt;i32, T&gt;</code> and have it keep working. This is because <code>HashMap</code>
and <code>BTreeMap</code>, at least with respect to <code>i32</code> keys and
<code>new</code>/<code>get</code>/<code>insert</code>, are <strong>observationally equivalent</strong>.</p>
<p>If I expand the set of API routines that I use, however, this
equivalence goes away. For example, if I iterate over the map, then a
<code>BTreeMap</code> gives me an ordering guarantee, whereas <code>HashMap</code> doesn&rsquo;t.</p>
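<p>Here is a small sketch of that idea. The <code>BasicMap</code> trait and its method names are invented for illustration; they restrict a client to <code>new</code>, <code>insert</code>, and <code>get</code>, so the client cannot tell which map it was given. A separate helper shows the iteration-order guarantee that only <code>BTreeMap</code> provides.</p>

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical trait exposing only the operations mentioned above.
trait BasicMap {
    fn new_map() -> Self;
    fn put(&mut self, key: i32, value: &'static str) -> Option<&'static str>;
    fn lookup(&self, key: i32) -> Option<&'static str>;
}

impl BasicMap for HashMap<i32, &'static str> {
    fn new_map() -> Self {
        HashMap::new()
    }
    fn put(&mut self, key: i32, value: &'static str) -> Option<&'static str> {
        self.insert(key, value)
    }
    fn lookup(&self, key: i32) -> Option<&'static str> {
        self.get(&key).copied()
    }
}

impl BasicMap for BTreeMap<i32, &'static str> {
    fn new_map() -> Self {
        BTreeMap::new()
    }
    fn put(&mut self, key: i32, value: &'static str) -> Option<&'static str> {
        self.insert(key, value)
    }
    fn lookup(&self, key: i32) -> Option<&'static str> {
        self.get(&key).copied()
    }
}

/// A client that can observe the map only through new/insert/get.
/// It records every result it sees.
fn client<M: BasicMap>() -> Vec<Option<&'static str>> {
    let mut map = M::new_map();
    let mut seen = Vec::new();
    seen.push(map.put(3, "three"));
    seen.push(map.put(1, "one"));
    seen.push(map.put(3, "THREE")); // replaces the earlier value for key 3
    seen.push(map.lookup(1));
    seen.push(map.lookup(3));
    seen.push(map.lookup(7)); // never inserted
    seen
}

/// Iteration widens the API: `BTreeMap` yields keys in sorted order.
fn btree_iteration_order() -> Vec<i32> {
    let mut map = BTreeMap::new();
    for key in vec![3, 1, 2] {
        map.insert(key, ());
    }
    map.keys().copied().collect()
}
```

<p>Running <code>client</code> with either map type produces the same list of observations, whereas no analogous ordering helper can be written for <code>HashMap</code> with a guaranteed result.</p>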
<p>Note that the speed and memory use will definitely change as I shift
from one to the other, but I still consider them observationally
equivalent. This is because I consider such changes &ldquo;unobservable&rdquo;, at
least in this setting (crypto code might beg to differ).</p>
<h3 id="composing-unsafe-abstractions">Composing unsafe abstractions</h3>
<p>One thing that I&rsquo;ve been kind of wrestling with in the unsafe code
guidelines is how to break it up. A lot of the attention has gone into
thinking about some very low-level decisions: for example, if I make a
<code>*mut</code> pointer and an <code>&amp;mut</code> reference, when can they legally alias?
But there are some bigger picture questions that are also equally
interesting: what kinds of things can unsafe code <strong>even do</strong> in the
first place, whatever types it uses?</p>
<p>One example that I often give has to do with the infamous
<code>setjmp</code>/<code>longjmp</code> in C. These are some routines that let you
implement a poor man&rsquo;s exception handling. You call <code>setjmp</code> at one
stack frame and then, down the stack, you call <code>longjmp</code>. This will
cause all the intermediate stack frames to be popped (with no
unwinding or other cleanup) and control to resume from the point where
you called <code>setjmp</code>.  You can use this to model exceptions (a la
Objective C),
<a href="http://fanf.livejournal.com/105413.html">build coroutines</a>, and of
course &ndash; this <em>is</em> C &ndash; to shoot yourself in the foot (for example,
by invoking <code>longjmp</code> when the stack frame that called <code>setjmp</code> has
already returned).</p>
<p>So you can imagine someone writing a Rust wrapper for
<code>setjmp</code>/<code>longjmp</code>. You could easily guarantee that people use the API
in a correct way (e.g., that when you call <code>longjmp</code>, the <code>setjmp</code>
frame is still on the stack), but does that make it <strong>safe</strong>?</p>
<p>One concern is that <code>setjmp</code>/<code>longjmp</code> do not do any form of
unwinding. This means that all of the intermediate stack frames are
going to be popped and none of the destructors for their local
variables will run. This certainly means that memory will leak, but it
<a href="https://www.reddit.com/r/rust/comments/508pkb/unleakable_crate_safetysanityrefocus/d72703d">can have much worse effects if you try to combine it with other unsafe abstractions</a>. Imagine
for example that you are using <a href="https://github.com/nikomatsakis/rayon/">Rayon</a>: Rayon relies on running
destructors in order to join its worker threads. So if a user of the
<code>setjmp</code>/<code>longjmp</code> API wrote something like this, that would be very
bad:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">setjmp</span><span class="p">(</span><span class="o">|</span><span class="n">j</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">rayon</span>::<span class="n">join</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* original thread */</span><span class="p">;</span><span class="w"> </span><span class="n">j</span><span class="p">.</span><span class="n">longjmp</span><span class="p">();</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">||</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* other thread */</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>What is happening here is that we are first calling <code>setjmp</code> using our
&ldquo;safe&rdquo; wrapper. I&rsquo;m imagining that this takes a closure and supplies
it some handle <code>j</code> that can be used to &ldquo;longjmp&rdquo; back to the <code>setjmp</code>
call (basically like <code>break</code> on steroids). Now we call <code>rayon::join</code>
to (potentially) spin off another thread. The way that <code>join</code> works is
that the first closure executes on the current thread, but the second
closure may get stolen and execute on another thread &ndash; in that case,
the other thread will be joined before <code>join</code> returns. But here we are
calling <code>j.longjmp()</code> in the first closure. This will skip right over
the destructor that would have been used to join the second thread.
So now potentially we have some other thread executing, accessing
stack data and raising all kinds of mischief.</p>
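<p>To make the reliance on destructors concrete, here is a minimal sketch using only the standard library. The <code>JoinOnDrop</code> guard and the <code>helper_finished_after_scope</code> function are hypothetical names invented for illustration; this is not how Rayon is actually implemented. The guard joins a helper thread when it is dropped, so any exit path that runs destructors (a normal return or an unwind) waits for the helper, whereas a <code>longjmp</code>-style jump past the guard would not.</p>

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical guard: joins the helper thread when it goes out of scope.
struct JoinOnDrop(Option<JoinHandle<()>>);

impl Drop for JoinOnDrop {
    fn drop(&mut self) {
        if let Some(handle) = self.0.take() {
            // We only care that the helper has finished, not whether it panicked.
            let _ = handle.join();
        }
    }
}

/// Spawns a helper thread behind a guard and reports whether the helper had
/// finished by the time the guard's scope ended.
fn helper_finished_after_scope() -> bool {
    let done = Arc::new(AtomicBool::new(false));
    {
        let done_in_helper = Arc::clone(&done);
        let _guard = JoinOnDrop(Some(thread::spawn(move || {
            done_in_helper.store(true, Ordering::SeqCst);
        })));
        // Leaving this block normally (or by unwinding) runs `_guard`'s
        // destructor, which blocks until the helper is done.
    }
    done.load(Ordering::SeqCst)
}

fn main() {
    assert!(helper_finished_after_scope());
    println!("helper was joined before the scope ended");
}
```

<p>The whole guarantee hangs on <code>drop</code> being called. Code that pops the frame without running it leaves the helper thread alive with nobody waiting for it.</p>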
<p>(Note: the current signature of <code>join</code> would probably prohibit this,
since it does not reflect the fact that the first closure is known to
execute in the original thread, and hence requires that it close over
only sendable data, but I&rsquo;ve contemplated changing that.)</p>
<p>So what went wrong here? We tried to combine two things that
independently seemed <em>safe</em> but wound up with a broken system. How did
that happen? The problem is that when you write unsafe code, you are
not only thinking about what your code <strong>does</strong>, you&rsquo;re thinking about
what the outside world <strong>can do</strong>. And in particular you are modeling
the potential actions of the outside world using the limits of <strong>safe
code</strong>.</p>
<p>In this case, Rayon was making the assumption that when we call a closure,
that closure will do one of four things:</p>
<ul>
<li>loop infinitely;</li>
<li>abort the process and all its threads;</li>
<li>unwind;</li>
<li>return normally.</li>
</ul>
<p>This is true of all safe code &ndash; unless that safe code has access to
<code>setjmp</code>/<code>longjmp</code>.</p>
<p>This illustrates the power of unsafe abstractions. They can extend the
very vocabulary with which safe code speaks. (Sorry, I know that was
ludicrously flowery, but I can&rsquo;t bring myself to delete it.) Unsafe
abstractions can extend the <strong>capabilities</strong> of safe code. This is
very cool, but also &ndash; as we see here &ndash; potentially
dangerous. <strong>Clearly, we need some guidelines to decide what kinds of
capabilities it is ok to add and which are not.</strong></p>
<h3 id="comparing-setjmplongjmp-and-rayon">Comparing setjmp/longjmp and rayon</h3>
<p>But how can we decide what capabilities to permit and which to deny?
This is where we get back to this notion of <em>observational
equivalence</em>. After all, both Rayon and setjmp/longjmp give the user
some new powers:</p>
<ul>
<li>Rayon lets you run code in different threads.</li>
<li>Setjmp/longjmp lets you pop stack frames without returning or unwinding.</li>
</ul>
<p>But these two capabilities are qualitatively different. For the most
part, Rayon&rsquo;s superpower is <strong>observationally equivalent</strong> to safe
Rust. That is, I could implement Rayon without using threads at all
and you as a safe code author couldn&rsquo;t tell the difference, except for
the fact that your code runs slower (this is a slight simplification;
I&rsquo;ll elaborate below). <strong>In contrast, I cannot implement
setjmp/longjmp using safe code.</strong></p>
<p><strong>&ldquo;But wait&rdquo;, you say, &ldquo;Just what do you mean by &lsquo;safe code&rsquo;?&rdquo;</strong> OK,
that last paragraph was really sloppy. I keep saying things like &ldquo;you
could do this in safe Rust&rdquo;, but of course we&rsquo;ve already seen that the
very notion of what &ldquo;safe Rust&rdquo; can do is something that <strong>unsafe code
can extend</strong>. So let me try to make this more precise. Instead of
talking about <em>Safe Rust</em> as if it were a monolithic entity, we&rsquo;ll
gradually build up more expressive versions of Rust by taking a safe
core and adding unsafe capabilities. Then we can talk more precisely
about things.</p>
<h3 id="rust0--the-safe-code">Rust0 &ndash; the safe code</h3>
<p>Let&rsquo;s start with Rust0, which corresponds to what you can do without
using <strong>any unsafe code at all, anywhere</strong>. Rust0 is a remarkably
incapable language. The most obvious limitation is that you have no
access to the heap (<code>Box</code> and <code>Vec</code> are unsafely implemented
libraries), so you are limited to local variables. You can still do
quite a lot of interesting things: you have arrays and slices,
closures, enums, and so forth. But everything must live on the stack
and hence ultimately follow a stack discipline. Essentially, you can
never return anything from a function whose size is not statically
known. We can&rsquo;t even use static variables to stash stuff, since those
are inherently shared and hence immutable unless you have some unsafe
code in the mix (e.g., <code>Mutex</code>).</p>
<h3 id="rust1--the-heap-vec">Rust1 &ndash; the heap (<code>Vec</code>)</h3>
<p>So now let&rsquo;s consider Rust1, which is Rust0 but with access to <code>Vec</code>.
We don&rsquo;t have to worry about how <code>Vec</code> is implemented. Instead, we can
just think of <code>Vec</code> as if it were part of Rust itself (much like how
<code>~[T]</code> used to be, in the bad old days). Suddenly our capabilities are
much increased!</p>
<p>For example, one thing we can do is to implement the <code>Box</code> type
(<code>Box&lt;T&gt;</code> is basically a <code>Vec&lt;T&gt;</code> whose length is always 1, after
all). We can also implement something that acts identically to
<code>HashMap</code> and <code>BTreeMap</code> in pure safe code (obviously the performance
characteristics will be different).</p>
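<p>As a rough sketch of the <code>Box</code> claim, here is a box written entirely in safe code on top of <code>Vec</code>. The name <code>MyBox</code> and its methods are made up for this example, and it only handles sized <code>T</code> (so nothing like <code>Box&lt;dyn Trait&gt;</code>).</p>

```rust
use std::ops::{Deref, DerefMut};

/// A box built from safe code only: a `Vec` that always holds one element.
struct MyBox<T> {
    slot: Vec<T>,
}

impl<T> MyBox<T> {
    fn new(value: T) -> Self {
        MyBox { slot: vec![value] }
    }

    /// Moves the value back out, consuming the box.
    fn into_inner(mut self) -> T {
        self.slot.pop().expect("MyBox always holds exactly one value")
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.slot[0]
    }
}

impl<T> DerefMut for MyBox<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.slot[0]
    }
}

fn main() {
    let mut b = MyBox::new(41);
    *b += 1;
    assert_eq!(*b, 42);
    assert_eq!(b.into_inner(), 42);
}
```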
<p>(At first, I thought that giving access to <code>Box</code> would be enough, but
you can&rsquo;t really simulate <code>Vec</code> just by using <code>Box</code>. Go ahead and try
and you&rsquo;ll see what I mean.)</p>
<h3 id="rust2--sharing-rc-arc">Rust2 &ndash; sharing (<code>Rc</code>, <code>Arc</code>)</h3>
<p>This is sort of an interesting one. Even if you have <code>Vec</code>, you still
cannot implement <code>Rc</code> or <code>Arc</code> in Rust1. At first, I thought perhaps we could
fake it by cloning data &ndash; so, for example, if you want a <code>Rc&lt;T&gt;</code>, you
could (behind the scenes) make a <code>Box&lt;T&gt;</code>. Then when you clone the
<code>Rc&lt;T&gt;</code> you just clone the box. Since we don&rsquo;t yet have <code>Cell</code> or
<code>RefCell</code>, I reasoned, you wouldn&rsquo;t be able to tell that the data had
been cloned. But of course that won&rsquo;t work, because you can use a
<code>Rc&lt;T&gt;</code> for <strong>any</strong> <code>T</code>, not just <code>T</code> that implement <code>Clone</code>.</p>
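<p>Here is roughly what that failed attempt looks like. <code>FakeRc</code> and <code>NotClone</code> are hypothetical names for this sketch. The clone-the-box trick forces a <code>T: Clone</code> bound onto the <code>Clone</code> impl, which the real <code>Rc&lt;T&gt;</code> does not have.</p>

```rust
use std::rc::Rc;

/// Hypothetical "fake Rc" that copies the data on every clone.
struct FakeRc<T> {
    value: Box<T>,
}

impl<T> FakeRc<T> {
    fn new(value: T) -> Self {
        FakeRc { value: Box::new(value) }
    }

    fn get(&self) -> &T {
        &self.value
    }
}

// The `T: Clone` bound is the catch: the real `Rc<T>` is `Clone` for every `T`.
impl<T: Clone> Clone for FakeRc<T> {
    fn clone(&self) -> Self {
        FakeRc { value: self.value.clone() }
    }
}

/// A type that deliberately does not implement `Clone`.
struct NotClone(u32);

fn main() {
    // Works, because `String: Clone`.
    let a = FakeRc::new(String::from("shared"));
    let b = a.clone();
    assert_eq!(a.get(), b.get());

    // The real `Rc` has no such restriction.
    let r = Rc::new(NotClone(7));
    let r2 = Rc::clone(&r);
    assert_eq!(r2.0, 7);

    // By contrast, `FakeRc::new(NotClone(7)).clone()` would not compile.
}
```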
<h3 id="rust3--non-atomic-mutation">Rust3 &ndash; non-atomic mutation</h3>
<p>That brings us to another fundamental capability. <code>Cell</code> and <code>RefCell</code>
permit mutation when data is shared. This can&rsquo;t be modeled with just
<code>Rc</code>, <code>Box</code>, or <code>Vec</code>, all of which maintain the invariant that
mutable data is uniquely reachable.</p>
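<p>A small illustration of the new capability, using only <code>std::rc::Rc</code> and <code>std::cell::Cell</code> (the helper names <code>bump</code> and <code>shared_total</code> are invented for the example): two handles point to the same counter, and a write through either one is visible through the other.</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Increments the counter through a shared reference.
fn bump(counter: &Cell<i32>) {
    counter.set(counter.get() + 1);
}

/// Mutates one counter through two aliases and returns what each alias sees.
fn shared_total() -> (i32, i32) {
    let a = Rc::new(Cell::new(0));
    let b = Rc::clone(&a);
    bump(&a); // `&Rc<Cell<i32>>` derefs to `&Cell<i32>`
    bump(&b);
    (a.get(), b.get())
}

fn main() {
    assert_eq!(shared_total(), (2, 2));
}
```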
<h3 id="rust4--asynchronous-threading">Rust4 &ndash; asynchronous threading</h3>
<p>This is an interesting level. Here we add the ability to spawn a
thread, as described in <code>std::thread</code> (note that this thread runs
asynchronously and cannot access data on the parent&rsquo;s stack frame). At
first, I thought that threading didn&rsquo;t add &ldquo;expressive power&rdquo; since we
lacked the ability to share <strong>mutable</strong> data across threads (we can
share immutable data with <code>Arc</code>).</p>
<p>After all, you could implement <code>std::thread</code> in safe code by having it
queue up the closure to run and then, when the current thread
finishes, have it execute. This isn&rsquo;t <strong>really</strong> correct for a number
of reasons (what is this scheduler that overarches the safe code?
Where do you queue up the data?), but it seems <em>almost</em> true.</p>
<p>But there is another way that adding <code>std::thread</code> is important. It
means that safe code can <strong>observe</strong> memory in an asynchronous thread,
which affects the kinds of <strong>unsafe code</strong> that we might write. After
all, the whole purpose of this exercise is to figure out the limits of
what safe code can do, so that unsafe code knows what it has to be
wary of. So long as safe code did not have access to <code>std::thread</code>,
one could <strong>imagine</strong> writing an unsafe function like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="nc">Arc</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;*</span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">i32</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">q</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">q</span><span class="w"> </span><span class="o">-=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This function takes a shared <code>i32</code> and <strong>temporarily</strong> increments and
then decrements it. The important point here is that the invariant
that the <code>Arc&lt;i32&gt;</code> is immutable is broken, but it is restored before
<code>foo</code> returns. Without threads, safe code can&rsquo;t tell the difference
between <code>foo(&amp;my_arc)</code> and a no-op. But with threads, <code>foo()</code> might
trigger a data-race. (This is all leaving aside the question of
compiler optimization and aliasing rules, of course.)</p>
<p>(Hat tip to Alan Jeffrey for pointing this out to me.)</p>
<h3 id="rust5--communication-between-threads-and-processes">Rust5 &ndash; communication between threads and processes</h3>
<p>The next level, I think, is abstractions that enable threads to
communicate with one another. This includes both within a process
(e.g., <code>AtomicU32</code>) and across processes (e.g., I/O).</p>
<p>This is an interesting level to me because <strong>I think</strong> it represents
the point where the effects of a library like rayon become observable
to safe code. Until this point, the only data that could be shared
across Rayon threads was immutable, and hence I think the precise
interleavings could also be simulated. But once you throw atomics
into the mix, and in particular the fact that atomics give you control
over the memory model (i.e., they do not require sequential
consistency), then you can definitely observe whether threading is
truly in use. The same is true for I/O and so forth.</p>
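<p>One way to see this, sketched with <code>std::thread::scope</code> standing in for a scoped-threads library (this is an assumption of the example, not Rayon&rsquo;s API, and the function name <code>handshake</code> is made up): each closure raises its own flag and then waits for the other&rsquo;s. With real threads the handshake completes. A sequential simulation that ran one closure to completion before starting the other would spin forever, so safe code can tell the two apart.</p>

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Two closures that each announce themselves and then wait for the other.
/// This terminates only if both closures really make progress concurrently.
fn handshake() -> bool {
    let a_ready = AtomicBool::new(false);
    let b_ready = AtomicBool::new(false);

    thread::scope(|s| {
        s.spawn(|| {
            a_ready.store(true, Ordering::SeqCst);
            while !b_ready.load(Ordering::SeqCst) {
                thread::yield_now();
            }
        });
        s.spawn(|| {
            b_ready.store(true, Ordering::SeqCst);
            while !a_ready.load(Ordering::SeqCst) {
                thread::yield_now();
            }
        });
        // `scope` joins both threads before returning.
    });

    a_ready.load(Ordering::SeqCst) && b_ready.load(Ordering::SeqCst)
}

fn main() {
    assert!(handshake());
    println!("both closures observed each other");
}
```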
<p>So this is the level that shows that what I wrote earlier, that
&ldquo;Rayon&rsquo;s superpower is observationally equivalent to safe Rust&rdquo; is
actually false. I think it <strong>is</strong> observationally equivalent to &ldquo;safe
Rust4&rdquo;, but not Rust5. Basically Rayon serves as a kind of &ldquo;Rust6&rdquo;, in
which we grow Rust5 by adding scoped threads, that allow sharing data
on stack frames.</p>
<h3 id="and-so-on">And so on</h3>
<p>We can keep going with this exercise, which I actually think is quite
valuable, but I&rsquo;ll stop here for now. What I&rsquo;d like to do
asynchronously is to go over the standard library and interesting
third-party packages and try to nail down the &ldquo;core unsafe
abstractions&rdquo; that you need to build Rust, as well as the
&ldquo;dependencies&rdquo; between them.</p>
<p>But I want to bring this back to the core point: the focus in the
unsafe code guidelines has been on exploring what unsafe code can do
&ldquo;in the small&rdquo;.  Basically, what types it ought to use to achieve
certain kinds of aliasing and so forth. <strong>But I think it&rsquo;s also very
important to nail down what unsafe code can do &ldquo;in the large&rdquo;.</strong> How
do we know whether (say)
<a href="https://github.com/frankmcsherry/abomonation">abomonation</a>,
<a href="https://crates.io/crates/deque">deque</a>, and so forth represent legal
libraries?</p>
<p>As I left the meeting with Amal&rsquo;s group, she posed this question to
me. Is there something where all three of these things are true:</p>
<ul>
<li>you cannot simulate using the standard library;</li>
<li>you <strong>can</strong> do with unsafe code;</li>
<li>and it&rsquo;s a &ldquo;reasonable&rdquo; thing to do.</li>
</ul>
<p>Whenever the answer is yes, that&rsquo;s a candidate for growing another
Rust level. We already saw one &ldquo;yes&rdquo; answer in this blog post, right
at the end: scoped threads, which enable threading with access to
stack contents. Beyond that, most of the potential answers I&rsquo;ve come
up with are access to various kernel capabilities:</p>
<ul>
<li>dynamic linking;</li>
<li>shared memory across processes;</li>
<li>processes themselves. =)</li>
</ul>
<p>What&rsquo;s a bit interesting about these is that they seem to be mostly
about the operating system itself. They don&rsquo;t feel &ldquo;fundamental&rdquo; in
the same way as scoped threads: in other words, you could imagine
simulating the O/S itself in safe code, and then you could build these
things. Not quite sure how to think about <em>that</em> yet.</p>
<p>In any case, I&rsquo;d be interested to hear about other &ldquo;fundamental
abstractions&rdquo; that you can think of.</p>
<h3 id="coda-picking-and-choosing-your-language-levels">Coda: Picking and choosing your language levels</h3>
<p>Oh, one last thing. It might seem like defining all these language
levels is a bit academic. But it can be very useful to pick them
apart. For example, imagine you are targeting a processor that has no
preemption and always uses cooperative multithreading. In that case,
the concerns I talked about in Rust4 may not apply, and you may be
able to do more aggressive things in your unsafe code.</p>
<h3 id="comments">Comments</h3>
<p>Please leave comments in
<a href="https://internals.rust-lang.org/t/blog-post-observatonal-equivalence-and-unsafe-code/4148/1">this thread on the Rust internals forum</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/unsafe" term="unsafe" label="Unsafe"/></entry><entry><title type="html">Announcing intorust.com</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/09/30/announcing-intorust-dot-com/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/09/30/announcing-intorust-dot-com/</id><published>2016-09-30T00:00:00+00:00</published><updated>2016-09-30T15:48:54-04:00</updated><content type="html"><![CDATA[<p>For the past year or so, I and a few others have been iterating on
some tutorial slides for learning Rust. I&rsquo;ve given this tutorial here
at the local <a href="http://www.meetup.com/BostonRust/">Boston Rust Meetup</a> a few times, and we used the same
basic approach at RustConf; I&rsquo;ve been pretty happy with the
results. But until now it&rsquo;s been limited to &ldquo;in person&rdquo; events.</p>
<p>That&rsquo;s why I&rsquo;m so happy to announce a new site, <a href="http://intorust.com">Into Rust</a>. Into Rust
contains screencasts of many of these slides, and in particular the
ones I consider most important: those that cover Ownership and
Borrowing, which I think is the best place to start teaching Rust.
I&rsquo;ve divided up the material into roughly 30min screencasts so that
they should be relatively easy to consume in one sitting &ndash; each also
has some associated exercises to help make your knowledge more
concrete.</p>
<p>I want to give special thanks to <a href="https://twitter.com/_lizbaillie/">Liz Baillie</a>, who did all the
awesome artwork on the site.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/intorust" term="intorust" label="IntoRust"/></entry><entry><title type="html">Distinguishing reuse from override</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/09/29/distinguishing-reuse-from-override/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/09/29/distinguishing-reuse-from-override/</id><published>2016-09-29T00:00:00+00:00</published><updated>2016-09-29T06:02:19-04:00</updated><content type="html"><![CDATA[<p>In my <a href="https://smallcultfollowing.com/babysteps/
blog/2016/09/24/intersection-impls/">previous post</a>, I started discussing the idea of
intersection impls, which are a possible extension to
<a href="https://github.com/rust-lang/rfcs/pull/1210">specialization</a>. I am specifically looking at the idea of
making it possible to add blanket impls to (e.g.) implement <code>Clone</code>
for any <code>Copy</code> type. We saw that intersection impls, while useful, do
not enable us to do this in a backwards compatible way.</p>
<p>Today I want to dive a bit deeper into specialization. We&rsquo;ll see that
specialization actually couples together two things: refinement of
behavior and reuse of code. This is no accident, and it&rsquo;s normally a
natural thing to do, but I&rsquo;ll show that, in order to enable the kinds
of blanket impls I want, it&rsquo;s important to be able to tease those
apart somewhat.</p>
<p>This post doesn&rsquo;t really propose anything. Instead it merely explores
some of the implications of having specialization rules that are not
based purely on &ldquo;subsets of types&rdquo;, but instead go into other areas.</p>
<!-- more -->
<h3 id="requirements-for-backwards-compatibility">Requirements for backwards compatibility</h3>
<p>In the previous post, my primary motivating example focused on the
<code>Copy</code> and <code>Clone</code> traits. Specifically, I wanted to be able to add an
impl like the following (we&rsquo;ll call it &ldquo;impl A&rdquo;):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// impl A
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">*</span><span class="bp">self</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The idea is that if I have a <code>Copy</code> type, I should not have to write a
<code>Clone</code> impl by hand. I should get one automatically.</p>
<p>The problem is that there are already lots of <code>Clone</code> impls &ldquo;in the
wild&rdquo; (in fact, every <code>Copy</code> type has one, since <code>Copy</code> is a subtrait
of <code>Clone</code>, and hence implementing <code>Copy</code> requires implementing
<code>Clone</code> too). To be backwards compatible, we have to do two things:</p>
<ul>
<li>continue to compile those <code>Clone</code> impls without generating errors;</li>
<li>give those existing <code>Clone</code> impls <strong>precedence</strong> over the new one.</li>
</ul>
<p>The last point may not be immediately obvious. What I&rsquo;m saying is that
if you already had a type with a <code>Copy</code> and a <code>Clone</code> impl, then any
attempts to clone that type need to keep calling the <code>clone()</code> method
you wrote. Otherwise the behavior of your code might change in subtle
ways.</p>
<p>So for example imagine that I am developing a <code>widget</code> crate with some
types like these:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Copy</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl B
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// impl C
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Widget</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">data</span>: <span class="nc">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">clone</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Then, for backwards compatibility, we want that if I have a variable
<code>widget</code> of type <code>Widget&lt;T&gt;</code> <strong>for any <code>T</code></strong> (including cases where
<code>T: Copy</code>, and hence <code>Widget&lt;T&gt;: Copy</code>), then <code>widget.clone()</code> invokes
impl C.</p>
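<p>To see why precedence matters, here is a small sketch in today&rsquo;s stable Rust (no specialization involved). <code>Tracked</code>, <code>CLONES</code>, and <code>clones_observed</code> are hypothetical names for illustration. The type is <code>Copy</code>, but its hand-written <code>clone()</code> has an observable side effect. If a blanket impl silently took over, that side effect would disappear and the program&rsquo;s behavior would change.</p>

```rust
use std::cell::Cell;

thread_local! {
    /// Counts how many times the hand-written `clone` has run on this thread.
    static CLONES: Cell<u32> = Cell::new(0);
}

struct Tracked(u32);

impl Copy for Tracked {}

impl Clone for Tracked {
    fn clone(&self) -> Self {
        CLONES.with(|c| c.set(c.get() + 1));
        *self
    }
}

/// Returns how many `clone` calls were recorded while copying once and
/// cloning once.
fn clones_observed() -> u32 {
    let before = CLONES.with(|c| c.get());
    let t = Tracked(1);
    let copied = t; // a plain copy: does not call `clone`
    let cloned = t.clone(); // calls the hand-written impl
    assert_eq!(copied.0, cloned.0);
    CLONES.with(|c| c.get()) - before
}

fn main() {
    assert_eq!(clones_observed(), 1);
}
```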
<h3 id="thought-experiment-named-impls-and-explicit-specialization">Thought experiment: Named impls and explicit specialization</h3>
<p>For the purposes of this post, I&rsquo;d like to engage now in a thought
experiment. Imagine that, instead of using type subsets as the basis
for specialization, we gave every impl a name, and we could explicitly
specify when one impl specializes another using that name. When I say
that an impl X <em>specializes</em> an impl Y, I mean primarily that items in
the impl X <strong>override</strong> items in impl Y:</p>
<ul>
<li>When we go looking for an associated item, we use the one in X first.</li>
</ul>
<p>However, in the specialization RFC as it currently stands,
specializing is also tied to <strong>reuse</strong>. In particular:</p>
<ul>
<li>If there is no item in X, then we go looking in Y.</li>
</ul>
<p>The point of this thought experiment is to show that we may want to
separate these two concepts.</p>
<p>To avoid inventing syntax, I&rsquo;ll use a <code>#[name]</code> attribute to specify
the name of an impl and a <code>#[specializes]</code> attribute to declare when
one impl specializes another. So we might declare our two <code>Clone</code>
impls from the previous section as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[name = </span><span class="s">&#34;A&#34;</span><span class="cp">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[name = </span><span class="s">&#34;B&#34;</span><span class="cp">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[specializes = </span><span class="s">&#34;A&#34;</span><span class="cp">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span></code></pre></div><p>Interestingly, it turns out that this scheme of using explicit names
interacts really poorly with the <strong>reuse</strong> aspects of the
specialization RFC. The <code>Clone</code> trait is kind of too simple to show
what I mean, so let&rsquo;s consider an alternative trait, <code>Dump</code>, which has
two methods:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now imagine that I have a blanket implementation of <code>Dump</code> that
applies to any type that implements <code>Debug</code>. It defines both
<code>display</code> and <code>debug</code> to print to <code>stdout</code> using the <code>Debug</code>
trait. Let&rsquo;s call this &ldquo;impl D&rdquo;.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[name = </span><span class="s">&#34;D&#34;</span><span class="cp">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Debug</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">debug</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{:?}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, maybe I&rsquo;d like to specialize this impl so that if I have a
type that also implements <code>Display</code>, then <code>display</code> prints it
using <code>Display</code> instead. I don&rsquo;t want to change the behavior for
<code>debug</code>, so I leave that method unchanged. This is sort of analogous
to subtyping in an OO language: I am <strong>refining</strong> the impl for
<code>Dump</code> by tweaking how it behaves in certain scenarios. We&rsquo;ll call
this impl E.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[name = </span><span class="s">&#34;E&#34;</span><span class="cp">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[specializes = </span><span class="s">&#34;D&#34;</span><span class="cp">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Display</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Debug</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So far, everything is fine. In fact, if you just remove the <code>#[name]</code>
and <code>#[specializes]</code> annotations, this example would work with
specialization as currently implemented. <strong>But imagine that we did a
slightly different thing.</strong> Imagine we wrote impl E but <strong>without</strong>
the requirement that <code>T: Debug</code> (everything else is the same). Let&rsquo;s
call this variant impl F.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[name = </span><span class="s">&#34;F&#34;</span><span class="cp">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[specializes = </span><span class="s">&#34;D&#34;</span><span class="cp">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">T</span>: <span class="nc">Display</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we no longer have the &ldquo;subset of types&rdquo; property. Because of the
<code>#[specializes]</code> annotation, impl F specializes impl D, but in fact it
applies to an overlapping, but different set of types (those that
implement <code>Display</code> rather than those that implement <code>Debug</code>).</p>
<p><strong>But losing the &ldquo;subset of types&rdquo; property makes the reuse in impl F
invalid.</strong> Impl F only defines the <code>display()</code> method and it claims to
inherit the <code>debug()</code> method from Impl D. But how can it do that?  The
code in impl D was written under the assumption that the types it
applies to implement <code>Debug</code>, and it uses methods from the <code>Debug</code>
trait. Clearly we can&rsquo;t reuse that code, since if we did so we might
not have the methods we need.</p>
<p>So the takeaway here is that <strong>if an impl A wants to reuse some items
from impl B, then impl A must apply to a subset of impl B&rsquo;s types</strong>.
That guarantees that the item from impl B will still be well-typed
inside of impl A.</p>
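<p>To make that concrete, here is a minimal sketch in today&rsquo;s stable Rust, with no specialization involved. The helper <code>debug_body</code> and the type <code>OnlyDisplay</code> are hypothetical names introduced purely for illustration: <code>debug_body</code> stands in for the body of <code>debug()</code> in impl D, and <code>OnlyDisplay</code> is a type that impl F would cover but impl D would not.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::fmt::{self, Debug, Display};

// Stand-in for the body of `debug()` in impl D: it is only
// well-typed because of the `T: Debug` bound.
fn debug_body&lt;T: Debug&gt;(value: &amp;T) -&gt; String {
    format!(&#34;{:?}&#34;, value)
}

// Hypothetical type that implements `Display` but not `Debug`,
// i.e. a type in impl F&#39;s set but outside impl D&#39;s set.
struct OnlyDisplay(u32);

impl Display for OnlyDisplay {
    fn fmt(&amp;self, f: &amp;mut fmt::Formatter&lt;&#39;_&gt;) -&gt; fmt::Result {
        write!(f, &#34;only-display({})&#34;, self.0)
    }
}

fn main() {
    // Fine: `u32` is in impl D&#39;s set of types.
    println!(&#34;{}&#34;, debug_body(&amp;22_u32));

    // Also fine: `Display` is all we ask of `OnlyDisplay` here.
    println!(&#34;{}&#34;, OnlyDisplay(22));

    // Rejected if uncommented, since `OnlyDisplay: Debug` does not hold:
    // println!(&#34;{}&#34;, debug_body(&amp;OnlyDisplay(22)));
}
</code></pre>
<p>The commented-out call is, roughly, the reuse that impl F is asking for: running code written under a <code>Debug</code> bound on a type that only promises <code>Display</code>.</p>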
<h3 id="what-does-this-mean-for-copy-and-clone">What does this mean for copy and clone?</h3>
<p>&ldquo;Interesting thought experiment,&rdquo; you are thinking, &ldquo;but how does this
relate to <code>Copy</code> and <code>Clone</code>?&rdquo; Well, it turns out that if we ever want
to be able to add things like an autoconversion impl between
<code>Copy</code> and <code>Clone</code> (and <code>Ord</code> and <code>PartialOrd</code>, etc), we are going to
have to move away from &ldquo;subsets of types&rdquo; as the sole basis for
specialization. <strong>This implies we will have to separate the concept of
&ldquo;when you can reuse&rdquo; (which requires subset of types) from &ldquo;when you
can override&rdquo; (which can be more general).</strong></p>
<p>Basically, in order to add a blanket impl backwards compatibly, we
<strong>have</strong> to allow impls to override one another in situations where
reuse would not be possible. Let&rsquo;s go through an example. Imagine that
&ndash; at timestep 0 &ndash; the <code>Dump</code> trait was defined in a crate <code>dump</code>,
but without any blanket impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// In crate `dump`, timestep 0
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now some other crate <code>widget</code> implements <code>Dump</code> for its type <code>Widget</code>,
at timestep 1:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// In crate `widget`, timestep 1
</span></span></span><span class="line"><span class="cl"><span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">dump</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// impl G:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Debug</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Debug</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// impl H:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Dump</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">display</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">debug</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, at timestep 2, we wish to add an implementation of <code>Dump</code>
that works for any type that implements <code>Debug</code> (as before):</p>
<pre tabindex="0"><code>// In crate `dump`, timestep 2
impl&lt;T&gt; Dump for T // impl I
    where T: Debug,
{
    default fn display(&amp;self) {
        self.debug()
    }
    
    default fn debug(&amp;self) {
        println!(&#34;{:?}&#34;, self);
    }
}
</code></pre><p><strong>If we assume that this set of impls will be accepted &ndash; somehow,
under any rules &ndash; we have created a scenario very similar to our
explicit specialization.</strong> Remember that we said in the beginning
that, for backwards compatibility, we need to make it so that adding
the new blanket impl (impl I) does not cause any existing code to
change what impl it is using. That means that <code>Widget&lt;T&gt;: Dump</code> also
needs to be resolved to impl H, the original impl from the crate
<code>widget</code>, even if impl I also applies.</p>
<p>This basically means that impl H <strong>overrides</strong> impl I (that is, in
cases where both impls apply, impl H takes precedence). But impl H
<strong>cannot reuse</strong> from impl I, since impl H does not apply to a subset
of the blanket impl&rsquo;s types. Rather, these impls apply to overlapping but
distinct sets of types. For example, the <code>Widget</code> impl applies to all
<code>Widget&lt;T&gt;</code>, even in cases where <code>T: Debug</code> does not hold. But the
blanket impl applies to <code>i32</code>, which is not a widget at all.</p>
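<p>The three regions of that overlap can be spelled out in stable Rust. This is only a sketch under some assumptions: <code>NoDebug</code>, <code>CoveredByH</code>, and <code>covered_by_i</code> are hypothetical names, and the tuple-struct <code>Widget</code> is a stand-in for the elided definition above. The trait models &ldquo;impl H applies&rdquo; and the function&rsquo;s bound models &ldquo;impl I applies&rdquo;.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::fmt::Debug;

// Hypothetical type with no `Debug` impl.
struct NoDebug;

// Stand-in for `Widget&lt;T&gt;`; the derive yields an impl like impl G,
// so `Widget&lt;T&gt;: Debug` holds only when `T: Debug`.
#[allow(dead_code)]
#[derive(Debug)]
struct Widget&lt;T&gt;(T);

// Models the set of types impl H applies to: every `Widget&lt;T&gt;`.
trait CoveredByH {
    fn covered_by_h(&amp;self) -&gt; bool {
        true
    }
}
impl&lt;T&gt; CoveredByH for Widget&lt;T&gt; {}

// Models the set of types impl I applies to: anything that is `Debug`.
fn covered_by_i&lt;T: Debug&gt;(_value: &amp;T) -&gt; bool {
    true
}

fn main() {
    let debug_widget = Widget(22_i32);
    let plain_widget = Widget(NoDebug);

    // In both sets: this is where impl H must override impl I.
    assert!(debug_widget.covered_by_h());
    assert!(covered_by_i(&amp;debug_widget));

    // Only in H&#39;s set; `covered_by_i(&amp;plain_widget)` would be rejected.
    assert!(plain_widget.covered_by_h());

    // Only in I&#39;s set; `22_i32.covered_by_h()` would be rejected.
    assert!(covered_by_i(&amp;22_i32));
}
</code></pre>
<p>Since neither set contains the other, there is no direction in which reuse would be guaranteed to be well-typed, even though override is still meaningful on the overlap.</p>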
<h3 id="conclusion">Conclusion</h3>
<p>This blog post argues that if we want to support adding blanket impls
backwards compatibly, we have to be careful about reuse. I actually
don&rsquo;t think this is a mega-big deal, but it&rsquo;s an interesting
observation, and one that wasn&rsquo;t obvious to me at first. It means that
&ldquo;subset of types&rdquo; will always remain a relevant criterion that we have
to test for, no matter what rules we wind up with (which might in turn
mean that intersection impls remain relevant).</p>
<p>The way I see this playing out is that we have some rules for when one
impl specializes another. Those rules do not guarantee a subset of
types and in fact the impls may merely overlap. If, <strong>additionally</strong>,
one impl matches a subset of the other&rsquo;s types, then that first impl
may reuse items from the other impl.</p>
<h3 id="ps-why-not-use-names-anyway">PS: Why <strong>not</strong> use names, anyway?</h3>
<p>You might be thinking to yourself right now &ldquo;boy, it is nice to have
names and be able to say explicitly what is specialized by what&rdquo;. And
I would agree. In fact, since &ldquo;specializable&rdquo; impls must mark their
items as default, you could easily imagine a scheme where those impls
had to also be given a name at the same time. Unfortunately, that
would not at all support my copy-clone use case, since in that case we
want to add the base impl after the fact, and hence the extant
specializing impls would have to be modified to add a <code>#[specializes]</code>
annotation. Also, we tried giving impls names back in the day; it felt
quite artificial, since they don&rsquo;t have an identity of their own,
really.</p>
<h3 id="comments">Comments</h3>
<p>Since this is a continuation of my <a href="https://smallcultfollowing.com/babysteps/blog/2016/09/24/intersection-impls/">previous post</a>, I&rsquo;ll just
re-use the
<a href="https://internals.rust-lang.org/t/blog-post-intersection-impls/4129/">same internals thread</a>
for comments.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/specialization" term="specialization" label="Specialization"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/></entry><entry><title type="html">Intersection Impls</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/09/24/intersection-impls/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/09/24/intersection-impls/</id><published>2016-09-24T00:00:00+00:00</published><updated>2016-09-24T06:07:31-04:00</updated><content type="html"><![CDATA[<p>As some of you are probably aware, on the nightly Rust builds, we
currently offer a feature called <strong>specialization</strong>, which was defined
in <a href="https://github.com/rust-lang/rfcs/pull/1210">RFC 1210</a>. The idea of specialization is to improve Rust&rsquo;s
existing coherence rules to allow for overlap between impls, so long
as one of the overlapping impls can be considered <em>more
specific</em>. Specialization is hotly desired because it can enable
powerful optimizations, but also because it is an important component
for <a href="http://aturon.github.io/blog/2015/09/18/reuse/">modeling object-oriented designs</a>.</p>
<p>The current specialization design, while powerful, is also limited in
a few ways. I am going to work on a series of articles that explore
some of those limitations as well as possible solutions.</p>
<p>This particular post serves two purposes: it describes the running
example I want to consider, and it describes one possible solution:
<strong>intersection impls</strong> (more commonly called &ldquo;lattice impls&rdquo;). We&rsquo;ll
see that intersection impls are a powerful feature, but they don&rsquo;t
completely solve the problem I am aiming to solve and they also
introduce other complications. My conclusion is that they may be a part
of the final solution, but are not sufficient on their own.</p>
<!-- more -->
<h3 id="running-example-interconverting-between-copy-and-clone">Running example: interconverting between <code>Copy</code> and <code>Clone</code></h3>
<p>I&rsquo;m going to structure my posts around a detailed look at the <code>Copy</code>
and <code>Clone</code> traits, and in particular about how we could use
specialization to bridge between the two. These two traits are used in
Rust to define how values can be duplicated. The idea is roughly like
this:</p>
<ul>
<li>A type is <code>Copy</code> if it can be copied from one place to another just
by copying bytes (i.e., with <code>memcpy</code>). This is basically types that
consist purely of scalar values (e.g., <code>u32</code>, <code>[u32; 4]</code>, etc).</li>
<li>The <code>Clone</code> trait expands upon <code>Copy</code> to include all types that can
be copied at all, even if that requires executing custom code or allocating
memory (for example, a <code>String</code> or <code>Vec&lt;u32&gt;</code>).</li>
</ul>
<p>These two traits are clearly <em>related</em>. In fact, <code>Clone</code> is a
<em>supertrait</em> of <code>Copy</code>, which means that every type that is copyable
must also be cloneable.</p>
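<p>One consequence of the supertrait relationship is easy to check in code. In the sketch below, <code>duplicate</code> is a hypothetical helper and <code>Point</code> is just an example type; the point is that a <code>T: Copy</code> bound alone is enough to call <code>clone</code>, because <code>Copy</code> implies <code>Clone</code>.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Because `Clone` is a supertrait of `Copy`, the bound `T: Copy`
// lets us use both a plain copy and an explicit `clone` call.
fn duplicate&lt;T: Copy&gt;(value: &amp;T) -&gt; (T, T) {
    (*value, Clone::clone(value))
}

#[derive(Copy, Clone, Debug, PartialEq)]
struct Point {
    x: u32,
    y: u32,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let (copied, cloned) = duplicate(&amp;p);

    // For a `Copy` type, both routes produce the same value.
    assert_eq!(copied, cloned);
    println!(&#34;{:?} {:?}&#34;, copied, cloned);
}
</code></pre>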
<p>For better or worse, supertraits in Rust work a bit differently than
<em>superclasses</em> from OO languages. In particular, the two traits are
still independent from one another. This means that if you want to
declare a type to be <code>Copy</code>, you must also supply a <code>Clone</code> impl.
Most of the time, we do that with a <code>#[derive]</code> annotation, which
auto-generates the impls for you:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[derive(Copy, Clone, ...)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Point</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>That <code>derive</code> annotation will expand out to two impls looking
roughly like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Point</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">y</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">Copy</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Point</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Copy has no methods; it can also be seen as a &#34;marker&#34;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// that indicates that a cloneable type can also be
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// memcopy&#39;d.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Point</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Point</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="c1">// this will just do a memcpy
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The second impl (the one implementing the <code>Clone</code> trait) seems a bit
odd. After all, that impl is written for <code>Point</code>, but in principle it
could be used for <em>any</em> <code>Copy</code> type. It would be nice if we could add a
blanket impl that converts from <code>Copy</code> to <code>Clone</code> that applies to all
<code>Copy</code> types:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Hypothetical addition to the standard library:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">*</span><span class="bp">self</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we had such an impl, then there would be no need for <code>Point</code> above
to implement <code>Clone</code> explicitly, since it implements <code>Copy</code>, and the
blanket impl can be used to supply the <code>Clone</code> impl. (In other words,
you could just write <code>#[derive(Copy)]</code>.) As you have probably
surmised, though, it&rsquo;s not that simple. Adding a blanket impl like
this has a few complications we&rsquo;d have to overcome first. This is
still true with the specialization system described in <a href="https://github.com/rust-lang/rfcs/pull/1210">RFC 1210</a>.</p>
<p>There are a number of examples where these kinds of blanket impls
might be useful. Some examples: implementing <code>PartialOrd</code> in terms of
<code>Ord</code>, implementing <code>PartialEq</code> in terms of <code>Eq</code>, and implementing
<code>Debug</code> in terms of <code>Display</code>.</p>
<h3 id="coherence-and-backwards-compatibility">Coherence and backwards compatibility</h3>
<p><img src="/images/Troymcclure.png" style="float:left; height:285px;"></img></p>
<p><em>Hi! I&rsquo;m the language feature coherence! You may remember me from
previous essays like <a href="http://smallcultfollowing.com/babysteps/blog/2015/01/14/little-orphan-impls/">Little Orphan Impls</a> or <a href="https://github.com/rust-lang/rfcs/pull/1023">RFC 1023</a>.</em></p>
<p>Let&rsquo;s take a step back and just think about the language as it is now,
without specialization. With today&rsquo;s Rust, adding a blanket
<code>impl&lt;T:Copy&gt; Clone for T</code> would be massively backwards incompatible.
This is because of the coherence rules, which aim to prevent there
from being more than one impl of a trait applicable to any type (or, for
generic traits, set of types).</p>
<div style="clear:both"></div>
<p>So, if we tried to add the blanket impl now, without specialization,
it would mean that every type annotated with <code>#[derive(Copy, Clone)]</code>
would stop compiling, because we would now have two clone impls: one
from derive and the blanket impl we are adding. Obviously not
feasible.</p>
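<p>You can reproduce this conflict on stable Rust with a local trait. In this sketch, <code>MyClone</code> is a hypothetical stand-in for <code>Clone</code> (we can&rsquo;t add impls to the real one from outside the standard library), and the commented-out impl plays the role of what <code>#[derive(Clone)]</code> generates.</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Hypothetical local stand-in for `Clone`.
trait MyClone {
    fn my_clone(&amp;self) -&gt; Self;
}

// The blanket impl: every `Copy` type gets `MyClone` for free.
impl&lt;T: Copy&gt; MyClone for T {
    fn my_clone(&amp;self) -&gt; T {
        *self
    }
}

#[derive(Copy, Clone, Debug, PartialEq)]
struct Point {
    x: u32,
    y: u32,
}

// The analogue of the derived impl. Uncommenting it is rejected with
// error E0119 (conflicting implementations), because `Point: Copy`
// means the blanket impl above already covers `Point`.
//
// impl MyClone for Point {
//     fn my_clone(&amp;self) -&gt; Point {
//         *self
//     }
// }

fn main() {
    let p = Point { x: 3, y: 4 };
    // Resolved through the blanket impl.
    assert_eq!(p.my_clone(), p);
}
</code></pre>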
<h3 id="why-didnt-we-add-this-blanket-impl-already-then">Why didn&rsquo;t we add this blanket impl already then?</h3>
<p>You might then wonder why we didn&rsquo;t add this blanket impl converting from
<code>Copy</code> to <code>Clone</code> in the &ldquo;wild west&rdquo; days, when we broke every
existing Rust crate on a regular basis. We certainly considered
it. The answer is that, if you have such an impl, the coherence rules
mean that it would not work well with generic types.</p>
<p>To see what problems arise, consider the type <code>Option</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="cp">#[derive(Copy, Clone)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">enum</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Some</span><span class="p">(</span><span class="n">T</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">None</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You can see that <code>Option&lt;T&gt;</code> derives <code>Copy</code> and <code>Clone</code>. But because
<code>Option</code> is generic over <code>T</code>, those impls have a slightly different
look to them once we expand them out:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Copy</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="k">ref</span><span class="w"> </span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">v</span><span class="p">.</span><span class="n">clone</span><span class="p">()),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">None</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Before, the <code>Clone</code> impl for <code>Point</code> was just <code>*self</code>. But for
<code>Option&lt;T&gt;</code>, we have to do something more complicated, which actually
calls <code>clone</code> on the contained value (in the case of a <code>Some</code>). To see
why, imagine a type like <code>Option&lt;Rc&lt;u32&gt;&gt;</code> &ndash; this is clearly
cloneable, but it is not <code>Copy</code>.  So the impl is rewritten so that it
only assumes that <code>T: Clone</code>, not <code>T: Copy</code>.</p>
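<p>To make the difference concrete, here is a minimal sketch using only the standard library. The helper names <code>require_copy</code> and <code>strong_count_after_clone</code> are made up for this illustration; the point is just that cloning an <code>Option&lt;Rc&lt;u32&gt;&gt;</code> has to run code (it bumps a reference count), whereas an <code>Option&lt;u32&gt;</code> can simply be copied.</p>

```rust
use std::rc::Rc;

/// Hypothetical helper: a call to it only compiles when `T: Copy`.
fn require_copy<T: Copy>(_value: &T) {}

/// Clones an `Option<Rc<u32>>` and reports the reference count afterwards.
fn strong_count_after_clone() -> usize {
    let shared: Option<Rc<u32>> = Some(Rc::new(22));
    // `Option::clone` calls `Rc::clone` on the payload, which bumps the count.
    let _second_handle = shared.clone();
    Rc::strong_count(shared.as_ref().unwrap())
}

fn main() {
    // `Option<u32>` is `Copy`, so a plain bitwise copy is a valid clone.
    let plain: Option<u32> = Some(22);
    require_copy(&plain);
    let copied = plain;
    assert_eq!(plain, copied); // `plain` is still usable after the copy

    // `Option<Rc<u32>>` is `Clone` but not `Copy`: cloning must run code.
    assert_eq!(strong_count_after_clone(), 2);
    // require_copy(&Some(Rc::new(22))); // would not compile: `Rc<u32>` is not `Copy`
}
```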
<p>The problem is that types like <code>Option&lt;T&gt;</code> are <em>sometimes</em> <code>Copy</code> and
sometimes not. So if we had the blanket impl that converts all <code>Copy</code>
types to <code>Clone</code>, and we have the impl above that implements <code>Clone</code> for
<code>Option&lt;T&gt;</code> if <code>T: Clone</code>, then we can easily wind up in a situation
where there are two applicable impls. For example, consider
<code>Option&lt;u32&gt;</code>: it is <code>Copy</code>, and hence we could use the blanket impl
that just returns <code>*self</code>. But it also fits the <code>Clone</code>-based impl
I showed above. This is a <strong>coherence violation</strong>, because now the
compiler has to pick which impl to use. Obviously, in the case of the
trait <code>Clone</code>, it shouldn&rsquo;t matter too much which one it chooses,
since they both have the same effect, but the compiler doesn&rsquo;t know
that.</p>
<h3 id="enter-specialization">Enter specialization</h3>
<p>OK, all of that prior discussion was assuming the Rust of today.  So
what if we adopted the existing <a href="https://github.com/rust-lang/rfcs/pull/1210">specialization RFC</a>?  After
all, its whole purpose is to improve coherence so that it is possible
to have multiple impls of a trait for the same type, so long as one of
those implementations is <em>more specific</em>. Maybe that applies here?</p>
<p>In fact, the RFC as written today <strong>does not</strong>. The reason is that the
RFC defines rules that say an impl A is more specific than another
impl B if impl A applies to a <strong>strict subset</strong> of the types which
impl B applies to. Let&rsquo;s consider some arbitrary trait <code>Foo</code>.  Imagine
that we have an impl of <code>Foo</code> that applies to any <code>Option&lt;T&gt;</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The &ldquo;more specific&rdquo; rule would then allow a second impl for
<code>Option&lt;i32&gt;</code>; this impl would specialize the more generic one:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Foo</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, the second impl is more specific than the first, because while
the first impl can be used for <code>Option&lt;i32&gt;</code>, it can also be used for
lots of other types, like <code>Option&lt;u32&gt;</code>, <code>Option&lt;i64&gt;</code>, etc. So that
means that these two impls would be <strong>accepted</strong> under
<a href="https://github.com/rust-lang/rfcs/pull/1210">RFC #1210</a>. If the compiler ever had to choose between them, it
would prefer the impl that is specific to <code>Option&lt;i32&gt;</code> over the
generic one that works for all <code>T</code>.</p>
<p>But if we try to apply that rule to our two <code>Clone</code> impls, we run into
a problem. First, we have the blanket impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and then we have an impl tailored to <code>Option&lt;T&gt;</code> where <code>T: Clone</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, you might think that the second impl is more specific than the
blanket impl. After all, the blanket impl can be used for any type, whereas the
second impl can only be used for <code>Option&lt;T&gt;</code>.  Unfortunately, this isn&rsquo;t
quite right. After all, the blanket impl cannot be used for <em>any</em> type
<code>T</code>: it can only be used for <code>Copy</code> types. And we already saw that
there are lots of types for which the second impl can be used where
the first impl is inapplicable. In other words, neither impl covers a
subset of the other &ndash; rather, they cover two distinct, but
overlapping, sets of types.</p>
<p>To see what I mean, let&rsquo;s look at some examples:</p>
<pre tabindex="0"><code>| Type              | Blanket impl | `Option` impl |
| ----              | ------------ | ------------- |
| i32               | APPLIES      | inapplicable  |
| Box&lt;i32&gt;          | inapplicable | inapplicable  |
| Option&lt;i32&gt;       | APPLIES      | APPLIES       |
| Option&lt;Box&lt;i32&gt;&gt;  | inapplicable | APPLIES       |
</code></pre><p>Note in particular the first and fourth rows. The first row shows that
the blanket impl is not a subset of the <code>Option</code> impl.  The last row
shows that the <code>Option</code> impl is not a subset of the blanket impl
either. That means that these two impls would be <strong>rejected</strong> by
<a href="https://github.com/rust-lang/rfcs/pull/1210">RFC #1210</a> and hence adding a blanket impl now would <em>still</em> be
a breaking change. Boo!</p>
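<p>If you want to check the table yourself, one way is to let the compiler do it. The sketch below uses two made-up marker traits, <code>BlanketShape</code> and <code>OptionShape</code>, whose impls have the same headers as the two <code>Clone</code> impls; because they are <em>separate</em> traits, coherence has nothing to complain about. Each call compiles only if the corresponding cell in the table says APPLIES.</p>

```rust
// Hypothetical marker traits, one per impl "shape" from the table.
// Splitting them into two traits sidesteps the coherence question entirely.
trait BlanketShape {}
impl<T: Copy> BlanketShape for T {}

trait OptionShape {}
impl<T: Clone> OptionShape for Option<T> {}

/// Compiles only for types that the blanket impl would cover.
fn blanket_impl_applies<T: BlanketShape>() -> bool {
    true
}

/// Compiles only for types that the `Option` impl would cover.
fn option_impl_applies<T: OptionShape>() -> bool {
    true
}

fn main() {
    // Row 1: only the blanket impl applies to `i32`.
    assert!(blanket_impl_applies::<i32>());
    // Row 3: both impls apply to `Option<i32>`; this is the overlap.
    assert!(blanket_impl_applies::<Option<i32>>());
    assert!(option_impl_applies::<Option<i32>>());
    // Row 4: only the `Option` impl applies to `Option<Box<i32>>`.
    assert!(option_impl_applies::<Option<Box<i32>>>());

    // The "inapplicable" cells are exactly the calls that fail to compile:
    // option_impl_applies::<i32>();
    // blanket_impl_applies::<Box<i32>>();
    // option_impl_applies::<Box<i32>>();
    // blanket_impl_applies::<Option<Box<i32>>>();
}
```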
<p>To see the problem from another angle, consider this Venn diagram,
which indicates, for every impl, the set of types that it matches.
As you can see, there is overlap between our two impls, but neither is
a strict subset of the other:</p>
<pre tabindex="0"><code>+-----------------------------------------+
|[impl&lt;T:Copy&gt; Clone for T]               |
|                                         |
| Example: i32                            |
| +---------------------------------------+-----+
| |                                       |     |
| | Example: Option&lt;i32&gt;                  |     |
| |                                       |     |
+-+---------------------------------------+     |
  |                                             |
  |   Example: Option&lt;Box&lt;i32&gt;&gt;                 |
  |                                             |
  |          [impl&lt;T:Clone&gt; Clone for Option&lt;T&gt;]|
  +---------------------------------------------+
</code></pre><h3 id="enter-intersection-impls">Enter intersection impls</h3>
<p>One of the first ideas proposed for solving this is the so-called
&ldquo;lattice&rdquo; specialization rule, which I will call &ldquo;intersection&rdquo; impls,
since I think that captures the spirit better. The intuition is pretty
simple: if you have two impls that have a partial intersection, but
which don&rsquo;t strictly subset one another, then you can add a third impl
that covers <em>precisely</em> that intersection, and hence which subsets
both of them. So now, for any type, there is always a &ldquo;most specific&rdquo;
impl to choose. To get the idea, it may help to consider this &ldquo;ASCII
Art&rdquo; Venn diagram. Note the difference from above: there is now an
impl (indicated with <code>=</code> lines and <code>.</code> shading) covering precisely
the intersection of the other two.</p>
<pre tabindex="0"><code>+-----------------------------------------+
|[impl&lt;T:Copy&gt; Clone for T]               |
|                                         |
| Example: i32                            |
| +=======================================+-----+
| |[impl&lt;T:Copy&gt; Clone for Option&lt;T&gt;].....|     |
| |.......................................|     |
| |.Example: Option&lt;i32&gt;..................|     |
| |.......................................|     |
+-+=======================================+     |
  |                                             |
  |   Example: Option&lt;Box&lt;i32&gt;&gt;                 |
  |                                             |
  |          [impl&lt;T:Clone&gt; Clone for Option&lt;T&gt;]|
  +---------------------------------------------+
</code></pre><p>Intersection impls have some nice properties. For one thing, they&rsquo;re a
kind of minimal extension of the existing rule. In particular, if you
are just looking at any two impls, the rules for deciding which is
more specific are unchanged: the only difference when adding in
intersection impls is that coherence permits overlap when it otherwise
wouldn&rsquo;t.</p>
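<p>Here is a toy model of the two rules, just to make the set-based intuition executable. It is <em>not</em> how the compiler works: each impl is represented by a finite set of example type names (the ones from the table above), and the names <code>Coverage</code>, <code>accepted_by_subset_rule</code>, and <code>accepted_with_intersection_impls</code> are invented for this sketch.</p>

```rust
use std::collections::BTreeSet;

/// Toy stand-in for an impl: the finite set of example types it covers.
type Coverage = BTreeSet<&'static str>;

fn coverage(types: &[&'static str]) -> Coverage {
    types.iter().copied().collect()
}

fn strict_subset(a: &Coverage, b: &Coverage) -> bool {
    a != b && a.is_subset(b)
}

/// The pairwise check shared by both rules: disjoint, or one specializes the other.
fn pair_ok(a: &Coverage, b: &Coverage) -> bool {
    a.is_disjoint(b) || strict_subset(a, b) || strict_subset(b, a)
}

/// Runs `ok` on every unordered pair of impls.
fn all_pairs_ok(impls: &[Coverage], ok: impl Fn(&Coverage, &Coverage) -> bool) -> bool {
    for (i, a) in impls.iter().enumerate() {
        for b in &impls[i + 1..] {
            if !ok(a, b) {
                return false;
            }
        }
    }
    true
}

/// Subset-only rule: every overlapping pair must be related by strict subset.
fn accepted_by_subset_rule(impls: &[Coverage]) -> bool {
    all_pairs_ok(impls, pair_ok)
}

/// Intersection rule: an overlapping pair is also accepted if some impl
/// covers exactly the intersection of the two.
fn accepted_with_intersection_impls(impls: &[Coverage]) -> bool {
    all_pairs_ok(impls, |a, b| {
        let overlap: Coverage = a.intersection(b).copied().collect();
        pair_ok(a, b) || impls.iter().any(|c| *c == overlap)
    })
}

fn main() {
    let blanket = coverage(&["i32", "Option<i32>"]);
    let option = coverage(&["Option<i32>", "Option<Box<i32>>"]);
    let intersection = coverage(&["Option<i32>"]);

    // Two partially overlapping impls: rejected under either rule.
    let two = [blanket.clone(), option.clone()];
    assert!(!accepted_by_subset_rule(&two));
    assert!(!accepted_with_intersection_impls(&two));

    // Adding the intersection impl only helps under the intersection rule.
    let three = [blanket, option, intersection];
    assert!(!accepted_by_subset_rule(&three));
    assert!(accepted_with_intersection_impls(&three));
}
```

<p>Note that the pairwise check (<code>pair_ok</code>) is the same in both functions; the intersection rule only adds one more way for an overlapping pair to be accepted.</p>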
<p>They also give us a good opportunity to recover some
optimization. Consider the two impls in this case: the &ldquo;blanket&rdquo; impl
that applies to any <code>T: Copy</code> simply copies some bytes around, which
is very fast. The impl that is tailored to <code>Option&lt;T&gt;</code>, however, does
more work: it matches on the option and then recursively calls
<code>clone</code>. This work is necessary if <code>T: Copy</code> does not hold, but
otherwise it&rsquo;s wasted work.  With an intersection impl, we can recover
the full performance:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// intersection impl:
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">*</span><span class="bp">self</span><span class="w"> </span><span class="c1">// since T: Copy, we can do this here
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="a-note-on-compiler-messages">A note on compiler messages</h3>
<p>I&rsquo;m about to pivot and discuss the shortcomings of intersection
impls. But before I do so, I want to talk a bit about the compiler
messages here. I think that the core idea of specialization &ndash; that
you want to pick the impl that applies to the <strong>most specific</strong> set of
types &ndash; is fairly intuitive. But working it out in practice can be
kind of confusing, especially at first. So whenever we propose any
extension, we have to think carefully about the error messages that
might result.</p>
<p>In this particular case, I think that we could give a rather nice error
message. Imagine that the user had written these two impls:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Copy</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// impl A
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nb">Clone</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Clone</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// impl B
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">clone</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>As we&rsquo;ve seen, these two impls overlap but neither specializes the
other. One might imagine an error message that says as much, and
which also suggests the intersection impl that must be added:</p>
<pre tabindex="0"><code>error: two impls overlap, but neither specializes the other
  |
2 | impl&lt;T: Copy&gt; Clone for T {...}
  | ----
  |
4 | impl&lt;T: Clone&gt; Clone for Option&lt;T&gt; {...}
  |
  | note: both impls apply to a type like `Option&lt;T&gt;` where `T: Copy`;
  |       to specify the behavior in this case, add the following intersection impl:
  |       `impl&lt;T: Copy&gt; Clone for Option&lt;T&gt;`
</code></pre><p>Note the message at the end. The wording could no doubt be improved,
but the key point is that we should be able to actually tell you <strong>exactly
what impl is still needed</strong>.</p>
<h3 id="intersection-impls-do-not-solve-the-cross-crate-problem">Intersection impls do not solve the cross-crate problem</h3>
<p>Unfortunately, intersection impls don&rsquo;t give us the backwards
compatibility that we want, at least not by themselves. The problem
is, if we add the blanket impl, we <em>also</em> have to add the intersection
impl. <strong>Within the same crate, this might be ok. But if this means that
downstream crates have to add an intersection impl too, that&rsquo;s a big
problem.</strong></p>
<h3 id="intersection-impls-may-force-you-to-predict-the-future">Intersection impls may force you to predict the future</h3>
<p>There is one other problem with intersection impls that arises in
cross-crate situations, which
<a href="https://github.com/rust-lang/rust/issues/31844#issuecomment-247867693">nrc described on the tracking issue</a>: sometimes there is a
<em>theoretical</em> intersection between impls, but that intersection is
empty in practice, and hence you may not be able to write the code you
wanted to write. Let me give you an example. This problem doesn&rsquo;t show
up with the <code>Copy</code>/<code>Clone</code> traits, so we&rsquo;ll switch briefly to another
example.</p>
<p>Imagine that we are adding a <code>RichDisplay</code> trait to our project. This
is much like the existing <a href="https://doc.rust-lang.org/std/fmt/trait.Display.html"><code>Display</code></a> trait, except that it
can support richer formatting like ANSI codes or a GUI. For
convenience, we want any type that implements <code>Display</code> to also
implement <code>RichDisplay</code> (but without any fancy formatting). So we add
a trait and blanket impl like this one (let&rsquo;s call it impl A):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">RichDisplay</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* elided */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">D</span>: <span class="nc">Display</span><span class="o">&gt;</span><span class="w"> </span><span class="n">RichDisplay</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">D</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* elided */</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl A
</span></span></span></code></pre></div><p>Now, imagine that we are also using some other crate <code>widget</code> that
contains various types, including <code>Widget&lt;T&gt;</code>. This <code>Widget&lt;T&gt;</code> type
does not implement <code>Display</code>. But we would like to be able to render a
widget, so we implement <code>RichDisplay</code> for this <code>Widget&lt;T&gt;</code> type. Even
though we didn&rsquo;t define <code>Widget&lt;T&gt;</code>, we can implement a trait for it
because we defined the trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">RichDisplay</span><span class="o">&gt;</span><span class="w"> </span><span class="n">RichDisplay</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl B
</span></span></span></code></pre></div><p>Well, now we have a problem! You see, according to the rules from
<a href="https://github.com/rust-lang/rfcs/pull/1023">RFC 1023</a>, impls A and B are considered to <em>potentially</em> overlap,
and hence we will get an error. This might surprise you: after all,
impl A only applies to types that implement <code>Display</code>, and we said
that <code>Widget&lt;T&gt;</code> does not. The problem has to do with semver: because
<code>Widget&lt;T&gt;</code> was defined in another crate, it is outside of our
control. In this case, the other crate is allowed to implement
<code>Display</code> for <code>Widget&lt;T&gt;</code> at some later time, and that should not be a
breaking change. But imagine that this other crate added an impl like
this one (which we can call impl C):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Display</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Display</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl C
</span></span></span></code></pre></div><p>Such an impl would cause impls A and B to overlap. Therefore,
coherence considers these to be overlapping &ndash; however, specialization
does not consider impl B to be a specialization of impl A, because, at
the moment, there is no subset relationship between them. <strong>So there
is a kind of catch-22 here: because the impl may exist in the future,
we can&rsquo;t consider the two impls disjoint, but because it doesn&rsquo;t exist
right now, we can&rsquo;t consider them to be specializations.</strong></p>
<p>Clearly, intersection impls don&rsquo;t help to address this issue, as the
set of intersecting types is empty. You might imagine having some
alternative extension to coherence that permits impl B on the logic of
&ldquo;if impl C were added in the future, that&rsquo;d be fine, because impl B
would be a specialization of impl A&rdquo;.</p>
<p>This logic is pretty dubious, though! For example, impl C might have
been written another way (we&rsquo;ll call this alternative version of impl C &ldquo;impl C2&rdquo;):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">WidgetDisplay</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Display</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl C2
</span></span></span><span class="line"><span class="cl"><span class="c1">//   ^^^^^^^^^^^^^^^^ changed this bound
</span></span></span></code></pre></div><p>Note that instead of working for any <code>T: Display</code>, there is now some
other trait <code>T: WidgetDisplay</code> in use. Let&rsquo;s say it&rsquo;s only implemented
for optional 32-bit integers right now (for some reason or another):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">WidgetDisplay</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">WidgetDisplay</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So now if we had impls A, B, and C2, we would have a different
problem. Now impls A and B would overlap for <code>Widget&lt;Option&lt;i32&gt;&gt;</code>,
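but they would not overlap for <code>Widget&lt;String&gt;</code>. The reason here is
that <code>Option&lt;i32&gt;: WidgetDisplay</code>, so <code>Widget&lt;Option&lt;i32&gt;&gt;: Display</code>
and hence impl A applies to it. <code>String</code>, on the other hand, does not
implement <code>WidgetDisplay</code>, so impl A does not apply to <code>Widget&lt;String&gt;</code>;
only impl B does, since <code>String: RichDisplay</code> (because <code>String: Display</code>).
Now we are back in the territory where intersection impls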
come into play. So, again, <strong>if we had impls A, B, and C2</strong>, one could
imagine writing an intersection impl to cover this situation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">RichDisplay</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">WidgetDisplay</span><span class="o">&gt;</span><span class="w"> </span><span class="n">RichDisplay</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Widget</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// impl D
</span></span></span></code></pre></div><p>But, of course, <strong>impl C2 has yet to be written</strong>, so we can&rsquo;t really
write this intersection impl <strong>now</strong>, in advance. We have to wait
until the conflict arises before we can write it.</p>
<p>You may have noticed that I was careful to specify that both the
<code>Display</code> trait and <code>Widget</code> type were defined outside of the current
crate. This is because <a href="https://github.com/rust-lang/rfcs/pull/1023">RFC 1023</a> permits the use of &ldquo;negative
reasoning&rdquo; <strong>if either the trait or the type is under local
control</strong>. That is, if the <code>RichDisplay</code> trait and the <code>Widget</code> type were
defined in the <em>same</em> crate, then impls A and B could co-exist,
because we are allowed to rely on the fact that <code>Widget</code> does not
implement <code>Display</code>. The idea here is that the only way that <code>Widget</code>
could implement <code>Display</code> is if I modify the crate where <code>Widget</code> is
defined, and once I am modifying things, I can also make any other
repairs (such as adding an intersection impl) that are necessary.</p>
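<p>For comparison, here is a small sketch of the all-local case, where I believe today&rsquo;s stable compiler accepts both impls. Everything here lives in one crate; the <code>rich</code> method and the tuple-struct layout of <code>Widget</code> are assumptions made for the example, since the trait body was elided above.</p>

```rust
use std::fmt::Display;

trait RichDisplay {
    /// Hypothetical method; the real trait body is elided in the text.
    fn rich(&self) -> String;
}

// impl A: anything that is `Display` gets a plain rendering.
impl<D: Display> RichDisplay for D {
    fn rich(&self) -> String {
        self.to_string()
    }
}

/// A local `Widget` type that deliberately does not implement `Display`.
struct Widget<T>(T);

// impl B: accepted alongside impl A because `Widget` is local, so coherence
// may rely on the fact that `Widget<T>: Display` does not hold.
impl<T: RichDisplay> RichDisplay for Widget<T> {
    fn rich(&self) -> String {
        format!("[widget: {}]", self.0.rich())
    }
}

fn main() {
    assert_eq!(7_i32.rich(), "7"); // via impl A
    assert_eq!(Widget(7_i32).rich(), "[widget: 7]"); // via impl B
    assert_eq!(Widget(Widget(7_i32)).rich(), "[widget: [widget: 7]]");
}
```

<p>If I later added <code>impl&lt;T: Display&gt; Display for Widget&lt;T&gt;</code> to this crate, impls A and B would start to overlap, and I would be the one who has to repair that, which is exactly the bargain described above.</p>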
<h3 id="conclusion">Conclusion</h3>
<p>Today we looked at a particular potential use for specialization:
adding a blanket impl that implements <code>Clone</code> for any <code>Copy</code> type. We
saw that the current &ldquo;subset-only&rdquo; logic for specialization isn&rsquo;t
enough to permit adding such an impl. We then looked at one proposed
fix for this, intersection impls (often called lattice
impls).</p>
<p>Intersection impls are appealing because they increase expressiveness
while keeping the general feel of the &ldquo;subset-only&rdquo; logic. They also
have an &ldquo;explicit&rdquo; nature that appeals to me, at least in
principle. That is, if you have two impls that partially overlap, the
compiler doesn&rsquo;t select which one should win: instead, you write an
impl to cover precisely that intersection, and hence specify it
yourself. Of course, that explicit nature can also be verbose and
irritating sometimes, particularly since you will often want the
&ldquo;intersection impl&rdquo; to behave the same as one of the other two (rather
than doing some third, different thing).</p>
<p>Moreover, the explicit nature of intersection impls causes problems
across crates:</p>
<ul>
<li>they don&rsquo;t allow you to add a blanket impl in a backwards compatible
fashion;</li>
<li>they interact poorly with semver, and specifically the limitations
on negative logic imposed by <a href="https://github.com/rust-lang/rfcs/pull/1023">RFC 1023</a>.</li>
</ul>
<p>My conclusion then is that intersection impls may well be <em>part</em> of
the solution we want, but we will need additional mechanisms. Stay
tuned for additional posts.</p>
<h3 id="a-note-on-comments">A note on comments</h3>
<p>As is my wont, I am going to close this post for comments. If you
would like to leave a comment, please go to this
<a href="https://internals.rust-lang.org/t/blog-post-intersection-impls/4129">thread on Rust&rsquo;s internals forum</a> instead.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/specialization" term="specialization" label="Specialization"/><category scheme="https://smallcultfollowing.com/babysteps/categories/traits" term="traits" label="Traits"/></entry><entry><title type="html">Thoughts on trusting types and unsafe code</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/09/12/thoughts-on-trusting-types-and-unsafe-code/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/09/12/thoughts-on-trusting-types-and-unsafe-code/</id><published>2016-09-12T00:00:00+00:00</published><updated>2016-09-12T05:39:52-04:00</updated><content type="html"><![CDATA[<p>I&rsquo;ve been thinking about the unsafe code guidelines a lot in the back
of my mind. In particular, I&rsquo;ve been trying to think through what it
means to &ldquo;trust types&rdquo; &ndash; if you recall from the
<a href="http://smallcultfollowing.com/babysteps/blog/2016/05/27/the-tootsie-pop-model-for-unsafe-code/">Tootsie Pop Model</a> (TPM) blog post, one of the <em>key</em> examples
that I was wrestling with was the <code>RefCell-Ref</code> example. I want to
revisit a variation on that example now, but from a different
angle. (This by the way is one of those &ldquo;Niko thinks out loud&rdquo; blog
posts, not one of those &ldquo;Niko writes up a proposal&rdquo; blog posts.)</p>
<!-- more -->
<h4 id="setup">Setup</h4>
<p>Let&rsquo;s start with a little safe function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">collaborator</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">l</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The question is, should the compiler ever be able to optimize this
function as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">collaborator</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="o">*</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>By moving the load from <code>v</code> after the call to <code>collaborator()</code>, we
avoided the need for a temporary variable. This might reduce stack
size or register pressure. It is also an example of the kind of
optimizations we are considering doing for MIR (you can think of it as
an aggressive form of copy-propagation). <strong>In case it&rsquo;s not clear, I
really want the answer to this question to be yes &ndash; at least most of the
time.</strong> More specifically, I am interested in examining when we can do
this <strong>without doing any interprocedural analysis</strong>.</p>
<p>Now, the question of &ldquo;is this legal?&rdquo; is not necessarily a yes or no
question. For example, the Tootsie Pop Model answer was &ldquo;it
depends&rdquo;. In a safe code context, this transformation was legal. In an
unsafe context, it was not.</p>
<h4 id="what-could-go-wrong">What could go wrong?</h4>
<p>The concern here is that the function <code>collaborator()</code> might invalidate <code>*v</code> in
some way.  There are two ways that this could potentially happen:</p>
<ul>
<li>unsafe code could mutate <code>*v</code>,</li>
<li>unsafe code could invalidate the memory that <code>v</code> refers to.</li>
</ul>
<p>Here is some unsafe code that does the first thing:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">static</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span>: <span class="kt">usize</span> <span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">instigator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">patsy</span><span class="p">(</span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;</span><span class="n">data</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">collaborator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here is some unsafe code that invalidates <code>*v</code> using an option (you
can also write code that makes it get freed, of course). Here, when we
start, <code>data</code> is <code>Some(22)</code>, and we take a reference to that <code>22</code>. But
then <code>collaborator()</code> reassigns data to <code>None</code>, and hence the memory
that we were referring to is now uninitialized.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">static</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="mi">22</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">instigator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">patsy</span><span class="p">(</span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">as_ref</span><span class="p">().</span><span class="n">unwrap</span><span class="p">()</span><span class="w"> </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">collaborator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">None</span><span class="p">;</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So, when we ask whether it is legal to optimize <code>patsy</code> by moving the <code>*v</code>
load after the call to <code>collaborator()</code>, our answer affects whether
this unsafe code is legal.</p>
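<p>To make the stakes concrete, here is a minimal sketch of how the two orderings of the load can be told apart. Note the assumptions: it swaps the <code>static mut</code> for an <code>AtomicUsize</code>, and the hypothetical functions <code>patsy_load_first</code> and <code>patsy_load_last</code> take <code>&amp;AtomicUsize</code> rather than <code>&amp;usize</code>, so that the mutation is unambiguously permitted and no unsafe code is needed. It only illustrates the observable difference; it says nothing about what the rules for <code>&amp;usize</code> ought to be.</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Stand-in for the post's `static mut data` (assumption: an atomic is used
// here so the demo stays entirely in safe code).
static DATA: AtomicUsize = AtomicUsize::new(0);

fn collaborator() {
    DATA.store(1, Ordering::SeqCst);
}

// The load happens before the call, as in the original `patsy`.
fn patsy_load_first(v: &AtomicUsize) -> usize {
    let l = v.load(Ordering::SeqCst);
    collaborator();
    l
}

// The load happens after the call, as in the "optimized" `patsy`.
fn patsy_load_last(v: &AtomicUsize) -> usize {
    collaborator();
    v.load(Ordering::SeqCst)
}

// Runs both orderings from the same starting state and returns both results.
fn run_both() -> (usize, usize) {
    DATA.store(0, Ordering::SeqCst);
    let first = patsy_load_first(&DATA);
    DATA.store(0, Ordering::SeqCst);
    let last = patsy_load_last(&DATA);
    (first, last)
}
```

<p>Calling <code>run_both()</code> returns <code>(0, 1)</code>: the two orderings disagree as soon as <code>collaborator()</code> is allowed to write to the data behind <code>v</code>.</p>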
<h4 id="the-tootsie-pop-model">The Tootsie Pop Model</h4>
<p>Just for fun, let&rsquo;s look at how this plays out in the Tootsie Pop
model (TPM). As I wrote before, whether this code is legal will
ultimately depend on whether <code>patsy</code> is located in an unsafe
context. The way I described the model, unsafe contexts are tied to
modules, so I&rsquo;ll stick with that, but there might also be other ways
of defining what an unsafe context is.</p>
<p>First let&rsquo;s imagine that all three functions are in the same module:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">mod</span> <span class="nn">foo</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">static</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="mi">22</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">instigator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">patsy</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="o">..</span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">fn</span> <span class="nf">collaborator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here, because <code>instigator</code> and <code>collaborator</code> contain unsafe blocks,
the module <code>foo</code> is considered to be an unsafe context, and thus
<code>patsy</code> is also located within the unsafe context. This means that the
unsafe code would be legal and the optimization would not. This is
because the TPM does not allow us to &ldquo;trust types&rdquo; within an unsafe
context.</p>
<p><strong>However,</strong> it&rsquo;s worth pointing out one other interesting
detail. Just because the TPM does not authorize the
optimization, that doesn&rsquo;t mean that it could not be performed. It
just means that to perform the optimization would require detailed
interprocedural alias analysis. That is, a highly optimizing compiler
might analyze <code>instigator</code>, <code>patsy</code>, and <code>collaborator</code> and determine
whether or not the writes in <code>collaborator</code> can affect <code>patsy</code> (of
course here they can, but in more reasonable code they likely would
not). Put another way, the TPM basically tells you &ldquo;here are
optimizations you can do without doing anything sophisticated&rdquo;; it
doesn&rsquo;t put an upper limit on what you can do given sufficient extra
analysis.</p>
<p>OK, so now here is another recasting where the functions are spread between
modules:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">mod</span> <span class="nn">foo</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">use</span><span class="w"> </span><span class="n">bar</span>::<span class="n">patsy</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">static</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="kt">usize</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="mi">22</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">instigator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">collaborator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="o">..</span><span class="p">.}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">mod</span> <span class="nn">bar</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">use</span><span class="w"> </span><span class="n">foo</span>::<span class="n">collaborator</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">patsy</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="o">..</span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In this case, the module <code>bar</code> does not contain <code>unsafe</code> blocks, and
hence it is not an unsafe context. That means that we <strong>can</strong> optimize
<code>patsy</code>. It <strong>also means</strong> that <code>instigator</code> is illegal:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">instigator</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">patsy</span><span class="p">(</span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;</span><span class="n">data</span><span class="w"> </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The problem here is that <code>instigator</code> is calling <code>patsy</code>, which is
defined in a safe context (and hence must also be a safe
function). That implies that <code>instigator</code> must fulfill all of Rust&rsquo;s
basic permissions for the arguments that <code>patsy</code> expects. In this
case, the argument is a <code>&amp;usize</code>, which means that the <code>usize</code> must be
accessible <strong>and</strong> immutable for the entire lifetime of the reference;
that lifetime encloses the call to <code>patsy</code>. And yet the data in
question <strong>can</strong> be mutated (by <code>collaborator</code>). So <code>instigator</code> is
failing to live up to its obligations.</p>
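<p>For contrast, here is a sketch of one hypothetical way <code>instigator</code> could live up to that obligation: copy the shared value into a local and hand <code>patsy</code> a reference to the copy. The names <code>patsy_load_first</code> and <code>patsy_load_last</code> and the use of an <code>AtomicUsize</code> for the shared data are assumptions made for illustration. Since nothing can mutate the local during the call, moving the load becomes unobservable.</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Shared mutable state (assumption: an atomic replaces the post's `static mut`).
static DATA: AtomicUsize = AtomicUsize::new(0);

fn collaborator() {
    DATA.store(1, Ordering::SeqCst);
}

// Original shape of `patsy`: load, then call.
fn patsy_load_first(v: &usize) -> usize {
    let l = *v;
    collaborator();
    l
}

// Reordered shape of `patsy`: call, then load.
fn patsy_load_last(v: &usize) -> usize {
    collaborator();
    *v
}

// A well-behaved instigator: `snapshot` is a private local, so the `&usize`
// passed to `patsy` really is immutable for the whole call.
fn instigator(patsy: fn(&usize) -> usize) -> usize {
    DATA.store(0, Ordering::SeqCst);
    let snapshot: usize = DATA.load(Ordering::SeqCst);
    patsy(&snapshot)
}
```

<p>Here <code>instigator(patsy_load_first)</code> and <code>instigator(patsy_load_last)</code> both return <code>0</code>, even though <code>collaborator()</code> still writes to the shared data.</p>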
<p>TPM has interesting implications for the Rust optimizer. Basically,
whether or not a given statement can &ldquo;trust&rdquo; the types of its
arguments ultimately depends on where it appeared in the original
source. This means we have to track some info when inlining unsafe
code into safe code (or else &rsquo;taint&rsquo; the safe code in some way). This
is not unique to TPM, though: Similar capabilities seem to be required
for handling e.g. the C99 <code>restrict</code> keyword, and we&rsquo;ll see that they
are also important when trusting types.</p>
<h4 id="what-if-we-fully-trusted-types-everywhere">What if we fully trusted types everywhere?</h4>
<p>Of course, the TPM has the downside that it hinders optimization in the
<a href="http://smallcultfollowing.com/babysteps/blog/2016/08/18/tootsie-pop-followup/">unchecked-get</a> use case. I&rsquo;ve been pondering various ways to address
that. One thing that I find intuitively appealing is the idea of
trusting Rust types everywhere. For example, the idea might be that
<strong>whenever</strong> you create a shared reference like <code>&amp;usize</code>, you must
ensure that its associated permissions hold. If we took this approach,
then we could perform the optimization on <code>patsy</code>, and we could say
that <code>instigator</code> is illegal, for the same reasons that it was illegal
under TPM when <code>patsy</code> was in a distinct module.</p>
<p><strong>However, trusting types everywhere &ndash; even in unsafe code &ndash;
potentially interacts in a rather nasty way with lifetime inference.</strong>
Here is another example function to consider, <code>alloc_free</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">alloc_free</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// allocates and initializes an integer
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">allocate_an_integer</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// create a safe reference to `*p` and read from it
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;*</span><span class="n">p</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">q</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// free `p`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">free</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// use the value we loaded
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">use</span><span class="p">(</span><span class="n">r</span><span class="p">);</span><span class="w"> </span><span class="c1">// but could we move the load down to here?
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What is happening here is that we allocate some memory containing an
integer, create a reference that refers to it, read from that
reference, and then free the original memory. We then use the value
that we read from the reference. The question is: can the compiler
&ldquo;copy-propagate&rdquo; that read down to the call to <code>use()</code>?</p>
<p>If this were C code, the answer would pretty clearly be <strong>no</strong> (I
presume, anyway). The compiler would see that <code>free(p)</code> may invalidate
<code>q</code> and hence it acts as a kind of barrier.</p>
<p>But if we were to go &ldquo;all in&rdquo; on trusting Rust types, the answer would
be (<a href="http://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/">at least currently</a>) <strong>yes</strong>. Remember that the purpose of this
model is to let us do optimizations <strong>without</strong> doing fancy
analysis. Here what happens is that we create a reference <code>q</code> whose
lifetime will stretch from the point of creation until the end of its
scope:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">alloc_free</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">allocate_an_integer</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;*</span><span class="n">p</span><span class="p">;</span><span class="w"> </span><span class="c1">// --+ lifetime of the reference
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">q</span><span class="p">;</span><span class="w">        </span><span class="c1">//   | as defined today
</span></span></span><span class="line"><span class="cl"><span class="w">                           </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">free</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">           </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">                           </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">use</span><span class="p">(</span><span class="n">r</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If this seems like a bad idea, it is. The idea that writing unsafe
Rust code might be <strong>even more subtle</strong> than writing C seems like a
non-starter to me. =)</p>
<p>Now, you might be tempted to think that this problem is an artifact of
how Rust lifetimes are currently tied to scoping. After all, <code>q</code> is
not used after the <code>let r = *q</code> statement, and if we adopted the
<a href="http://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/">non-lexical lifetimes</a> approach, that would mean the lifetime
would end there. But really this problem could still occur in a
NLL-based system, though you have to work a bit harder:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">alloc_free2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">allocate_an_integer</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;*</span><span class="n">p</span><span class="p">;</span><span class="w"> </span><span class="c1">// --------+
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">q</span><span class="p">;</span><span class="w">            </span><span class="c1">//     |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">condition1</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">      </span><span class="c1">//     |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">free</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">           </span><span class="c1">//     |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">                      </span><span class="c1">//     |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">condition2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">      </span><span class="c1">//     |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">use</span><span class="p">(</span><span class="n">r</span><span class="p">);</span><span class="w">            </span><span class="c1">//     |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="n">condition3</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">  </span><span class="c1">//     |
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">use_again</span><span class="p">(</span><span class="o">*</span><span class="n">q</span><span class="p">);</span><span class="w"> </span><span class="c1">// &lt;---+
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here the problem is that, from the compiler&rsquo;s point of view, the
reference <code>q</code> is live at the point where we call <code>free</code>. This is
because it looks like we might need it to call <code>use_again</code>.  But in
fact the <em>programmer</em> knows that <code>condition1()</code> and <code>condition3()</code> are
mutually exclusive, and so she may reason that the lifetime of <code>q</code>
ends earlier when <code>condition1()</code> holds than when it doesn&rsquo;t.</p>
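<p>The programmer&rsquo;s reasoning can be made concrete with a small runnable variant. This is only a sketch under stated assumptions: <code>allocate_an_integer</code> and <code>free</code> are hypothetical helpers built on <code>Box::into_raw</code> and <code>Box::from_raw</code>, a single <code>free_early</code> flag plays the role of <code>condition1()</code> with its negation standing in for <code>condition3()</code>, and the function returns a sum instead of calling <code>use</code>. On every path, <code>q</code> is never dereferenced after the memory is freed, yet <code>q</code> is still in scope at the early <code>free</code>.</p>

```rust
// Hypothetical stand-in for the post's allocator: a heap-allocated integer.
fn allocate_an_integer() -> *mut i32 {
    Box::into_raw(Box::new(22))
}

// Hypothetical stand-in for `free`. The caller must pass a pointer obtained
// from `allocate_an_integer` that has not been freed yet.
unsafe fn free(p: *mut i32) {
    drop(unsafe { Box::from_raw(p) });
}

fn alloc_free2(free_early: bool) -> i32 {
    unsafe {
        let p: *mut i32 = allocate_an_integer();
        let q: &i32 = &*p;
        let r = *q;
        let mut total = 0;

        if free_early {
            // condition1: the memory goes away while `q` is still in scope.
            free(p);
        }

        total += r; // corresponds to use(r)

        if !free_early {
            // condition3: mutually exclusive with condition1, so `*q` is
            // only read while the allocation is still alive.
            total += *q; // corresponds to use_again(*q)
            free(p);
        }

        total
    }
}
```

<p>With these stand-ins, <code>alloc_free2(true)</code> returns <code>22</code> and <code>alloc_free2(false)</code> returns <code>44</code>. Whether a model of unsafe code should bless the early-free path is exactly the question under discussion here.</p>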
<p>So I think it seems clear from these examples that we can&rsquo;t really
fully trust types everywhere.</p>
<h4 id="trust-types-not-lifetimes">Trust types, not lifetimes?</h4>
<p><strong>I think that whatever guidelines we wind up with, we will not be
able to fully trust lifetimes, at least not around unsafe code.</strong> We
have to assume that memory may be invalidated early. Put another way,
the validity of some unsafe code ought not to be determined by the
results of lifetime inference, since mere mortals (including its
authors) cannot always predict what it will do.</p>
<p>But there is a more subtle reason that we should not &ldquo;trust
lifetimes&rdquo;. <strong>The Rust type system is a conservative analysis that
guarantees safety &ndash; but there are many notions of a reference&rsquo;s
&ldquo;lifetime&rdquo; that go beyond its capabilities.</strong> We saw this in the
previous section: today we have lexical lifetimes. Tomorrow we may
have non-lexical lifetimes. But humans can go beyond that and think
about conditional control-flow and other factors that the compiler is
not aware of. We should not expect humans to limit themselves to what
the Rust type system can express when writing unsafe code!</p>
<p>The idea here is that lifetimes are <em>sometimes</em> significant to the
model &ndash; in particular, in safe code, the compiler&rsquo;s lifetimes can be
used to aid optimization. But in unsafe code, we are required to
assume that the user gets to pick the lifetimes for each reference,
but those choices must still be valid choices that would type check. I
think that in practice this would roughly amount to &ldquo;trust lifetimes
in safe contexts, but not in unsafe contexts&rdquo;.</p>
<h4 id="impact-of-ignoring-lifetimes-altogether">Impact of ignoring lifetimes altogether</h4>
<p>This implies that the compiler will have to use the loads that the
user wrote to guide it. For example, you might imagine that the
compiler can move a load from <code>x</code> down in the control-flow graph,
<strong>but only if it can see that <code>x</code> was going to be loaded anyway</strong>. So
if you consider this variant of <code>alloc_free</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">alloc_free3</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">allocate_an_integer</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;*</span><span class="n">p</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">q</span><span class="p">;</span><span class="w"> </span><span class="c1">// load but do not use
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">free</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">use</span><span class="p">(</span><span class="o">*</span><span class="n">q</span><span class="p">);</span><span class="w"> </span><span class="c1">// not `use(r)` but `use(*q)` instead
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here we can choose to either eliminate the first load (<code>let r = *q</code>)
or else replace <code>use(*q)</code> with <code>use(r)</code>. Either is ok: we have
evidence that the <em>user</em> believes the lifetime of <code>q</code> to enclose
<code>free</code>. (The fact that it doesn&rsquo;t is their fault.)</p>
<p>But now let&rsquo;s return to our <code>patsy()</code> function. Can we still optimize
that?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">collaborator</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">l</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we are just ignoring the lifetime of <code>v</code>, then we can&rsquo;t &ndash; at least
not on the basis of the type of <code>v</code>. For all we know, the user
considers the lifetime of <code>v</code> to end right after <code>let l = *v</code>. That&rsquo;s
not so unreasonable as it might sound; after all, the code looks to
have been deliberately written to load <code>*v</code> early. And after all, we
are trying to enable more advanced notions of lifetimes than those
that the Rust type system supports today.</p>
<p>It&rsquo;s interesting that if we inlined <code>patsy</code> into its caller, we might
learn new information about its arguments that lets us optimize more
aggressively. For example, imagine a (benevolent, this time) caller
like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">kindly_fn</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">patsy</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="o">*</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we inlined <code>patsy</code> into <code>kindly_fn</code>, we get this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">kindly_fn</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">collaborator</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">use</span><span class="p">(</span><span class="n">l</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="o">*</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here we can see that <code>*x</code> must be valid after <code>collaborator()</code>, and so
we can optimize the function as follows (we are moving the load of
<code>*x</code> down, and then applying CSE to eliminate the double load):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">kindly_fn</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="mi">1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">collaborator</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">x</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">use</span><span class="p">(</span><span class="n">l</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">l</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>There is a certain appeal to &ldquo;trust types, not lifetimes&rdquo;, but
ultimately I think it is not living up to Rust&rsquo;s potential</strong>: as you
can see above, we will still be fairly reliant on inlining to recover
needed context for optimizing. Given that the vast majority of Rust is
safe code, where these sorts of operations are harmless, this seems
like a shame.</p>
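<p>As a rough illustration of the reordering described above, here is a safe-Rust sketch that runs the inlined and the optimized versions of <code>kindly_fn</code> and compares what each one does. The logging helpers (<code>record</code>, <code>trace</code>) and the name <code>use_value</code> (standing in for <code>use</code>, which is a keyword in real Rust) are assumptions made for this sketch, not part of the original example.</p>

```rust
use std::cell::RefCell;

thread_local! {
    // Records every observable action so the two variants can be compared.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

// Hypothetical helper: appends one event to the log.
fn record(event: String) {
    LOG.with(|log| log.borrow_mut().push(event));
}

// Hypothetical stand-in for the opaque function in the post.
fn collaborator() {
    record("collaborator".to_string());
}

// Hypothetical stand-in for `use(..)`.
fn use_value(v: usize) {
    record(format!("use({})", v));
}

// `patsy` inlined as written: load, call, use.
fn kindly_fn_inlined() {
    let x: &usize = &1;
    {
        let l = *x;
        collaborator();
        use_value(l);
    }
    use_value(*x);
}

// After sinking the load below `collaborator()` and merging the two loads.
fn kindly_fn_optimized() {
    let x: &usize = &1;
    collaborator();
    let l = *x;
    use_value(l);
    use_value(l);
}

// Runs `f` on a fresh log and returns the events it recorded.
fn trace(f: fn()) -> Vec<String> {
    LOG.with(|log| log.borrow_mut().clear());
    f();
    LOG.with(|log| log.borrow().clone())
}

fn main() {
    assert_eq!(trace(kindly_fn_inlined), trace(kindly_fn_optimized));
    println!("{:?}", trace(kindly_fn_optimized));
}
```

<p>Both variants record the same sequence of events. That is what makes the reordering acceptable in this sketch: nothing that <code>collaborator()</code> does here can affect <code>*x</code>.</p>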
<h4 id="trust-lifetimes-only-in-safe-code">Trust lifetimes only in safe code?</h4>
<p>An alternative to the TPM is the
<a href="https://github.com/nikomatsakis/rust-memory-model/issues/26">&ldquo;Asserting-Conflicting Access&rdquo; model</a> (ACA), which was proposed
by arielb1 and ubsan. I don&rsquo;t claim to be precisely representing their
model here: I&rsquo;m trying to (somewhat separately) work through those
rules and apply them formally. So what I write here is more &ldquo;inspired
by&rdquo; those rules than reflective of it.</p>
<p>That caveat aside, the idea in their model is that lifetimes are
significant to the model, but you can&rsquo;t trust the compiler&rsquo;s inference
in unsafe code. There, we have to assume that the unsafe code author
is free to pick any valid lifetime, so long as it would still <em>type
check</em> (not &ldquo;borrow check&rdquo; &ndash; i.e., it only has to ensure that no data
outlives its owning scope). <strong>Note the similarities to the Tootsie Pop
Model here &ndash; we still need to define what an &ldquo;unsafe context&rdquo; is, and
when we enter such a context, the compiler will be less aggressive in
optimizing (though more aggressive than in the TPM).</strong> (This has
implications for the <a href="http://smallcultfollowing.com/babysteps/blog/2016/08/18/tootsie-pop-followup/">unchecked-get</a> example.)</p>
<p>Nonetheless, I have concerns about this formulation because it seems
to assume that the logic for unsafe code <em>can</em> be expressed in terms
of Rust&rsquo;s lifetimes &ndash; but as I wrote above Rust&rsquo;s lifetimes are
really a conservative approximation. As we improve our type system,
they can change and become more precise &ndash; and users might have in
mind more precise and flow-dependent lifetimes still. In particular,
it seems like the &ldquo;ACA&rdquo; would disallow my <code>alloc_free2</code> example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">alloc_free2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">allocate_an_integer</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="kp">&amp;</span><span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="o">&amp;*</span><span class="n">p</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">r</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">q</span><span class="p">;</span><span class="w"> </span><span class="c1">// (1)
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">condition1</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">free</span><span class="p">(</span><span class="n">p</span><span class="p">);</span><span class="w"> </span><span class="c1">// (2)
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">condition2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">use</span><span class="p">(</span><span class="n">r</span><span class="p">);</span><span class="w"> </span><span class="c1">// (3)
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="n">condition3</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">use_again</span><span class="p">(</span><span class="o">*</span><span class="n">q</span><span class="p">);</span><span class="w"> </span><span class="c1">// (4)
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Intuitively, the problem is that the lifetime of <code>q</code> must enclose the
points (1), (2), (3), and (4) that are commented above. But the user
knows that <code>condition1()</code> and <code>condition3()</code> are mutually exclusive,
so in their mind, the lifetime ends when we reach point (2),
since they know that this means that point (4) is unreachable.</p>
<p>In terms of their model, the <em>conflicting access</em> would be (2) and the
<em>asserting access</em> would be (1). But I might be misunderstanding how
this whole thing works.</p>
<h4 id="trust-lifetimes-at-safe-fn-boundaries">Trust lifetimes at safe fn boundaries</h4>
<p>Nonetheless, perhaps we can do something <em>similar</em> to the ACA model
and say that: we can trust lifetimes in &ldquo;safe code&rdquo; but totally
disregard them in &ldquo;unsafe code&rdquo; (however we define that). If we
adopted these definitions, would that allow us to optimize <code>patsy()</code>?</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="o">&gt;</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="na">&#39;a</span> <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">collaborator</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">l</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Presuming <code>patsy()</code> is considered to be &ldquo;safe code&rdquo;, then the answer is
yes. This in turn implies that any unsafe callers are obligated to
consider <code>patsy()</code> as a &ldquo;black box&rdquo; in terms of what it might do with <code>'a</code>.</p>
<p>This flows quite naturally from a &ldquo;permissions&rdquo; perspective &mdash; giving
a reference to a safe fn implies giving it permission to use that
reference <em>any time during its execution</em>. I have been (separately)
trying to elaborate this notion, but it&rsquo;ll have to wait for a separate post.</p>
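<p>To make the caller&rsquo;s obligation concrete, here is a minimal sketch of an unsafe caller that treats <code>patsy()</code> as a black box. It assumes that <code>patsy</code> returns the loaded value instead of calling <code>use</code>, and that <code>collaborator</code> does nothing; <code>unsafe_caller</code> is a hypothetical name introduced for this example.</p>

```rust
// Hypothetical stand-in for the opaque function in the post.
fn collaborator() {}

// The safe callee: it may read `*v` at any point before it returns.
fn patsy<'a>(v: &'a usize) -> usize {
    let l = *v;
    collaborator();
    l
}

// An unsafe caller that respects the permission it hands out.
fn unsafe_caller() -> usize {
    // Allocate an integer and keep only a raw pointer to it.
    let p: *mut usize = Box::into_raw(Box::new(22));
    unsafe {
        // Lending `&*p` to a safe fn permits it to read `*p` at any time
        // during the call, so the allocation must stay alive until
        // `patsy` returns.
        let result = patsy(&*p);
        // Only after the call has returned is the memory freed.
        drop(Box::from_raw(p));
        result
    }
}

fn main() {
    assert_eq!(unsafe_caller(), 22);
}
```

<p>Freeing <code>p</code> from inside <code>collaborator()</code> instead would break that promise, even though the body of <code>patsy</code> as written happens to load <code>*v</code> before making the call.</p>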
<h3 id="conclusion">Conclusion</h3>
<p><strong>One takeaway from this meandering walk is that, if we want to make
it easy to optimize Rust code aggressively, there <em>is</em> something
special about the fn boundary.</strong> In retrospect, this is really not
that surprising: we are trying to enable intraprocedural optimization,
and hence the fn boundary is the boundary beyond which we cannot
analyze &ndash; within the fn body we can see more.</p>
<p>Put another way, if we want to optimize <code>patsy()</code> without doing any
interprocedural analysis, it seems clear that we <em>need</em> the caller to
guarantee that <code>v</code> will be valid for the entire call to <code>patsy</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">patsy</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">collaborator</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">l</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I think this is an interesting conclusion, even if I&rsquo;m not quite sure
where it leads yet.</p>
<p><strong>Another takeaway is that we have to be very careful trusting
lifetimes around unsafe code.</strong> Lifetimes of references are a tool
designed for use by the borrow checker: we should not use them to
limit the clever things that unsafe code authors can do.</p>
<h3 id="note-on-comments">Note on comments</h3>
<p>Comments are closed on this post. Please post any questions or
comments on <a href="https://internals.rust-lang.org/t/blog-post-thoughts-on-trusting-types-and-unsafe-code/4059">the internals thread</a> I&rsquo;m about to start. =)</p>
<p>Also, I&rsquo;m collecting unsafe-related posts into the <a href="http://smallcultfollowing.com/babysteps/blog/categories/unsafe/">unsafe category</a>.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/unsafe" term="unsafe" label="Unsafe"/></entry><entry><title type="html">'Tootsie Pop' Followup</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/08/18/tootsie-pop-followup/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/08/18/tootsie-pop-followup/</id><published>2016-08-18T00:00:00+00:00</published><updated>2016-08-18T09:17:46-04:00</updated><content type="html"><![CDATA[<p>A little while back, I wrote up a tentative proposal I called the
<a href="http://smallcultfollowing.com/babysteps/blog/2016/05/27/the-tootsie-pop-model-for-unsafe-code/">&ldquo;Tootsie Pop&rdquo; model for unsafe code</a>. It&rsquo;s safe to say that this
model was not universally popular. =) There was quite a
<a href="http://internals.rust-lang.org/t/tootsie-pop-model-for-unsafe-code/3522/">long and fruitful discussion</a> on discuss. I wanted to write a
quick post summarizing my main take-away from that discussion and to
talk a bit about the plans to push the unsafe discussion forward.</p>
<!-- more --> 
<h3 id="the-importance-of-the-unchecked-get-use-case">The importance of the unchecked-get use case</h3>
<p>For me, the most important lesson was the importance of the &ldquo;unchecked
get&rdquo; use case. Here the idea is that you have some (safe) code which
is indexing into a vector:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">vec</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="o">..</span><span class="p">.];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">vec</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>You have found (by profiling, but of course) that this code is kind of
slow, and you have determined that the bounds-check caused by indexing
is a contributing factor. You can&rsquo;t rewrite the code to use iterators,
and you are quite confident that the index will always be in-bounds,
so you decide to dip your toe into <code>unsafe</code> by calling
<code>get_unchecked</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">vec</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="o">..</span><span class="p">.];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">vec</span><span class="p">.</span><span class="n">get_unchecked</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>Now, under the precise model that I proposed, this means that the
entire containing module is considered to be within an unsafe
abstraction boundary, and hence the compiler will be more conservative
when optimizing, and as a result the function may actually run
<strong>slower</strong>, not faster, when you skip the bounds check. (A very similar
example is invoking
<a href="https://doc.rust-lang.org/std/str/fn.from_utf8_unchecked.html"><code>str::from_utf8_unchecked</code></a>,
which skips over the UTF-8 validation check.)</p>
<p>Many people were not happy about this side-effect, and I can totally
understand why. After all, this code isn&rsquo;t mucking about with funny
pointers or screwy aliasing &ndash; the unsafe block is a kind of drop-in
replacement for what was there before, so it seems odd for it to have
this effect.</p>
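<p>For reference, here is a small runnable sketch of the &ldquo;drop-in replacement&rdquo; pattern discussed in this section. The function names <code>sum_checked</code> and <code>sum_prevalidated</code> are hypothetical; the only library call assumed is the standard <code>get_unchecked</code> method on slices. The sketch validates all indices once up front, which is one way (among others) to justify skipping the per-access bounds check.</p>

```rust
/// Sums `data[i]` for every index in `indices`, with a bounds check per access.
fn sum_checked(data: &[i32], indices: &[usize]) -> i32 {
    indices.iter().map(|&i| data[i]).sum()
}

/// Same computation, but the bounds are validated once and each access
/// then skips its own check. Returns `None` if any index is out of range.
fn sum_prevalidated(data: &[i32], indices: &[usize]) -> Option<i32> {
    // Establish the invariant that every index is in bounds.
    if indices.iter().any(|&i| i >= data.len()) {
        return None;
    }
    // The check above is what makes `get_unchecked` sound here.
    Some(
        indices
            .iter()
            .map(|&i| unsafe { *data.get_unchecked(i) })
            .sum(),
    )
}

fn main() {
    let data = vec![10, 20, 30, 40];
    let indices = [3, 0, 2];
    assert_eq!(sum_prevalidated(&data, &indices), Some(sum_checked(&data, &indices)));
    assert_eq!(sum_prevalidated(&data, &[9]), None);
}
```

<p>The unsafe block does no pointer tricks and creates no aliasing. It only removes a check that the surrounding code has already performed, which is why a slowdown from writing it would be so surprising.</p>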
<h3 id="where-to-go-from-here">Where to go from here</h3>
<p>Since posting the last blog post, we&rsquo;ve started a
<a href="https://internals.rust-lang.org/t/next-steps-for-unsafe-code-guidelines/3864">longer-term process</a> for settling and exploring a lot of these
interesting questions about the proper use of unsafe. At this point,
we&rsquo;re still in the &ldquo;data gathering&rdquo; phase. The idea here is to collect
and categorize interesting examples of unsafe code. I&rsquo;d prefer at this
point not to be making decisions per se about what is legal or not &ndash;
although in some cases something may be quite unambiguous &ndash; but rather
just try to get a good corpus with which we can evaluate different
proposals.</p>
<p>While I haven&rsquo;t given up on the &ldquo;Tootsie Pop&rdquo; model, I&rsquo;m also not
convinced it&rsquo;s the best approach. But whatever we do, I still believe
we should strive for something that is <strong>safe and predictable by
default</strong> &ndash; something where the rules can be summarized on a
postcard, at least if you don&rsquo;t care about getting every last bit of
optimization. But, as the unchecked-get example makes clear, it is
important that we also enable people to obtain full optimization,
possibly with some amount of opt-in. I&rsquo;m just not yet sure what&rsquo;s the
right setup to balance the various factors.</p>
<p>As I wrote in my last post, I think that we have to expect that
whatever guidelines we establish, they will have only a limited effect
on the kind of code that people write. So if we want Rust code to be
reliable <strong>in practice</strong>, we have to strive for rules that permit the
things that people actually do: and the best model we have for that is
the extant code. This is not to say we have to achieve total backwards
compatibility with any piece of unsafe code we find in the wild, but
if we find we are invalidating a common pattern, it can be a warning
sign.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/unsafe" term="unsafe" label="Unsafe"/></entry><entry><title type="html">The 'Tootsie Pop' model for unsafe code</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/05/27/the-tootsie-pop-model-for-unsafe-code/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/05/27/the-tootsie-pop-model-for-unsafe-code/</id><published>2016-05-27T00:00:00+00:00</published><updated>2016-05-27T12:12:15-04:00</updated><content type="html"><![CDATA[<p>In my <a href="http://smallcultfollowing.com/babysteps/blog/2016/05/23/unsafe-abstractions/">previous post</a>, I spent some time talking about the idea of
<strong>unsafe abstractions</strong>. At the end of the post, I mentioned that Rust
does not really have any kind of official guidelines for what kind of
code is legal in an unsafe block and what is not. What this means in
practice is that people wind up writing what &ldquo;seems reasonable&rdquo; and
checking it against what the compiler does today. This is of course a
risky proposition since it means that if we start doing more
optimization in the compiler, we may well wind up breaking unsafe code
(the code would still compile; it would just not execute like it used
to).</p>
<p>Now, of course, merely having published guidelines doesn&rsquo;t entirely
change that dynamic. It does allow us to &ldquo;assign blame&rdquo; to the unsafe
code that took actions it wasn&rsquo;t supposed to take. But at the end of
the day we&rsquo;re still causing crashes, so that&rsquo;s bad.</p>
<p>This is partly why I have advocated that we try to arrive
at guidelines which are &ldquo;human friendly&rdquo;. Even if we <em>have</em> published
guidelines, I don&rsquo;t expect most people to read them in practice. And
fewer still will read past the introduction. So we had better be sure
that &ldquo;reasonable code&rdquo; works by default.</p>
<p>Interestingly, there is something of a tension here: the more unsafe
code we allow, the less the compiler can optimize. This is because it
would have to be conservative about possible aliasing and (for
example) avoid reordering statements. We&rsquo;ll see some examples of this
as we go.</p>
<p>Still, to some extent, I think it&rsquo;s possible for us to have our cake
and eat it too. In this blog post, I outline a proposal to <strong>leverage
unsafe abstraction boundaries</strong> to inform the compiler where it can be
aggressive and where it must be conservative. The heart of the
proposal is the intuition that:</p>
<ul>
<li>when you enter the unsafe boundary, you can rely on the Rust type
system invariants holding;</li>
<li>when you exit the unsafe boundary, you must ensure that the Rust
type system invariants are restored;</li>
<li>in the interim, you can break a lot of rules (though not all the
rules).</li>
</ul>
<p>I call this the <strong>Tootsie Pop</strong> model: the idea is that an unsafe
abstraction is kind of like a <a href="https://en.wikipedia.org/wiki/Tootsie_Pop">Tootsie Pop</a>. There is a gooey candy
interior, where the rules are squishy and the compiler must be
conservative when optimizing. This is separated from the outside world
by a hard candy exterior, which is the interface, and where the rules
get stricter.  Outside of the pop itself lies the safe code, where the
compiler ensures that all rules are met, and where we can optimize
aggressively.</p>
<p>One can also compare the approach to what would happen when writing a
C plugin for a Ruby interpreter. In that case, your plugin can assume
that the inputs are all valid Ruby objects, and it must produce valid
Ruby objects as its output, but internally it can cut corners and use
C pointers and other such things.</p>
<p>In this post, I will elaborate a bit more on the model, and in
particular cover some example problem cases and talk about the grey
areas that still need to be hammered out.</p>
<!-- more -->
<h4 id="how-do-you-define-an-unsafe-boundary">How do you define an unsafe boundary?</h4>
<p>My initial proposal is that we should define an unsafe boundary as
being &ldquo;a module that has unsafe code somewhere inside of it&rdquo;. So, for
example, the module that contains <code>split_at_mut</code>, which we have seen
earlier is a fn defined with unsafe code, would form an unsafety
boundary. Public functions in this module would therefore be &ldquo;entry
points&rdquo; into the unsafe boundary; returning from such a function, or
issuing a callback via a closure or trait method, would be an exit
point.</p>
<p>Initially when considering this proposal, I wanted to use an unsafe
boundary defined at the function granularity. So any function which
contained an unsafe block but which did not contain <code>unsafe</code> in its
signature would be considered the start of an unsafe boundary; and any
<code>unsafe fn</code> would be a part of its caller&rsquo;s boundary (note that its
caller must contain an unsafe block). This would mean that
e.g. <code>split_at_mut</code> is its own unsafe boundary. However, I have come
to think that this definition is too precise and could cause problems
in practice &ndash; we&rsquo;ll see some examples below. Therefore, I have
loosened it.</p>
<p>Ultimately I think that deciding where to draw the unsafe boundary is
still somewhat of an open question. Even using the module barrier
means that some kinds of refactorings that might seem innocent
(migrating code between modules, specifically) can change code from
legal to illegal. I will discuss various alternatives later on.</p>
<h4 id="permissions-grantedrequired-at-the-unsafe-boundary">Permissions granted/required at the unsafe boundary</h4>
<p>In the model I am proposing, most of your reasoning happens as you
cross into or out of an unsafe abstraction. When you enter into an
unsafe abstraction &ndash; for example, by calling a method like
<code>split_at_mut</code>, which is not declared as <code>unsafe</code> but uses <code>unsafe</code>
code internally &ndash; you implicitly provide that function with certain
permissions. These permissions are derived from the types of the
function&rsquo;s arguments and the rules of the Rust type system. In the
case of <code>split_at_mut</code>, there are two arguments:</p>
<ul>
<li>The slice <code>self</code> that is being split, of type <code>&amp;'a mut [T]</code>; and,</li>
<li>the midpoint <code>mid</code> at which to perform the split, of type <code>usize</code>.</li>
</ul>
<p>Based on these types, the <code>split_at_mut</code> method can assume that the
variable <code>self</code> refers to a suitably initialized slice of values of
type <code>T</code>. That reference is valid for the lifetime <code>'a</code>, which
represents some span of execution time that encloses at least the
current call to <code>split_at_mut</code>. Similarly, the argument <code>mid</code> will be
an unsigned integer of suitable size.</p>
<p>At this point we are within the unsafe abstraction. It is now free to
do more-or-less whatever it likes, so long as all the actions it takes
fall within the initial set of permissions. More on this below.</p>
<p>Finally, when you exit from the unsafe boundary, you must ensure that
you have restored whatever invariants and permissions the Rust type
system requires. These are typically going to be derived from the
types of the function&rsquo;s outputs, such as its return type. In the case
of <code>split_at_mut</code>, the return type is <code>(&amp;mut [T], &amp;mut [T])</code>, so this
implies that you will return a tuple of slices. Since those slices are
both active at the same time, they must (by the rules of Rust&rsquo;s type
system) refer to disjoint memory.</p>
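<p>To make the entry and exit obligations concrete, here is a minimal
sketch of a raw-pointer <code>split_at_mut</code>, written as a free function. The
name <code>split_at_mut_sketch</code> is an assumption made for illustration, and
this is not necessarily how the standard library writes it. On entry it
relies only on what the argument types promise; on exit it has to hand
back two slices that really are disjoint:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::slice;

// Hypothetical free-function variant of `split_at_mut`, for illustration only.
pub fn split_at_mut_sketch&lt;T&gt;(slice: &amp;mut [T], mid: usize) -&gt; (&amp;mut [T], &amp;mut [T]) {
    // Entry: the types promise that `slice` is a valid, uniquely borrowed
    // slice and that `mid` is some `usize`. Nothing promises `mid &lt;= len`,
    // so that has to be checked here.
    let len = slice.len();
    assert!(mid &lt;= len, "mid out of bounds");
    let ptr = slice.as_mut_ptr();

    // Exit: the return type requires two `&amp;mut [T]` that do not overlap.
    // The ranges `0..mid` and `mid..len` are disjoint by construction.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
</code></pre>
<p>A caller can then write through both halves at once, for example
<code>let (left, right) = split_at_mut_sketch(&amp;mut v, 2);</code>, without ever
seeing the raw pointers.</p>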
<h4 id="specifying-the-permissions">Specifying the permissions</h4>
<p>In this post, I am not trying to define the complete set of
permissions. We have a reasonably good but not formalized notion of
what these permissions are. Ralf Jung and Derek Dreyer have been
working on making that model more precise as part of the <a href="http://plv.mpi-sws.org/rustbelt/">Rust Belt</a>
project. I think writing up those rules in one central place would
obviously be a big part of elaborating on the model I am sketching out
here.</p>
<p>If you are writing safe code, the type system will ensure that you
never do anything that exceeds the permissions granted to you. But if
you dip into unsafe code, then you take on the responsibility for
verifying that you obey the given permissions. Either way, the set of
permissions remains the same.</p>
<h4 id="permissons-on-functions-declared-as-unsafe">Permissions on functions declared as unsafe</h4>
<p>If a function is declared as unsafe, then its permissions are not
defined by the type system, but rather in comments and documentation.
This is because the <code>unsafe</code> keyword is a warning that the function
may place additional requirements on its caller &ndash; or may
return values that don&rsquo;t meet the full requirements of the Rust type
system.</p>
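<p>As a small illustration, here is roughly what that looks like in
practice. The function <code>read_i32</code> and its caller <code>read_local</code> are
hypothetical names; the point is only that the real contract lives in
the <code># Safety</code> comment rather than in the signature:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">/// Reads the `i32` that `ptr` points to.
///
/// # Safety
///
/// The caller must guarantee that `ptr` is non-null, properly aligned,
/// and points to an initialized `i32` that is valid for reads for the
/// duration of this call. None of this is expressed by `*const i32`.
pub unsafe fn read_i32(ptr: *const i32) -&gt; i32 {
    unsafe { ptr.read() }
}

/// A safe caller takes on the job of checking the documented contract.
pub fn read_local() -&gt; i32 {
    let x = 7;
    // SAFETY: `&amp;x` is a valid, aligned reference to an initialized `i32`
    // that outlives the call.
    unsafe { read_i32(&amp;x) }
}
</code></pre>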
<h4 id="optimizations-within-an-unsafe-boundary">Optimizations within an unsafe boundary</h4>
<p>So far I&rsquo;ve primarily talked about what happens when you <strong>cross</strong> an
unsafe boundary, but I&rsquo;ve not talked much about what you can do
<strong>within</strong> an unsafe boundary. Roughly speaking, the answer that I
propose is: &ldquo;whatever you like, so long as you don&rsquo;t exceed the
initial set of permissions you were given&rdquo;.</p>
<p>What this means in practice is that when the compiler is optimizing
code that originates inside an unsafe boundary, it will make
pessimistic assumptions about aliasing. This is effectively what C
compilers do today (except they sometimes employ
<a href="http://www.drdobbs.com/cpp/type-based-alias-analysis/184404273">type-based alias analysis</a>; we would not).</p>
<p>As a simple example: in safe code, if you have two distinct variables
that are both of type <code>&amp;mut T</code>, the compiler would assume that they
represent disjoint memory. This might allow it, for example, to
re-order reads/writes or re-use values that have been read if it does
not see an intervening write. But if those same two variables appear
inside of an unsafe boundary, the compiler would not make that
assumption when optimizing. If that was too hand-wavy for you, don&rsquo;t
worry, we&rsquo;ll spell out these examples and others in the next section.</p>
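<p>In the meantime, here is a rough sketch of the difference. The two
function names are made up for this illustration. In the first, the
compiler could assume <code>a</code> and <code>b</code> are disjoint and return the constant
<code>1</code> without reloading. In the second, the pointers may alias, so the
final read has to observe the write through <code>b</code>. Under the proposal,
<code>&amp;mut</code> references inside an unsafe boundary would be optimized like the
second version:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Safe signature: two live `&amp;mut i32` must point at disjoint memory,
// so the final read of `*a` could legally be replaced by the constant 1.
pub fn write_both(a: &amp;mut i32, b: &amp;mut i32) -&gt; i32 {
    *a = 1;
    *b = 2;
    *a
}

/// # Safety
///
/// Both pointers must be valid for reads and writes. They are allowed
/// to point at the same `i32`.
pub unsafe fn write_both_raw(a: *mut i32, b: *mut i32) -&gt; i32 {
    unsafe {
        *a = 1;
        *b = 2;
        // `a` and `b` may alias, so this load cannot be folded to 1.
        *a
    }
}
</code></pre>
<p>Calling <code>write_both_raw(p, p)</code> with the same pointer twice is allowed
by its contract, and the write through <code>b</code> is then visible through
<code>a</code>.</p>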
<h3 id="examples">Examples</h3>
<p>In this section I want to walk through some examples. Each one
contains unsafe code doing something potentially dubious. In each
case, I will do the following:</p>
<ol>
<li>walk through the example and describe the dubious thing;</li>
<li>describe what my proposed rules would do;</li>
<li>describe some other rules one might imagine and what their
repercussions might be.</li>
</ol>
<p>By the way, I have been <a href="https://github.com/nikomatsakis/rust-memory-model/">collecting these sorts of examples</a> in a
repository, and am very interested in seeing more such dubious cases
which might offer insight into other tricky situations. The names of
the sections below reflect the names of the files in that repository.</p>
<h4 id="split-at-mut-via-duplication">split-at-mut-via-duplication</h4>
<p>Let&rsquo;s start with a familiar example. This is a variant of the familiar
<code>split_at_mut</code> method that I covered in <a href="http://smallcultfollowing.com/babysteps/blog/2016/05/23/unsafe-abstractions/">the previous post</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">split_at_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">copy</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="o">*</span><span class="p">(</span><span class="bp">self</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">_</span><span class="p">)</span><span class="w"> </span><span class="p">};</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">left</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="n">mid</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">right</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">copy</span><span class="p">[</span><span class="n">mid</span><span class="o">..</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>This version works differently from the ones I showed before. It
doesn&rsquo;t use raw pointers. Instead, it cheats the compiler by
&ldquo;duplicating&rdquo; <code>self</code> via a cast to <code>*mut</code>. This means that both <code>self</code>
and <code>copy</code> are <code>&amp;mut [T]</code> slices pointing at the same memory, at the
same time. In ordinary, safe Rust, this is impossible, but using
unsafe code, we can make it happen.</p>
<p>The rest of the function looks almost the same as our original attempt
at a safe implementation (also in the <a href="http://smallcultfollowing.com/babysteps/blog/2016/05/23/unsafe-abstractions/">previous post</a>). The only
difference now is that, in defining <code>right</code>, it uses <code>copy[mid..]</code>
instead of <code>self[mid..]</code>. The compiler accepts this because it assumes
that <code>copy</code> and <code>self</code>, since they are both simultaneously valid, must
be disjoint (remember that, in unsafe code, the borrow checker still
enforces its rules on safe types; it&rsquo;s just that we can use tricks
like raw pointers or transmutes to sidestep them).</p>
<p><strong>Why am I showing you this?</strong> The key question here is whether the
optimizer can &ldquo;trust&rdquo; Rust types within an unsafe boundary. After all,
this code is only accepted because the borrowck thinks (incorrectly)
that <code>self</code> and <code>copy</code> are disjoint; if the optimizer were to think
the same thing, that could lead to bad optimizations.</p>
<p><strong>My belief is that this program ought to be legal.</strong> One reason is
just that, when I first implemented <code>split_at_mut</code>, it&rsquo;s the most
natural thing that I thought to write. And hence I suspect that many
others would write unsafe code of this kind.</p>
<p>However, to put this in terms of the model, the idea is that the
unsafe boundary here would be the module containing
<code>split_at_mut</code>. Thus the dubious aliasing between <code>self</code> and <code>copy</code>
occurs <strong>within</strong> this boundary. In general, my belief is that
whenever we are <strong>inside</strong> the boundary we cannot fully trust the
types that we see. We can only assume that the user is supplying the
types that seem most appropriate to them, not necessarily that they
are accounting for the full implications of those types under the
normal Rust rules. When optimizing, then, the compiler will <em>not</em>
assume that the normal Rust type rules apply &ndash; effectively, it will
treat <code>&amp;mut</code> references the same way it might treat a <code>*mut</code> or
<code>*const</code> pointer.</p>
<p>(I have to work a bit more at understanding LLVM&rsquo;s annotations, but I
think that we can model this using the <a href="http://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata">aliasing metadata</a> that LLVM
provides. More on that later.)</p>
<p><strong>Alternative models.</strong> Naturally alternative models might consider
this code illegal. They would require that one use raw pointers, as
the current implementation does, for any pointer that does not
necessarily obey Rust&rsquo;s memory model.</p>
<p>(Note that this raises another interesting question, though, about
what the legal aliasing is between (say) a <code>&amp;mut</code> and a <code>*mut</code> that
are actively in use &ndash; after all, an <code>&amp;mut</code> is supposed to be unique,
but does that uniqueness cover raw pointers?)</p>
<h4 id="refcell-ref">refcell-ref</h4>
<p>The <code>borrow()</code> method on the type <code>RefCell</code> returns a value of a
helper type called <code>Ref</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Ref</span><span class="o">&lt;</span><span class="na">&#39;b</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>: <span class="o">?</span><span class="nb">Sized</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;b</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="kp">&amp;</span><span class="na">&#39;b</span> <span class="nc">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">borrow</span>: <span class="nc">BorrowRef</span><span class="o">&lt;</span><span class="na">&#39;b</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here the <code>value</code> field is a reference to the interior of the
<code>RefCell</code>, and the <code>borrow</code> is a value which, once dropped, will cause
the &ldquo;lock&rdquo; on the <code>RefCell</code> to be released. This is important because
it means that once <code>borrow</code> is dropped, <code>value</code> can no longer safely
be used. (You could imagine the helper type <code>MutexGuard</code> employing a
similar pattern, though actually it works ever so slightly differently
for whatever reason.)</p>
<p>This is another example of unsafe code using the Rust types in a
&ldquo;creative&rdquo; way. In particular, the type <code>&amp;'b T</code> is supposed to mean: a
reference that can be safely used right up until the end of <code>'b</code> (and
whose referent will not be mutated). However, in this case, the actual
meaning is &ldquo;until the end of <code>'b</code> or until <code>borrow</code> is dropped,
whichever comes first&rdquo;.</p>
<p>So let&rsquo;s consider some imaginary method defined on <code>Ref</code>,
<code>copy_drop()</code>, which works when <code>T == u32</code>. It would copy the value
and then drop the borrow to release the lock.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">mem</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;b</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Ref</span><span class="o">&lt;</span><span class="na">&#39;b</span><span class="p">,</span><span class="w"> </span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">copy_drop</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">value</span><span class="p">;</span><span class="w"> </span><span class="c1">// copy contents of `self.value` into `t`
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">borrow</span><span class="p">);</span><span class="w"> </span><span class="c1">// release the lock
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">t</span><span class="w"> </span><span class="c1">// return what we read before
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Note that there is <strong>no unsafe code</strong> in this function at all. I claim
then that the Rust compiler would, ideally, be within its rights to
rearrange this code and to delay the load of <code>self.value</code> to occur later,
sort of like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">borrow</span><span class="p">);</span><span class="w"> </span><span class="c1">// release the lock
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="bp">self</span><span class="p">.</span><span class="n">value</span><span class="p">;</span><span class="w"> </span><span class="c1">// copy contents of `self.value` into `t`
</span></span></span><span class="line"><span class="cl"><span class="n">t</span><span class="w"> </span><span class="c1">// return what we read before
</span></span></span></code></pre></div><p>This might seem surprising, but the idea here is that the type of
<code>self.value</code> is <code>&amp;'b u32</code>, which is supposed to mean a reference valid
for all of <code>'b</code>.  Moreover, the lifetime <code>'b</code> encloses the entire call
to <code>copy_drop</code>. Therefore, the compiler would be free to say &ldquo;well,
maybe I can save a register if I move this load down&rdquo;.</p>
<p>However, I think that reordering this code would be an invalid
optimization.  Logically, as soon as <code>self.borrow</code> is dropped,
<code>*self.value</code> becomes inaccessible &ndash; if you imagine that this pattern
were being used for a mutex, you can see why: another thread might
acquire the lock!</p>
<p>Note that because these fields are private, this kind of problem can
only arise for the methods defined on <code>Ref</code> itself. The public cannot
gain access to the raw <code>self.value</code> reference. They must go through
the deref trait, which returns a reference for some shorter lifetime
<code>'r</code>, and that lifetime <code>'r</code> always ends before the ref is dropped.
So if you were to try and write the same <code>copy_drop</code> routine from the
outside, there would be no problem:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">some_ref</span>: <span class="nc">Ref</span><span class="o">&lt;</span><span class="kt">u32</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ref_cell</span><span class="p">.</span><span class="n">borrow</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">*</span><span class="n">some_ref</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">mem</span>::<span class="nb">drop</span><span class="p">(</span><span class="n">some_ref</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">use</span><span class="p">(</span><span class="n">t</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>In particular, the <code>let t = *some_ref</code> desugars to something like:</p>
<pre tabindex="0"><code>let t = {
    let ptr: &amp;u32 = Deref::deref(&amp;some_ref);
    *ptr
};
</code></pre><p>Here the lifetime of <code>ptr</code> is just going to be that little enclosing
block there.</p>
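<p>Here is a self-contained sketch of that outside view, using the
standard <code>RefCell</code>. The function name <code>copy_then_release</code> is made up
for the example. Because the reference obtained through <code>Deref</code> ends
before the guard is dropped, the later mutable borrow succeeds:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::cell::RefCell;
use std::ops::Deref;

// Hypothetical helper: copy the value out, release the borrow, then
// show that the cell can be mutably borrowed again.
pub fn copy_then_release(cell: &amp;RefCell&lt;u32&gt;) -&gt; u32 {
    let guard = cell.borrow();
    let t = {
        // The lifetime of `ptr` is confined to this block, so it
        // cannot outlive `guard`.
        let ptr: &amp;u32 = Deref::deref(&amp;guard);
        *ptr
    };
    drop(guard); // release the "lock"

    // No shared borrow is outstanding here, so this does not panic.
    *cell.borrow_mut() += 1;
    t
}
</code></pre>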
<p><strong>Why am I showing you this?</strong> This example illustrates that, in the
presence of <code>unsafe</code> code, the <code>unsafe</code> keyword itself is not
necessarily a reliable indicator of where &ldquo;funny business&rdquo; could
occur. Ultimately, I think what&rsquo;s important is the <strong>unsafe abstraction
barrier</strong>.</p>
<p><strong>My belief is that this program ought to be legal.</strong> Frankly, to me,
this code looks entirely reasonable, but also it&rsquo;s the kind of code I
expect people will write (after all, we wrote it). Examples like this
are why I chose to extend the unsafe boundary to enclose the <strong>entire
module</strong> that uses the unsafe keyword, rather than having it be at the
fn granularity &ndash; because there can be functions that, in fact, do
unsafe things where the full limitations on ordering and so forth are
not apparent, but which do not directly involve unsafe code. Another
classic example is modifying the length or capacity fields on a
vector.</p>
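<p>To illustrate the length-field case, here is a toy sketch. <code>TinyBuf</code>
and its methods are invented for this example and are not taken from
<code>Vec</code>. The <code>truncate</code> method contains no <code>unsafe</code> at all, yet the
unchecked read in <code>last</code> is only sound because <code>truncate</code> never lets
<code>len</code> grow past the array:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">mod tiny_buf {
    /// Invariant relied on by the unsafe code below: `len &lt;= 4`.
    pub struct TinyBuf {
        data: [u8; 4],
        len: usize,
    }

    impl TinyBuf {
        pub fn new() -&gt; Self {
            TinyBuf { data: [0; 4], len: 0 }
        }

        /// Appends `byte`, returning `false` if the buffer is full.
        pub fn push(&amp;mut self, byte: u8) -&gt; bool {
            if self.len == self.data.len() {
                return false;
            }
            self.data[self.len] = byte;
            self.len += 1;
            true
        }

        /// Contains no `unsafe`, yet a bug here (say, accepting a
        /// `new_len` larger than 4) would make `last` read out of bounds.
        pub fn truncate(&amp;mut self, new_len: usize) {
            if new_len &lt; self.len {
                self.len = new_len;
            }
        }

        /// Returns the most recently pushed byte, if any.
        pub fn last(&amp;self) -&gt; Option&lt;u8&gt; {
            if self.len == 0 {
                return None;
            }
            // SAFETY: every method in this module keeps `len &lt;= 4`.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }
    }
}
</code></pre>
<p>Since the fields are private, only code inside <code>tiny_buf</code> can break
the invariant, which is why the module is the natural place to draw the
line.</p>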
<p>Now, I chose to extend to the enclosing module because it corresponds
to the privacy boundary, and there can be no unsafe abstraction
barrier without privacy. But I&rsquo;ll explain below why this is not a
perfect choice and we might consider others.</p>
<h4 id="usize-transfer">usize-transfer</h4>
<p>Here we have a trio of functions. These functions collaborate
to hide a reference in a <code>usize</code> and then later dereference it:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Cast the reference `x` into a `usize`
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">escape_as_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// interestingly, this cast is currently legal in safe code,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// which is a mite unfortunate, but doesn&#39;t really affect
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// the example
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">x</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="n">_</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Cast `x` back into a pointer and dereference it 
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume_from_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">y</span>: <span class="kp">&amp;</span><span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">&amp;*</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">*</span><span class="n">y</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">entry_point</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span>: <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="mi">2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="kt">usize</span> <span class="o">=</span><span class="w"> </span><span class="n">escape_as_usize</span><span class="p">(</span><span class="o">&amp;</span><span class="n">x</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// (*) At this point, `p` is in fact a &#34;pointer&#34; to `x`, but it
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// doesn&#39;t look like it!
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;</span><span class="si">{}</span><span class="s">&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">consume_from_usize</span><span class="p">(</span><span class="n">p</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The key point in this example is marked with a <code>(*)</code>. At that point,
we have effectively created a pointer to <code>x</code> and stored it in <code>p</code>, but
the type of <code>p</code> does not reflect that (it just says it&rsquo;s a
pointer-sized integer). Note also that <code>entry_point</code> does not itself
contain unsafe code (further evidence that private helper functions
can easily cause unsafe reasoning to spread beyond the border of a
single fn). So the compiler, seeing no remaining reference to <code>x</code>, might
assume that its stack slot is dead and reuse the memory.</p>
<p>There are a number of ways that this code might be made less shady.
<code>escape_as_usize</code> might have, for example, returned a <code>*const i32</code>
instead of <code>usize</code>. In that case, <code>consume_from_usize</code> would look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">consume_from_usize</span><span class="p">(</span><span class="n">x</span>: <span class="o">*</span><span class="k">const</span><span class="w"> </span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This itself raises a kind of interesting question though. If a
function is not declared as unsafe, and it is given a <code>*const i32</code>
argument, can it dereference that pointer? Ordinarily, the answer
would clearly be no. It has <strong>no idea</strong> what the provenance of that
pointer is (and if you think back to the idea of permissions that are
granted and expected by the Rust type system, the type system does
<strong>not</strong> guarantee you that a <code>*const</code> can be dereferenced). So
effectively there is no difference, in terms of the public
permissions, between <code>x: usize</code> and <code>x: *const i32</code>. Really I think
the <strong>best</strong> way to structure this code would have been to declare
<code>consume_from_usize()</code> as <code>unsafe</code>, which would have served to declare
to its callers that it has extra requirements regarding its argument
<code>x</code> (namely, that it must be a pointer that can be safely
dereferenced).</p>
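<p>As a rough sketch of that suggestion, the helper could be declared <code>unsafe</code> and document its extra requirement. The function bodies here are assumptions made for illustration, and <code>entry_point</code> returns the value instead of printing it so that the result is easy to check. Whether the integer round trip is ultimately legal is exactly the question this post leaves open.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Illustrative only: these bodies are assumptions, not the post's exact helpers.
fn escape_as_usize(x: &amp;i32) -&gt; usize {
    x as *const i32 as usize
}

/// # Safety
///
/// `x` must be an address produced by `escape_as_usize` whose referent is still live.
unsafe fn consume_from_usize(x: usize) -&gt; i32 {
    // SAFETY: the caller upholds the contract documented above.
    unsafe { *(x as *const i32) }
}

pub fn entry_point() -&gt; i32 {
    let x: i32 = 2;
    let p: usize = escape_as_usize(&amp;x);
    // SAFETY: `p` was just derived from `&amp;x`, and `x` is still in scope.
    unsafe { consume_from_usize(p) }
}
</code></pre></div><p>With this signature, the call site in <code>entry_point</code> has to spell out an <code>unsafe</code> block, so the extra obligation is visible at the point where it is discharged.</p>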
<p>Now, if <code>consume_from_usize()</code> were a <strong>public</strong> function, then not
having an <code>unsafe</code> keyword would almost certainly be flat out
wrong. There is nothing that stops perfectly safe callers from calling
it with any old integer that they want; even if the signature were
changed to take <code>*const i32</code>, the same is basically true. But
<code>consume_from_usize()</code> is not public: it&rsquo;s private, and that perhaps
makes a difference.</p>
<p>It often happens, as we&rsquo;ve seen in the other examples, that people cut
corners within the unsafe boundary and declare private helpers as
&ldquo;safe&rdquo; that are in fact assuming quite a bit beyond the normal Rust
type rules.</p>
<p><strong>Why am I showing you this?</strong> This is a good example for playing with
the concept of an unsafe boundary. By moving these functions about,
you can easily create unsafety, as they must all three be contained
within the same unsafe boundary to be legal (if indeed they are legal
at all). Consider these variations:</p>
<p><strong>Private helper module.</strong></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">mod</span> <span class="nn">helpers</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">escape_as_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">consume_from_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">entry_point</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// calls now written as `helpers::escape_as_usize` etc
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>Private helper module, but restricted scope to an outer scope.</strong></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">mod</span> <span class="nn">helpers</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="p">(</span><span class="k">super</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span> <span class="nf">escape_as_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="p">(</span><span class="k">super</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span> <span class="nf">consume_from_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">entry_point</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// calls now written as `helpers::escape_as_usize` etc
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>Public functions, but restricted to an outer scope.</strong></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">mod</span> <span class="nn">some_bigger_abstraction</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">mod</span> <span class="nn">helpers</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">pub</span><span class="p">(</span><span class="k">super</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span> <span class="nf">escape_as_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">pub</span><span class="p">(</span><span class="k">super</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span> <span class="nf">consume_from_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">pub</span><span class="p">(</span><span class="k">super</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span> <span class="nf">entry_point</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">     
</span></span></span></code></pre></div><p><strong>Public functions, but de facto restricted to an outer scope.</strong></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">mod</span> <span class="nn">some_bigger_abstraction</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">mod</span> <span class="nn">helpers</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">escape_as_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">consume_from_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">entry_point</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// no `pub use`, so in fact they are not accessible
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">     
</span></span></span></code></pre></div><p><strong>Just plain public.</strong></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">escape_as_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">consume_from_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">entry_point</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>Different crates.</strong></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// crate A:
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">escape_as_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kp">&amp;</span><span class="kt">i32</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// crate B:
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">consume_from_usize</span><span class="p">(</span><span class="n">x</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// crate C:
</span></span></span><span class="line"><span class="cl"><span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">a</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">extern</span><span class="w"> </span><span class="k">crate</span><span class="w"> </span><span class="n">b</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">entry_point</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">a</span>::<span class="n">escape_as_usize</span><span class="p">(</span><span class="o">&amp;</span><span class="n">x</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">b</span>::<span class="n">consume_from_usize</span><span class="p">(</span><span class="n">p</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p><strong>My belief is that some of these variations ought to be legal.</strong> The
current model as I described it here would accept the original
variation (where everything is in one module) but reject all other
variations (that is, they would compile, but result in undefined
behavior). I am not sure this is right: I think that at least the
&ldquo;private helper module&rdquo; variations might be reasonable.</p>
<p>Note that I think any or all of these variations should be fine with
appropriate use of the <code>unsafe</code> keyword. If the helper functions were
declared as <code>unsafe</code>, then I think they could live anywhere. (This is
actually an interesting point that deserves to be drilled into a bit
more, since it raises the question of how distinct unsafe boundaries
&ldquo;interact&rdquo;; I tend to think of there as just being safe and unsafe
code, full stop, and hence any time that unsafe code in one module
invokes unsafe code in another, we can assume they are part of the
same boundary and hence that we have to be conservative.)</p>
<h3 id="on-refactorings-harmless-and-otherwise">On refactorings, harmless and otherwise</h3>
<p>One interesting thing to think about with any kind of memory model or
other guidelines is what sorts of refactorings people can safely
perform. For example, under this model, <em>manually</em> inlining a fn body
is always safe, so long as you do so within an unsafe abstraction.
Inlining a function from inside an abstraction into the outside is
usually safe, but not necessarily &ndash; the reason it is usually safe is
that most such functions have <code>unsafe</code> blocks, and so by manually
inlining, you will wind up changing the caller from a safe function
into one that is part of the unsafe abstraction.</p>
<p>(Grouping items and functions into modules is another example that may
or may not be safe, depending on how we choose to draw the boundary
lines.)</p>
<p><strong>EDIT:</strong> To clarify a confusion I have seen in a few places. Here I
am talking about <em>inlining by the user</em>. Inlining by the compiler is
different. In that case, when we inline, we would track the
&ldquo;provenance&rdquo; of each instruction, and in particular we would track
whether the instruction originated from unsafe code. (As I understand
it, LLVM already does this with its aliased sets, because it is needed
for handling C99 <code>restrict</code>.) This means that when we decide e.g.  if
two loads may alias, if one (or both) of those loads originated in
unsafe code, then the answer would be different than if they did not.</p>
<h3 id="impact-of-this-proposal-and-mapping-it-to-llvm">Impact of this &ldquo;proposal&rdquo; and mapping it to LLVM</h3>
<p>I suspect that we are doing some optimizations now that would not be
legal under this proposal, though probably not that many &ndash; we haven&rsquo;t
gone very far in terms of translating Rust&rsquo;s invariants to LLVM&rsquo;s
alias analysis metadata. Note though that in general this proposal is
very optimization friendly: all safe code can be fully optimized.
Unsafe code falls back to more C-like reasoning, where one must be
conservative about potential aliasing (note that I do not want to
employ any <a href="http://www.drdobbs.com/cpp/type-based-alias-analysis/184404273">type-based alias analysis</a>, though).</p>
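<p>To make &ldquo;fully optimized&rdquo; a bit more concrete, here is a small sketch of the kind of safe code where the reference types alone rule out aliasing. The function name is made up for illustration, and whether any particular compiler version actually performs the optimization described in the comment is not guaranteed.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical example: `total` is `&amp;mut` and `delta` is `&amp;`, so the two
// cannot refer to the same `i32`. An optimizer may therefore load `*delta`
// once and reuse it, even though `*total` is written in between.
pub fn add_twice(total: &amp;mut i32, delta: &amp;i32) {
    *total += *delta;
    *total += *delta;
}
</code></pre></div><p>The equivalent C function taking two plain <code>int*</code> arguments would have to reload the second operand after the first store unless the pointers were marked <code>restrict</code>.</p>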
<p>I expect we may want to add some annotations that unsafe code can use
to recover optimizations. For example, perhaps something analogous to
the <code>restrict</code> keyword in C, to declare that pointers are unaliased,
or some way to say that an <code>unsafe</code> fn (or module) nonetheless ensures
that all safe Rust types meet their full requirements.</p>
<p>One of the next steps for me personally in exploring this model is to
try and map out (a) precisely what we do today and (b) how I would
express what I want in LLVM&rsquo;s terms. It&rsquo;s not the best formalization,
but it&rsquo;s a concrete starting point at least!</p>
<h3 id="tweaking-the-concept-of-a-boundary">Tweaking the concept of a boundary</h3>
<p>As the final example showed, a module boundary is not clearly right.
In particular, the idea of using a module is that it is aligned with
privacy, but by that definition it should probably include submodules
(that is, any module where an unsafe keyword appears either in the
module or in some parent of the module is considered to be an unsafe
boundary module).</p>
<h3 id="conclusion">Conclusion</h3>
<p>Here I presented a high-level proposal for how I think a Rust &ldquo;memory
model&rdquo; ought to work. Clearly this doesn&rsquo;t resemble a formal memory
model and there are tons of details to work out. Rather, it&rsquo;s a
guiding principle: be aggressive outside of unsafe abstractions and
conservative inside.</p>
<p>I have two major concerns:</p>
<ul>
<li>First, what is the impact on execution time?  I think this needs to
be investigated, but ultimately I am sure we can overcome any
deficit by allowing unsafe code authors to &ldquo;opt back in&rdquo; to more aggressive
optimization, which feels like a good tradeoff.</li>
<li>Second, what&rsquo;s the best way to draw the optimization boundary?
Can we make it more explicit?</li>
</ul>
<p>In particular, the module-based rule that I proposed for the unsafe
boundary is ultimately a kind of heuristic that makes an &ldquo;educated
guess&rdquo; as to where the unsafe boundary lies. Certainly the boundary
must be aligned with modules, but as the last example showed, there
may be a lot of ways to set things up that &ldquo;seem reasonable&rdquo;. <strong>It
might be nicer if we could have a way to <em>declare</em> that boundary
affirmatively.</strong> I&rsquo;m not entirely sure what this looks like. But if
we did add some way, we might then say that if you use the older
<code>unsafe</code> keyword &ndash; where the boundary is implicit &ndash; we&rsquo;ll just
declare the whole crate as being an &ldquo;unsafe boundary&rdquo;. This likely
won&rsquo;t break any code (though of course I mentioned the &ldquo;different
crates&rdquo; variation above&hellip;), but it would provide an incentive to use
the more explicit form.</p>
<p>For questions or discussion, please see
<a href="http://internals.rust-lang.org/t/tootsie-pop-model-for-unsafe-code/3522">this thread on the Rust internals forum</a>.</p>
<h3 id="edit-log">Edit log</h3>
<p>Some of the examples of dubious unsafe code originally used
<code>transmute</code> and <code>transmute_copy</code>.  I was asked to change them because
<code>transmute_copy</code> really is exceptionally unsafe, even for unsafe code
(type inference can make it go wildly awry from what you expected),
and so we didn&rsquo;t want to tempt anyone into copy-and-pasting them. For
the record: don&rsquo;t copy and paste the unsafe code I labeled as dubious
&ndash; it is indeed dubious and may not turn out to be legal! :)</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/unsafe" term="unsafe" label="Unsafe"/></entry><entry><title type="html">Unsafe abstractions</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/05/23/unsafe-abstractions/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/05/23/unsafe-abstractions/</id><published>2016-05-23T00:00:00+00:00</published><updated>2016-05-23T08:17:07-04:00</updated><content type="html"><![CDATA[<p>The <code>unsafe</code> keyword is a crucial part of Rust&rsquo;s design. For those not
familiar with it, the <code>unsafe</code> keyword is basically a way to bypass
Rust&rsquo;s type checker; it essentially allows you to write something more
like C code, but using Rust syntax.</p>
<p>The existence of the <code>unsafe</code> keyword sometimes comes as a surprise at
first. After all, isn&rsquo;t the point of Rust that Rust programs should
not crash? Why would we make it so easy then to bypass Rust&rsquo;s type
system? It can seem like a kind of flaw in the design.</p>
<p>In my view, though, <code>unsafe</code> is anything but a flaw: in fact, it&rsquo;s a
critical piece of how Rust works. The <code>unsafe</code> keyword basically
serves as a kind of &ldquo;escape valve&rdquo; &ndash; it means that we can keep the
type system relatively simple, while still letting you pull whatever
dirty tricks you want to pull in your code. The only thing we ask is
that you package up those dirty tricks with some kind of abstraction
boundary.</p>
<p>This post introduces the <code>unsafe</code> keyword and the idea of unsafety
boundaries. It is in fact a lead-in for another post I hope to publish
soon that discusses a potential design of the so-called
<a href="https://github.com/rust-lang/rfcs/issues/1447">Rust memory model</a>, which is basically a set of rules that help
to clarify just what is and is not legal in unsafe code.</p>
<!-- more -->
<h3 id="unsafe-code-as-a-plugin">Unsafe code as a plugin</h3>
<p>I think a good analogy for thinking about how <code>unsafe</code> works in Rust
is to think about how an interpreted language like Ruby (or Python)
uses C modules. Consider something like the JSON module in Ruby. The
JSON bundle includes a pure Ruby implementation (<code>JSON::Pure</code>), but it
also includes a re-implementation of the same API in C
(<code>JSON::Ext</code>). By default, when you use the JSON bundle, you are
actually running C code &ndash; but your Ruby code can&rsquo;t tell the
difference. From the outside, that C code looks like any other Ruby
module &ndash; but internally, of course, it can play some dirty tricks and
make optimizations that wouldn&rsquo;t be possible in Ruby. (See this
excellent blog post on <a href="http://blog.skylight.io/introducing-helix/">Helix</a> for more details, as well as
some suggestions on how you can write Ruby plugins in Rust instead.)</p>
<p>Well, in Rust, the same scenario can arise, although the scale is
different. For example, it&rsquo;s perfectly possible to write an efficient
and usable hashtable in pure Rust. But if you use a bit of unsafe
code, you can make it go faster still. If this is a data structure that
will be used by a lot of people or is crucial to your application,
this may be worth the effort (so e.g. we use unsafe code in the
standard library&rsquo;s implementation). But, either way, normal Rust code
should not be able to tell the difference: the unsafe code is
<strong>encapsulated</strong> at the API boundary.</p>
<p>Of course, just because it&rsquo;s <em>possible</em> to use unsafe code to make
things run faster doesn&rsquo;t mean you will do it frequently. Just like
the majority of Ruby code is in Ruby, the majority of Rust code is
written in pure safe Rust; this is particularly true since safe Rust
code is very efficient, so dropping down to unsafe Rust for
performance is rarely worth the trouble.</p>
<p>In fact, probably the single most common use of unsafe code in Rust is
for FFI. Whenever you call a C function from Rust, that is an unsafe
action: this is because there is no way the compiler can vouch for the
correctness of that C code.</p>
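<p>As a minimal sketch of what this looks like in practice, here is a safe wrapper around the C library&rsquo;s <code>strlen</code>. The wrapper name <code>c_string_len</code> is made up for this example, and the <code>unsafe extern</code> spelling assumes a reasonably recent toolchain (older compilers write plain <code>extern "C"</code>).</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::ffi::CStr;
use std::os::raw::c_char;

// Declaration of the C function. The compiler has to take this signature on faith.
unsafe extern "C" {
    fn strlen(s: *const c_char) -&gt; usize;
}

/// Safe wrapper (hypothetical name): the `&amp;CStr` argument guarantees a valid,
/// NUL-terminated string, which is what `strlen` requires.
pub fn c_string_len(s: &amp;CStr) -&gt; usize {
    // SAFETY: `s.as_ptr()` points to a NUL-terminated buffer that outlives the call.
    unsafe { strlen(s.as_ptr()) }
}
</code></pre></div><p>Callers of <code>c_string_len</code> never see the <code>unsafe</code> block; the requirement on the pointer is carried by the <code>&amp;CStr</code> type instead.</p>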
<h3 id="extending-the-language-with-unsafe-code">Extending the language with unsafe code</h3>
<p>To me, the most interesting reason to write unsafe code in Rust (or a
C module in Ruby) is so that you can extend the capabilities of the
language. Probably the most commonly used example of all is the <code>Vec</code>
type in the standard library, which uses unsafe code so it can handle
uninitialized memory; <code>Rc</code> and <code>Arc</code>, which enable shared ownership,
are other good examples. But there are also much fancier examples,
such as how <a href="https://github.com/aturon/crossbeam">Crossbeam</a> and <a href="https://github.com/kinghajj/deque">deque</a> use unsafe code to implement
non-blocking data structures, or how <a href="https://github.com/rphmeier/jobsteal">Jobsteal</a> and <a href="http://smallcultfollowing.com/babysteps/blog/2015/12/18/rayon-data-parallelism-in-rust/">Rayon</a> use unsafe
code to implement thread pools.</p>
<p>In this post, we&rsquo;re going to focus on one simple case: the
<code>split_at_mut</code> method found in the standard library. This method is
defined over mutable slices like <code>&amp;mut [T]</code>. It takes as argument a
slice and an index (<code>mid</code>), and it divides that slice into two pieces
at the given index. Hence it returns two subslices: one that ranges from
<code>0..mid</code>, and one that ranges from <code>mid..</code>.</p>
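<p>From the caller&rsquo;s side, using the method might look like the following sketch. The helper name <code>bump_both_halves</code> is invented for illustration; note that <code>split_at_mut</code> panics if <code>mid</code> is greater than the length of the slice.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Hypothetical helper: adds 10 to the first element of each half.
pub fn bump_both_halves(data: &amp;mut [i32], mid: usize) {
    // Both halves are mutable at the same time, which a single `&amp;mut [i32]`
    // would not otherwise allow.
    let (left, right) = data.split_at_mut(mid);
    if let (Some(a), Some(b)) = (left.first_mut(), right.first_mut()) {
        *a += 10;
        *b += 10;
    }
}
</code></pre></div><p>Calling this on <code>[1, 2, 3, 4, 5]</code> with <code>mid = 2</code> updates the elements at indices 0 and 2, one through each subslice.</p>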
<p>You might imagine that <code>split_at_mut</code> would be defined like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">split_at_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="n">mid</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">[</span><span class="n">mid</span><span class="o">..</span><span class="p">])</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>If it compiled, this definition would do the right thing, but in fact
if you <a href="https://is.gd/2UpNUr">try to build it</a> you will find it gets a compilation
error. It fails for two reasons:</p>
<ol>
<li>In general, the compiler does not try to reason precisely about
indices. That is, whenever it sees an index like <code>foo[i]</code>, it just
ignores the index altogether and treats the entire array as a unit
(<code>foo[_]</code>, effectively).  This means that it cannot tell that <code>&amp;mut self[0..mid]</code> is disjoint from <code>&amp;mut self[mid..]</code>. The reason for
this is that reasoning about indices would require a much more
complex type system.</li>
<li>In fact, the <code>[]</code> operator is not builtin to the language when
applied to a range anyhow. It is
<a href="https://github.com/rust-lang/rust/blob/b9a201c6dff196fc759fb1f1d3d292691fc5d99a/src/libcore/slice.rs#L572-L589">implemented in the standard library</a>. Therefore, even if
the compiler knew that <code>0..mid</code> and <code>mid..</code> did not overlap, it
wouldn&rsquo;t necessarily know that <code>&amp;mut self[0..mid]</code> and <code>&amp;mut self[mid..]</code> return disjoint slices.</li>
</ol>
<p>Now, it&rsquo;s plausible that we could extend the type system to make this
example compile, and maybe we&rsquo;ll do that someday. But for the time
being we&rsquo;ve preferred to implement cases like <code>split_at_mut</code> using
unsafe code. This lets us keep the type system simple, while still
enabling us to write APIs like <code>split_at_mut</code>.</p>
<h3 id="abstraction-boundaries">Abstraction boundaries</h3>
<p>Looking at unsafe code as analogous to a plugin helps to clarify the
idea of an <strong>abstraction boundary</strong>. When you write a Ruby plugin, you
expect that when users from Ruby call into your function, they will
supply you with normal Ruby objects and pointers. Internally, you can
play whatever tricks you want: for example, you might use a C array
instead of a Ruby vector. But once you return values back out to the
surrounding Ruby code, you have to repackage up those results as
standard Ruby objects.</p>
<p>It works the same way with unsafe code in Rust. At the public
boundaries of your API, your code should act &ldquo;as if&rdquo; it were any other
safe function. This means you can assume that your users will give you
valid instances of Rust types as inputs. It also means that any values
you return or otherwise output must meet all the requirements that the
Rust type system expects. <em>Within</em> the unsafe boundary, however, you
are free to bend the rules (of course, just <em>how</em> free you are is the
topic of debate; I intend to discuss it in a follow-up post).</p>
<p>Let&rsquo;s look at the <code>split_at_mut</code> method we saw in the previous
section. For our purposes here, we only care about the &ldquo;public
interface&rdquo; of the function, which is its signature:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">split_at_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// body of the fn omitted so that we can focus on the
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// public interface; safe code shouldn&#39;t have to care what
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// goes in here anyway
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>So what can we derive from this signature? To start, <code>split_at_mut</code>
can assume that all of its inputs are &ldquo;valid&rdquo; (for safe code, the
compiler&rsquo;s type system naturally ensures that this is true; unsafe
callers would have to ensure it themselves). Part of writing the rules
for unsafe code will require enumerating more precisely what this
means, but at a high-level it&rsquo;s stuff like this:</p>
<ul>
<li>The <code>self</code> argument is of type <code>&amp;mut [T]</code>. This implies that we will
receive a reference that points at some number <code>N</code> of <code>T</code> elements.
Because this is a mutable reference, we know that the memory it
refers to cannot be accessed via any other alias (until the mutable
reference expires). We also know the memory is initialized and the
values are suitable for the type <code>T</code> (whatever it is).</li>
<li>The <code>mid</code> argument is of type <code>usize</code>. All we know is that it is
some unsigned integer.</li>
</ul>
<p>There is one interesting thing missing from this list,
however. Nothing in the API assures us that <code>mid</code> is actually a legal
index into <code>self</code>. This implies that whatever unsafe code we write
will have to check that.</p>
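<p>One way to picture that obligation is a wrapper that refuses to split when <code>mid</code> is out of range. This is only a sketch: <code>try_split_at_mut</code> is a hypothetical name, and it reports the problem with <code>None</code> rather than a panic, which is a design choice of this example and not something the signature above dictates.</p>

```rust
/// Hypothetical wrapper: validates `mid` before splitting, because the
/// type `usize` alone says nothing about `mid` being a legal index.
pub fn try_split_at_mut<T>(slice: &mut [T], mid: usize) -> Option<(&mut [T], &mut [T])> {
    if mid <= slice.len() {
        // `mid == slice.len()` is legal: the right half is simply empty.
        Some(slice.split_at_mut(mid))
    } else {
        None
    }
}
```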
<p>Next, when <code>split_at_mut</code> returns, it must ensure that its return
value meets the requirements of the signature. This basically means it
must return two valid <code>&amp;mut [T]</code> slices (i.e., pointing at valid
memory, with a length that is not too long). Crucially, since those
slices are both valid at the same time, this implies that the two
slices must be <em>disjoint</em> (that is, pointing at different regions of
memory).</p>
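<p>The disjointness requirement can be observed from the outside by comparing the address ranges of the two returned slices. The helper below is a hypothetical sketch (the name <code>halves_are_disjoint</code> is mine); it only inspects pointers and never dereferences them, so it needs no unsafe code.</p>

```rust
/// Hypothetical check: splits `slice` at `mid` and reports whether the
/// two resulting slices cover non-overlapping address ranges.
pub fn halves_are_disjoint<T>(slice: &mut [T], mid: usize) -> bool {
    let (left, right) = slice.split_at_mut(mid);
    let l = left.as_ptr_range();
    let r = right.as_ptr_range();
    // Two half-open ranges are disjoint when one ends at or before
    // the point where the other begins.
    l.end <= r.start || r.end <= l.start
}
```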
<h3 id="possible-implementations">Possible implementations</h3>
<p>So let&rsquo;s look at a few different implementation strategies for
<code>split_at_mut</code> and evaluate whether they might be valid or not. We
already saw that a pure safe implementation doesn&rsquo;t work. So what if
we implemented it using raw pointers like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">split_at_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">slice</span>::<span class="n">from_raw_parts_mut</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// The unsafe block gives us access to raw pointer
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// operations. By using an unsafe block, we are claiming
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// that none of the actions below will trigger
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// undefined behavior.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// get a raw pointer to the first element
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// get a pointer to the element `mid`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">offset</span><span class="p">(</span><span class="n">mid</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">isize</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// number of elements after `mid`
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">remainder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mid</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// assemble a slice from 0..mid
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">left</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_raw_parts_mut</span><span class="p">(</span><span class="n">p</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// assemble a slice from mid..
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">right</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_raw_parts_mut</span><span class="p">(</span><span class="n">q</span><span class="p">,</span><span class="w"> </span><span class="n">remainder</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>This is a mostly valid implementation, and in fact fairly close to
what <a href="https://github.com/rust-lang/rust/blob/b9a201c6dff196fc759fb1f1d3d292691fc5d99a/src/libcore/slice.rs#L338-L349">the standard library actually does</a>. However, this
code is making a critical assumption that is not guaranteed by the
input: it is assuming that <code>mid</code> is &ldquo;in range&rdquo;. Nowhere does it check
that <code>mid &lt;= len</code>, which means that the <code>q</code> pointer might be out of
range, and also means that the computation of <code>remainder</code> might
overflow and hence (in release builds, at least by default) wrap
around. <strong>So this implementation is incorrect</strong>, because it requires
more guarantees than what the caller is required to provide.</p>
<p>We could make it correct by adding an assertion that <code>mid</code> is a valid
index (note that the assert macro in Rust always executes, even in
optimized code):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">split_at_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">slice</span>::<span class="n">from_raw_parts_mut</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// check that `mid` is in range:
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">assert!</span><span class="p">(</span><span class="n">mid</span><span class="w"> </span><span class="o">&lt;=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">len</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// as before, with fewer comments:
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">offset</span><span class="p">(</span><span class="n">mid</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">isize</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">remainder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mid</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">left</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_raw_parts_mut</span><span class="p">(</span><span class="n">p</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">right</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_raw_parts_mut</span><span class="p">(</span><span class="n">q</span><span class="p">,</span><span class="w"> </span><span class="n">remainder</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>OK, at this point we have basically reproduced the
<a href="https://github.com/rust-lang/rust/blob/b9a201c6dff196fc759fb1f1d3d292691fc5d99a/src/libcore/slice.rs#L338-L349">implementation in the standard library</a> (it uses some
slightly different helpers, but it&rsquo;s the same idea).</p>
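<p>One practical note if you want to experiment with this: an inherent <code>impl [T]</code> block is only accepted inside the standard library, so outside of it the same idea has to be written as a free function (or an extension trait). Here is a minimal standalone sketch under that assumption. The name <code>my_split_at_mut</code> is hypothetical, and I have used <code>as_mut_ptr()</code> and <code>add</code> in place of <code>&amp;mut self[0]</code> and <code>offset</code> so that splitting an empty slice at 0 does not panic.</p>

```rust
use std::slice::from_raw_parts_mut;

/// Hypothetical free-function version of the checked implementation.
pub fn my_split_at_mut<T>(slice: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = slice.len();

    // Check that `mid` is in range; `assert!` runs in optimized builds too.
    assert!(mid <= len);

    // Raw pointer to the start of the slice (valid even when it is empty).
    let p: *mut T = slice.as_mut_ptr();

    // SAFETY: `mid <= len`, so `p..p+mid` and `p+mid..p+len` both lie
    // within the original slice and do not overlap one another.
    unsafe {
        let q: *mut T = p.add(mid);
        (from_raw_parts_mut(p, mid), from_raw_parts_mut(q, len - mid))
    }
}
```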
<h3 id="extending-the-abstraction-boundary">Extending the abstraction boundary</h3>
<p>Of course, it might happen that we actually <em>wanted</em> to assume that <code>mid</code>
is in bounds, rather than checking it. We couldn&rsquo;t do this for the
actual <code>split_at_mut</code>, of course, since it&rsquo;s part of the standard
library. But you could imagine wanting a private helper for safe code
that made this assumption, so as to avoid the runtime cost of a bounds
check. In that case, <code>split_at_mut</code> is <strong>relying on the caller</strong> to
guarantee that <code>mid</code> is in bounds. This means that <code>split_at_mut</code> is
no longer &ldquo;safe&rdquo; to call, because it has additional requirements for
its arguments that must be satisfied in order to guarantee memory
safety.</p>
<p>Rust allows you to express the idea of a fn that is not safe to call by
moving the <code>unsafe</code> keyword out of the fn body and into the public
signature. Moving the keyword makes a big difference as to the meaning
of the function: the unsafety is no longer just an <strong>implementation
detail</strong> of the function, it&rsquo;s now part of the <strong>function&rsquo;s
interface</strong>.  So we could make a variant of <code>split_at_mut</code> called
<code>split_at_mut_unchecked</code> that avoids the bounds check:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Here the **fn** is declared as unsafe; calling such a function is
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// now considered an unsafe action for the caller, because they
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// must guarantee that `mid &lt;= self.len()`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">split_at_mut_unchecked</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="n">slice</span>::<span class="n">from_raw_parts_mut</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">p</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span><span class="w"> 
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">q</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p</span><span class="p">.</span><span class="n">offset</span><span class="p">(</span><span class="n">mid</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">isize</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">remainder</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mid</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">left</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_raw_parts_mut</span><span class="p">(</span><span class="n">p</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">right</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">from_raw_parts_mut</span><span class="p">(</span><span class="n">q</span><span class="p">,</span><span class="w"> </span><span class="n">remainder</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">    
</span></span></span></code></pre></div><p>When a <code>fn</code> is declared as <code>unsafe</code> like this, calling that fn becomes
an <code>unsafe</code> action: what this means in practice is that the caller
must read the documentation of the function and ensure that whatever
conditions the function requires are met. In this case, it means that
the caller must ensure that <code>mid &lt;= self.len()</code>.</p>
<p>If you think about abstraction boundaries, declaring a fn as <code>unsafe</code>
means that it does not form an abstraction boundary with safe code.
Rather, it becomes part of the unsafe abstraction of the fn that calls
it.</p>
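<p>To illustrate how an <code>unsafe fn</code> gets folded into its caller&rsquo;s abstraction, here is a small sketch using free functions (so that it compiles outside the standard library). Both names, <code>my_split_at_mut_unchecked</code> and <code>first_and_rest</code>, are hypothetical. The safe function establishes the precondition itself and then discharges it inside an <code>unsafe</code> block, so its own callers never see the extra requirement.</p>

```rust
use std::slice::from_raw_parts_mut;

/// Hypothetical unchecked split.
///
/// # Safety
/// The caller must guarantee that `mid <= slice.len()`.
pub unsafe fn my_split_at_mut_unchecked<T>(slice: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = slice.len();
    let p: *mut T = slice.as_mut_ptr();
    // SAFETY: the caller promised `mid <= len`, so both ranges stay
    // inside the original slice and do not overlap.
    unsafe { (from_raw_parts_mut(p, mid), from_raw_parts_mut(p.add(mid), len - mid)) }
}

/// Hypothetical safe function built on top of the unchecked one: returns
/// the first element and the remainder, or `None` for an empty slice.
pub fn first_and_rest<T>(slice: &mut [T]) -> Option<(&mut T, &mut [T])> {
    if slice.is_empty() {
        return None;
    }
    // SAFETY: the slice is non-empty, hence `1 <= slice.len()`.
    let (head, tail) = unsafe { my_split_at_mut_unchecked(slice, 1) };
    Some((&mut head[0], tail))
}
```

<p>Here the abstraction boundary sits at <code>first_and_rest</code>: it is the function that must be correct for every input that safe code can supply.</p>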
<p>Using <code>split_at_mut_unchecked</code>, we could now re-implement <code>split_at_mut</code>
to just layer the bounds check on top:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">split_at_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="fm">assert!</span><span class="p">(</span><span class="n">mid</span><span class="w"> </span><span class="o">&lt;=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">len</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// By placing the `unsafe` block in the function, we are
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// claiming that we know the extra safety conditions
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// on `split_at_mut_unchecked` are satisfied, and hence calling
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// this function is a safe thing to do.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">unsafe</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">split_at_mut_unchecked</span><span class="p">(</span><span class="n">mid</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// **NB:** Requires that `mid &lt;= self.len()`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">unsafe</span><span class="w"> </span><span class="k">fn</span> <span class="nf">split_at_mut_unchecked</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// as above
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="unsafe-boundaries-and-privacy">Unsafe boundaries and privacy</h3>
<p>Although there is nothing in the language that <em>explicitly</em> connects
the privacy rules with unsafe abstraction boundaries, they are naturally interconnected. This is because
privacy allows you to control the set of code that can modify your
fields, and this is a basic building block to being able to construct
an unsafe abstraction.</p>
<p>Earlier we mentioned that the <code>Vec</code> type in the standard library is
implemented using unsafe code. This would not be possible without
privacy. If you look at the definition of <code>Vec</code>, it looks something
like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">pointer</span>: <span class="o">*</span><span class="k">mut</span><span class="w"> </span><span class="n">T</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">capacity</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">length</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here the field <code>pointer</code> is a pointer to the start of some
memory. <code>capacity</code> is the amount of memory that has been allocated and
<code>length</code> is the amount of memory that has been initialized.</p>
<p>The vector code is all very careful to maintain the invariant that it
is always safe to access the first <code>length</code> elements of the memory that
<code>pointer</code> refers to. You can imagine that if the <code>length</code> field were
public, this would be impossible: anybody from the outside could go
and change the length to whatever they want!</p>
<p>For this reason, unsafety boundaries tend to fall into one of two
categories:</p>
<ul>
<li>a single function, like <code>split_at_mut</code>
<ul>
<li>this could include unsafe callees like <code>split_at_mut_unchecked</code></li>
</ul>
</li>
<li>a type, typically contained in its own module, like <code>Vec</code>
<ul>
<li>this type will naturally have private helper functions as well</li>
<li>and it may contain unsafe helper types too, as described in
the next section</li>
</ul>
</li>
</ul>
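<p>To make the privacy point concrete, here is a minimal sketch of a type whose safety rests on a private field. <code>TinyVec</code>, its fixed capacity of 4, and its method names are hypothetical and much simpler than the real <code>Vec</code>; the only point is that <code>len</code> can be changed solely by code inside the module, so the <code>unsafe</code> block in <code>get</code> can rely on the invariant.</p>

```rust
mod tiny_vec {
    use std::mem::MaybeUninit;

    /// Hypothetical fixed-capacity vector of `u32`s.
    /// Invariant: the first `len` slots of `slots` are initialized.
    pub struct TinyVec {
        slots: [MaybeUninit<u32>; 4],
        len: usize, // private: only code in this module can change it
    }

    impl TinyVec {
        pub fn new() -> Self {
            TinyVec { slots: [MaybeUninit::uninit(); 4], len: 0 }
        }

        /// Appends `value`; returns `false` if the vector is full.
        pub fn push(&mut self, value: u32) -> bool {
            if self.len == self.slots.len() {
                return false;
            }
            self.slots[self.len] = MaybeUninit::new(value);
            self.len += 1; // slot `len` was just initialized, so the invariant still holds
            true
        }

        /// Returns the element at `index`, if it has been initialized.
        pub fn get(&self, index: usize) -> Option<u32> {
            if index < self.len {
                // SAFETY: `index < len`, and the invariant guarantees that
                // the first `len` slots are initialized.
                Some(unsafe { self.slots[index].assume_init() })
            } else {
                None
            }
        }
    }
}
```

<p>If <code>len</code> were declared <code>pub</code>, safe code outside the module could set it to 4 on an empty <code>TinyVec</code>, and <code>get</code> would then read uninitialized memory.</p>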
<h3 id="types-with-unsafe-interfaces">Types with unsafe interfaces</h3>
<p>We saw earlier that it can be useful to define <code>unsafe</code> functions like
<code>split_at_mut_unchecked</code>, which can then serve as the building block
for a safe abstraction. The same is true of types. In fact, if you
look at the <a href="https://github.com/rust-lang/rust/blob/cf37af162721f897e6b3565ab368906621955d90/src/libcollections/vec.rs#L272-L275">actual definition</a> of <code>Vec</code> from the standard
library, you will see that it looks just a bit different from what we
saw above:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">buf</span>: <span class="nc">RawVec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">len</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What is this <code>RawVec</code>? Well, that turns out to be an <a href="https://github.com/rust-lang/rust/blob/cf37af162721f897e6b3565ab368906621955d90/src/liballoc/raw_vec.rs">unsafe helper
type</a> that encapsulates the idea of a pointer and a capacity:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">RawVec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Unique is actually another unsafe helper type
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// that indicates a uniquely owned raw pointer:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">ptr</span>: <span class="nc">Unique</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">cap</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What makes <code>RawVec</code> an &ldquo;unsafe&rdquo; helper type? Unlike with functions,
the idea of an &ldquo;unsafe type&rdquo; is a rather fuzzy notion. I would define
such a type as a type that doesn&rsquo;t really let you do anything useful
without using unsafe code. Safe code can construct <code>RawVec</code>, for example,
and even resize the backing buffer, but if you want to actually access
the data <em>in</em> that buffer, you can only do so by calling
<a href="https://github.com/rust-lang/rust/blob/cf37af162721f897e6b3565ab368906621955d90/src/liballoc/raw_vec.rs#L143-L145">the <code>ptr</code> method</a>, which returns a <code>*mut T</code>. This is a raw
pointer, so dereferencing it is unsafe; which means that, to be
useful, <code>RawVec</code> has to be incorporated into another unsafe
abstraction (like <code>Vec</code>) which tracks initialization.</p>
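<p>The layering can be sketched roughly as follows. <code>RawBuf</code> and <code>Bytes</code> are hypothetical stand-ins for <code>RawVec</code> and <code>Vec</code>, restricted to bytes and without resizing; they are not the standard library&rsquo;s code. Note that every method of <code>RawBuf</code> is safe to call, yet the data can only be reached through the raw pointer inside an <code>unsafe</code> block.</p>

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for `RawVec`: it owns storage and knows its
/// capacity, but tracks nothing about which slots are initialized.
pub struct RawBuf {
    storage: Box<[MaybeUninit<u8>]>,
}

impl RawBuf {
    pub fn with_capacity(cap: usize) -> Self {
        RawBuf { storage: vec![MaybeUninit::uninit(); cap].into_boxed_slice() }
    }

    pub fn cap(&self) -> usize {
        self.storage.len()
    }

    /// Safe to call, but the result is only useful inside `unsafe` code.
    pub fn ptr(&mut self) -> *mut u8 {
        self.storage.as_mut_ptr() as *mut u8
    }
}

/// The safe layer: adds `len`, the number of initialized bytes.
pub struct Bytes {
    buf: RawBuf,
    len: usize,
}

impl Bytes {
    pub fn with_capacity(cap: usize) -> Self {
        Bytes { buf: RawBuf::with_capacity(cap), len: 0 }
    }

    /// Appends `byte`; returns `false` if the buffer is full.
    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == self.buf.cap() {
            return false;
        }
        // SAFETY: `len < cap`, so the write stays inside the allocation.
        unsafe { self.buf.ptr().add(self.len).write(byte) };
        self.len += 1;
        true
    }

    /// Returns the most recently pushed byte, if any.
    /// Takes `&mut self` only because this sketch's `ptr` does.
    pub fn last(&mut self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: slot `len - 1` was initialized by an earlier `push`.
        Some(unsafe { self.buf.ptr().add(self.len - 1).read() })
    }
}
```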
<h3 id="conclusion">Conclusion</h3>
<p>Unsafe abstractions are a pretty powerful tool. They let you play just
about any dirty performance trick you can think of &ndash; or access any
system capability &ndash; while still keeping the overall language safe and
relatively simple. We use unsafety to implement a number of the core
abstractions in the standard library, including core data structures
like <code>Vec</code> and <code>Rc</code>. But because all of these abstractions encapsulate
the unsafe code behind their API, users of those modules don&rsquo;t carry
the risk.</p>
<h4 id="how-low-can-you-go">How low can you go?</h4>
<p>One thing I have not discussed in this post is the specifics of
<em>exactly</em> what is and is not legal within unsafe code. Clearly, the
point of unsafe code is to bend the rules, but how far can you bend
them before they break? At the moment, we don&rsquo;t have a lot of
published guidelines on this topic. This is something we
<a href="https://github.com/rust-lang/rfcs/issues/1447">aim to address</a>. In fact there has even been a
<a href="https://github.com/rust-lang/rfcs/pull/1578">first RFC</a> introduced on the topic, though I think we can
expect a fair amount of iteration before we arrive at the final and
complete answer.</p>
<p>As I <a href="https://github.com/rust-lang/rfcs/pull/1578#issuecomment-217184537">wrote on the RFC thread</a>, my take is that we should be
shooting for rules that are &ldquo;human friendly&rdquo; as much as possible. In
particular, I think that most people will not read our rules and fewer
still will try to understand them. So we should ensure that the unsafe
code that people write in ignorance of the rules is, by and large,
correct. (This implies also that the majority of the code that exists
ought to be correct.)</p>
<p>Interestingly, there is something of a tension here: the more unsafe
code we allow, the less the compiler can optimize. This is because it
would have to be conservative about possible aliasing and (for
example) avoid reordering statements.</p>
<p>In my next post, I will describe how I think that we can leverage
unsafe abstractions to actually get the best of both worlds. The basic
idea is to aggressively optimize safe code, but be more conservative
within an unsafe abstraction (but allow people to opt back in with
additional annotations).</p>
<p><strong>Edit note:</strong> Tweaked some wording for clarity.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Non-lexical lifetimes: adding the outlives relation</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/05/09/non-lexical-lifetimes-adding-the-outlives-relation/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/05/09/non-lexical-lifetimes-adding-the-outlives-relation/</id><published>2016-05-09T00:00:00+00:00</published><updated>2016-05-09T16:15:58-07:00</updated><content type="html"><![CDATA[<p>This is the third post in my
<a href="http://smallcultfollowing.com/babysteps/blog/categories/nll/">series on non-lexical lifetimes</a>. Here I want to dive into
<strong>Problem Case #3</strong> from the introduction. This is an interesting
case because exploring it is what led me to move away from the
continuous lifetimes proposed as part of <a href="https://github.com/rust-lang/rfcs/pull/396">RFC 396</a>.</p>
<!-- more -->
<h3 id="problem-case-3-revisited">Problem case #3 revisited</h3>
<p>As a reminder, problem case #3 was the following fragment:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="n">K</span><span class="p">,</span><span class="n">V</span>:<span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="n">map</span>: <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                               </span><span class="n">key</span>: <span class="nc">K</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                               </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">V</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// -------------+ &#39;m
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w">              </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">                          </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">());</span><span class="w"> </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  ^~~~~~ ERROR               // |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()</span><span class="w">     </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">                                  </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">                                      </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">                                          </span><span class="c1">// v
</span></span></span></code></pre></div><p>What makes this example interesting is that it crosses functions. In
particular, when we call <code>get_mut</code> the first time, if we get back a
<code>Some</code> value, we plan to return the pointer, and hence the value must
last until the end of the lifetime <code>'m</code> (that is, until some point in
the caller). However, if we get back a <code>None</code> value, we wish to
release the loan immediately, because there is no reference to return.</p>
<p>Many people lack intuition for named lifetime parameters. To help get
some better intuition for what a <em>named lifetime parameter</em> represents,
imagine some caller of <code>get_default</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default_caller</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">HashMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">map_ref</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">map</span><span class="p">;</span><span class="w"> </span><span class="c1">// -----------------------+ &#39;m
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_default</span><span class="p">(</span><span class="n">map_ref</span><span class="p">,</span><span class="w"> </span><span class="n">some_key</span><span class="p">());</span><span class="w"> </span><span class="c1">//  |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">value</span><span class="p">);</span><span class="w">                                   </span><span class="c1">//  |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// &lt;----------------------------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here we can see that we first create a reference to <code>map</code> called
<code>map_ref</code> (I pulled this reference into a variable for purposes of
exposition). This variable is passed into <code>get_default</code>, which returns
a reference into the map called <code>value</code>. The important point here is
that the signature of <code>get_default</code> indicates that <code>value</code> is a
reference into the map as well, so that means that the lifetime of
<code>map_ref</code> will also include any uses of <code>value</code>. Therefore, the
lifetime <code>'m</code> winds up extending from the creation of <code>map_ref</code> until
after the call to <code>use(value)</code>.</p>
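<p>A compilable sketch of this caller-side view is below. It makes two assumptions that are not in the example above: the map is specialized to <code>HashMap&lt;String, u32&gt;</code>, and the body of <code>get_default</code> uses the standard <code>entry</code> API instead of the <code>match</code>, purely so that the sketch type-checks. The function <code>count_twice</code> is a hypothetical caller. The inner block marks the extent of <code>'m</code>: it runs from the borrow of <code>map</code> through the last use of <code>value</code>.</p>

```rust
use std::collections::HashMap;

// Same signature shape as in the post: the returned reference shares the
// lifetime `'m` with the borrow of the map.
fn get_default<'m>(map: &'m mut HashMap<String, u32>, key: String) -> &'m mut u32 {
    map.entry(key).or_default()
}

fn count_twice() -> (u32, usize) {
    let mut map = HashMap::new();
    {
        // `'m` starts here, at the mutable borrow of `map` ...
        let value = get_default(&mut map, "hits".to_string());
        *value += 1;
        // Touching `map` directly at this point would be rejected,
        // because `value` is still used on the next line.
        *value += 1;
        // ... and must cover this last use of `value`.
    }
    // The loan has ended, so `map` can be used again.
    (map["hits"], map.len())
}
```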
<h3 id="running-example-inline">Running example: inline</h3>
<p>Although ostensibly problem case #3 is about cross-function use, it
turns out that &ndash; for the purposes of this blog post &ndash; we can create
an equally interesting test case by inlining <code>get_default</code> into the
caller. This will produce the following combined example, which will
be the running example for this post. I&rsquo;ve also taken the liberty of
&ldquo;desugaring&rdquo; the method calls to <code>get_mut</code> a bit, which helps with
explaining what&rsquo;s going on:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default_inlined</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">HashMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">key</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// this is the body of `get_default`, just inlined
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// and slightly tweaked:
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">map_ref1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">map</span><span class="p">;</span><span class="w"> </span><span class="c1">// --------------------------+ &#39;m1
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="n">map_ref1</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">                     </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w">                          </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">                                      </span><span class="c1">// .
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">.</span><span class="n">clone</span><span class="p">(),</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">());</span><span class="w">     </span><span class="c1">// .
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="kd">let</span><span class="w"> </span><span class="n">map_ref2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">map</span><span class="p">;</span><span class="w">                   </span><span class="c1">// .
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">map_ref2</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()</span><span class="w"> </span><span class="c1">// --+ &#39;m2    .
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">                                   </span><span class="c1">//   |        .
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">                                       </span><span class="c1">//   |        |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">                                          </span><span class="c1">//   |        |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">use</span><span class="p">(</span><span class="n">value</span><span class="p">);</span><span class="w">                                 </span><span class="c1">//   |        |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// &lt;---------------------------------------------+--------+
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Written this way, we can see that there are two loans: <code>map_ref1</code> and
<code>map_ref2</code>. Both loans are passed to <code>get_mut</code> and the resulting
reference must last until after the call to <code>use(value)</code> has finished.
I&rsquo;ve depicted the lifetime of the two loans here (and denoted them
<code>'m1</code> and <code>'m2</code>).</p>
<p>Note that, for this fragment to type-check, <code>'m1</code> must <em>exclude</em> the
<code>None</code> arm of the match. I&rsquo;ve denoted this by using a <code>.</code> for that
part of the line. This area must be excluded because, otherwise, the
calls to <code>insert</code> and <code>get_mut</code>, both of which require mutable borrows
of <code>map</code>, would be in conflict with <code>map_ref1</code>.</p>
<p>But if <code>'m1</code> excludes the <code>None</code> part of the match, that means that
control can flow <strong>out</strong> of the region <code>'m1</code> (into the <code>None</code> arm) and
then <strong>back in again</strong> (in the <code>use(value)</code>).</p>
<h3 id="why-rfc-396-alone-cant-handle-this-example">Why RFC 396 alone can&rsquo;t handle this example</h3>
<p>At this point, it&rsquo;s worth revisiting <a href="https://github.com/rust-lang/rfcs/pull/396">RFC 396</a>. RFC 396 was based on
the very clever notion of defining lifetimes based on the dominator
tree. The idea (in my own words here) was that a lifetime consists of
a dominator node (the entry point <code>H</code>) along with a series of
&ldquo;tails&rdquo; <code>T</code>. The lifetime then consisted of all nodes that were
dominated by <code>H</code> but which dominated one of the tails <code>T</code>. Moreover,
you have as a consistency condition, that for every edge <code>V -&gt; W</code> in
the CFG, if <code>W != H</code> is in the lifetime, then <code>V</code> is in the lifetime.</p>
<p>The RFC&rsquo;s definition is somewhat different but (I believe) equivalent.
It defines a non-lexical lifetime as a set R of vertices in the CFG,
such that:</p>
<ol>
<li>R is a subtree (i.e. a connected subgraph) of the dominator tree.</li>
<li>If W is a nonroot vertex of R, and <code>V -&gt; W</code> is an edge in the CFG
such that V doesn&rsquo;t strictly dominate W, then V is in R.</li>
</ol>
<p>In the case of our example above, the dominator tree looks like this
(I&rsquo;m labeling the nodes as well):</p>
<ul>
<li>A: <code>let mut map = HashMap::new();</code>
<ul>
<li>B: <code>let key = ...;</code>
<ul>
<li>C: <code>let map_ref1 = &amp;mut map</code>
<ul>
<li>D: <code>map_ref1.get_mut(&amp;key)</code>
<ul>
<li>E: <code>Some(value) =&gt; value</code></li>
<li>F: <code>map.insert(key.clone(), V::default())</code>
<ul>
<li>G: <code>let map_ref2 = &amp;mut map</code>
<ul>
<li>H: <code>map_ref2.get_mut(&amp;key).unwrap()</code></li>
</ul>
</li>
</ul>
</li>
<li>I: <code>use(value)</code></li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>Here the lifetime <code>'m1</code> would be a set containing <em>at least</em> {D, E, I}, because the value in
question is used in those places. But then there is an edge in the CFG from H to I,
and thus by rule #2, H must be in <code>'m1</code> as well. But then rule 1 will require that F and G
are in the set, and hence the resulting lifetime will be {D, E, F, G, H, I}. This implies
then that the calls to <code>insert</code> and <code>get_mut</code> are disallowed.</p>
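<p>The closure computation just described can be checked mechanically. The sketch below hard-codes the dominator tree and CFG edges of the running example and repeatedly applies the two rules until nothing changes. The names <code>idom</code>, <code>EDGES</code>, <code>strictly_dominates</code> and <code>close_lifetime</code> are mine, not the RFC&rsquo;s, and the code assumes every seed node is dominated by the chosen root.</p>

```rust
use std::collections::BTreeSet;

/// Immediate dominator of each node in the running example (`A` is the entry).
fn idom(node: char) -> Option<char> {
    match node {
        'B' => Some('A'),
        'C' => Some('B'),
        'D' => Some('C'),
        'E' | 'F' | 'I' => Some('D'),
        'G' => Some('F'),
        'H' => Some('G'),
        _ => None,
    }
}

/// CFG edges of the running example.
const EDGES: [(char, char); 9] = [
    ('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'), ('D', 'F'),
    ('F', 'G'), ('G', 'H'), ('E', 'I'), ('H', 'I'),
];

/// True if `v` strictly dominates `w`.
fn strictly_dominates(v: char, w: char) -> bool {
    let mut cur = idom(w);
    while let Some(n) = cur {
        if n == v {
            return true;
        }
        cur = idom(n);
    }
    false
}

/// Smallest superset of `seed`, rooted at `root`, that satisfies both rules.
fn close_lifetime(root: char, seed: &[char]) -> BTreeSet<char> {
    let mut region: BTreeSet<char> = seed.iter().copied().collect();
    region.insert(root);
    loop {
        let mut next = region.clone();
        for &w in &region {
            if w == root {
                continue;
            }
            // Rule 1: the region must stay connected in the dominator tree.
            if let Some(parent) = idom(w) {
                next.insert(parent);
            }
            // Rule 2: pull in predecessors that do not strictly dominate `w`.
            for &(v, target) in EDGES.iter() {
                if target == w && !strictly_dominates(v, w) {
                    next.insert(v);
                }
            }
        }
        if next == region {
            return region;
        }
        region = next;
    }
}
```

<p>Calling <code>close_lifetime('D', &amp;['D', 'E', 'I'])</code> yields the set {D, E, F, G, H, I}, matching the reasoning above: the edge from H to I drags in H, and connectivity then drags in G and F.</p>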
<h3 id="the-outlives-relation-in-light-of-control-flow">The outlives relation in light of control-flow</h3>
<p>In my <a href="http://smallcultfollowing.com/babysteps/blog/2016/05/04/non-lexical-lifetimes-based-on-liveness/">previous post</a>, I defined a lifetime as simply a set of
points in the control-flow graph and showed how we can use liveness to
ensure that references are valid at each point where they are
used. But that is not the full set of constraints we must consider. We
must also consider the <code>'a: 'b</code> constraints that arise as a result of
type-checking as well as where clauses.</p>
<p>The constraint <code>'a: 'b</code> means &ldquo;the lifetime <code>'a</code> outlives <code>'b</code>&rdquo;. It
basically means that <code>'a</code> corresponds to something <em>at least as long
as</em> <code>'b</code> (note that the outlives relation, like many other relations
such as dominators and subtyping, is reflexive &ndash; so it&rsquo;s ok for <code>'a</code>
and <code>'b</code> to be equally big). The intuition here is that, if you have a
reference with lifetime <code>'a</code>, it is ok to approximate that lifetime to
something shorter. This corresponds to a subtyping rule like:</p>
<pre><code>'a: 'b
----------------------
&amp;'a mut T &lt;: &amp;'b mut T
</code></pre>
<p>In English, you can approximate a mutable reference of type <code>&amp;'a mut T</code> to a mutable reference of type <code>&amp;'b mut T</code> so long as the new
lifetime <code>'b</code> is no longer than <code>'a</code> (a similar rule, differing in one
detail, governs shared references).</p>
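<p>This rule can be seen directly in Rust source. In the sketch below, the hypothetical function <code>shorten</code> does nothing but return its argument; it type-checks only because the where clause supplies <code>'a: 'b</code>, which lets the compiler treat the <code>&amp;'a mut T</code> as a <code>&amp;'b mut T</code>.</p>

```rust
/// Approximates a mutable reference to a (possibly) shorter lifetime.
/// Without the `'a: 'b` bound, the body would be rejected.
fn shorten<'a, 'b, T>(r: &'a mut T) -> &'b mut T
where
    'a: 'b,
{
    r
}
```

<p>Swapping the bound to <code>'b: 'a</code> makes the function fail to compile, since that would amount to lengthening the loan rather than shortening it.</p>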
<p>We&rsquo;re going to see that for the type system to work most smoothly, we
really want this subtyping relation to be extended to take into
account the <em>point P in the control-flow graph where it must hold</em>. So
we might write a rule like this instead:</p>
<pre><code>('a: 'b) at P
--------------
(&amp;'a mut T &lt;: &amp;'b mut T) at P
</code></pre>
<p>However, let&rsquo;s ignore that for a second and stick to the simpler
version of the subtyping rules that I showed at first. This is
sufficient for the running example. Once we&rsquo;ve fully explored that
I&rsquo;ll come back and show a second example where we run into a spot of
trouble.</p>
<h3 id="running-example-in-pseudo-mir">Running example in pseudo-MIR</h3>
<p>Before we go any further, let&rsquo;s transform our running example into a
more MIR-like form, based on a control-flow graph. I will use the
convention that each basic block is assigned a letter (e.g., A) and
individual statements (or the terminator, in MIR speak) in the basic
block are named via the block and an index. So <code>A/0</code> is the call to
<code>HashMap::new()</code> and <code>A/2</code> is the <code>goto</code> terminator.</p>
<pre tabindex="0"><code>                A [ map = HashMap::new() ]
                1 [ key = ...            ]
                2 [ goto                 ]
                      |
                      v
                B [ map_ref1 = &amp;mut map          ]
                1 [ tmp = map_ref1.get_mut(&amp;key) ]
                2 [ switch(tmp)                  ]
                      |          |
                     Some       None
                      |          |
                      v          v
C [ v1 = (tmp as Some).0 ]  D [ map.insert(...)                      ]
1 [ value = v1           ]  1 [ map_ref2 = &amp;mut map                  ]
2 [ goto                 ]  2 [ v2 = map_ref2.get_mut(&amp;key).unwrap() ]
                      |     3 [ value = v2                           ]
                      |     4 [ goto                                 ]
                      |          |
                      v          v
                   E [ use(value) ]
</code></pre><p>Let&rsquo;s assume that the types of all these variables are as follows (I&rsquo;m
simplifying in various respects from what the real MIR would do, just
to keep the number of temporaries and so forth under control):</p>
<ul>
<li><code>map: HashMap&lt;K,V&gt;</code></li>
<li><code>key: K</code></li>
<li><code>map_ref1: &amp;'m1 mut HashMap&lt;K,V&gt;</code></li>
<li><code>tmp: Option&lt;&amp;'v1 mut V&gt;</code></li>
<li><code>v1: &amp;'v1 mut V</code></li>
<li><code>value: &amp;'v0 mut V</code></li>
<li><code>map_ref2: &amp;'m2 mut HashMap&lt;K,V&gt;</code></li>
<li><code>v2: &amp;'v2 mut V</code></li>
</ul>
<p>If we type-check the MIR, we will derive (at least) the following
outlives relationships between these lifetimes (these fall out from
the rules on subtyping above; if you&rsquo;re not sure on that point, I have
an explanation of how it works in Appendix B below):</p>
<ul>
<li><code>'m1: 'v1</code> &ndash; because of B/1</li>
<li><code>'m2: 'v2</code> &ndash; because of D/2</li>
<li><code>'v1: 'v0</code> &ndash; because of C/1</li>
<li><code>'v2: 'v0</code> &ndash; because of D/3</li>
</ul>
<p>In addition, the liveness rules will add some inclusion constraints
as well. In particular, the constraints on <code>'v0</code> (the lifetime of the <code>value</code>
reference) will be as follows:</p>
<ul>
<li><code>'v0: E/0</code> &ndash; <code>value</code> is live here</li>
<li><code>'v0: C/2</code> &ndash; <code>value</code> is live here</li>
<li><code>'v0: D/4</code> &ndash; <code>value</code> is live here</li>
</ul>
<p>For now, let&rsquo;s just treat the outlives relation as a &ldquo;superset&rdquo;
relation.  So <code>'m1: 'v1</code>, for example, requires that <code>'m1</code> be a
superset of <code>'v1</code>. In turn, <code>'v0: E/0</code> can be written <code>'v0: {E/0}</code>.
In that case, if we turn the crank and compute some minimal lifetimes
that satisfy the various constraints, we wind up with the following
values for each lifetime:</p>
<ul>
<li><code>'v0 = {C/2, D/4, E/0}</code></li>
<li><code>'v1 = {C/*, D/4, E/0}</code></li>
<li><code>'m1 = {B/*, C/*, E/0, D/4}</code></li>
<li><code>'v2 = {C/2, D/{3,4}, E/0}</code></li>
<li><code>'m2 = {C/2, D/{2,3,4}, E/0}</code></li>
</ul>
<p>This turns out not to yield any errors, but you can see some kind of
surprising results. For example, the lifetime assigned to <code>v1</code> (the
value from the <code>Some</code> arm) includes some points that are in the <code>None</code>
arm &ndash; e.g., D/4. This is because <code>'v1: 'v0</code> (subtyping from the
assignment in C/1) and <code>'v0: {D/4}</code> (liveness). It turns out you can
craft examples where these &ldquo;extra blocks&rdquo; pose a problem.</p>
<h3 id="simple-superset-considered-insufficient">Simple superset considered insufficient</h3>
<p>To see where these extra blocks start to get us into trouble, consider
this example (here I have annotated the types of some variables, as
well as various lifetimes, inline). This is a variation on the
previous theme in which there are two maps. This time, along one
branch, <code>v0</code> will equal a reference <code>v1</code> pointing into <code>map1</code>, but
in the <code>else</code> branch, we assign <code>v0</code> from a reference <code>v2</code>
pointing into <code>map2</code>. After that assignment, we try to insert into
<code>map1</code>.  (This might arise for example if <code>map1</code> represents a cache
against some larger <code>map2</code>.)</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">map1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">HashMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">map2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">HashMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">key</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">..</span><span class="p">.;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">map_ref1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">map1</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">v1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map_ref1</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">v0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">if</span><span class="w"> </span><span class="n">some_condition</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">v0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v1</span><span class="p">.</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">map_ref2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">map2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">v2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map_ref2</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">v0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v2</span><span class="p">.</span><span class="n">unwrap</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map1</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">use</span><span class="p">(</span><span class="n">v0</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>Let&rsquo;s view this in CFG form:</p>
<pre tabindex="0"><code>                A [ map1 = HashMap::new()       ]
                1 [ map2 = HashMap::new()       ]
                2 [ key: K = ...                ]
                3 [ map_ref1 = &amp;mut map1        ]
                4 [ v1 = map_ref1.get_mut(&amp;key) ]
                5 [ if some_condition           ]
                          |               |
                         true           false
                          |               |
                          v               v
      B [ v0 = v1.unwrap() ]   C [ map_ref2 = &amp;mut map2        ]
      1 [ goto             ]   1 [ v2 = map_ref2.get_mut(&amp;key) ]
                          |    2 [ v0 = v2.unwrap()            ]
                          |    3 [ map1.insert(...)            ]
                          |    4 [ goto                        ]
                          |               |
                          v               v
                        D [ use(v0)       ]
</code></pre><p>The types of the interesting variables are as follows:</p>
<ul>
<li><code>v0: &amp;'v0 mut V</code></li>
<li><code>map_ref1: &amp;'m1 mut HashMap&lt;K,V&gt;</code></li>
<li><code>v1: Option&lt;&amp;'v1 mut V&gt;</code></li>
<li><code>map_ref2: &amp;'m2 mut HashMap&lt;K,V&gt;</code></li>
<li><code>v2: Option&lt;&amp;'v2 mut V&gt;</code></li>
</ul>
<p>The outlives relations that result from type-checking this fragment are as follows:</p>
<ul>
<li><code>'m1: 'v1</code> from A/4</li>
<li><code>'v1: 'v0</code> from B/0</li>
<li><code>'m2: 'v2</code> from C/1</li>
<li><code>'v2: 'v0</code> from C/2</li>
<li><code>'v0: {B/1, C/3, C/4, D/0}</code> from liveness of <code>v0</code></li>
<li><code>'m1: {A/3, A/4}</code> from liveness of <code>map_ref1</code></li>
<li><code>'v1: {A/5, B/0}</code> from liveness of <code>v1</code></li>
<li><code>'m2: {C/0, C/1}</code> from liveness of <code>map_ref2</code></li>
<li><code>'v2: {C/2}</code> from liveness of <code>v2</code></li>
</ul>
<p>Following the simple &ldquo;outlives is superset&rdquo; rules we&rsquo;ve covered so far,
this in turn implies the lifetime <code>'m1</code> would be <code>{A/3, A/4, A/5, B/*, C/3, C/4, D/0}</code>. Note that this includes <code>C/3</code>, precisely where we <em>would</em>
call <code>map1.insert</code>, which means we will get an error at this point.</p>
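<p>The superset reading can be sketched as a small fixed-point computation. This is only an illustration, not how the compiler is implemented: points are plain strings, the names <code>solve</code>, <code>seed</code> and <code>two_map_example</code> are hypothetical, and the liveness sets are copied from the list above.</p>
<pre><code class="language-rust">use std::collections::{BTreeMap, BTreeSet};

type Point = &amp;'static str;
type Lifetime = BTreeSet&lt;Point&gt;;

// Treat `'a: 'b` as &quot;'a is a superset of 'b&quot;: keep copying the points
// of the shorter lifetime into the longer one until nothing changes.
fn solve(
    mut lifetimes: BTreeMap&lt;&amp;'static str, Lifetime&gt;,
    outlives: &amp;[(&amp;'static str, &amp;'static str)],
) -&gt; BTreeMap&lt;&amp;'static str, Lifetime&gt; {
    loop {
        let mut changed = false;
        for &amp;(longer, shorter) in outlives {
            let to_add: Vec&lt;Point&gt; = lifetimes[shorter].iter().copied().collect();
            let target = lifetimes.get_mut(longer).expect(&quot;unknown lifetime&quot;);
            for p in to_add {
                changed |= target.insert(p);
            }
        }
        if !changed {
            return lifetimes;
        }
    }
}

// Build a lifetime from the points where a variable is live.
fn seed(points: &amp;[Point]) -&gt; Lifetime {
    points.iter().copied().collect()
}

// The two-map example: liveness seeds each lifetime, then the outlives
// constraints from type-checking propagate points between them.
fn two_map_example() -&gt; BTreeMap&lt;&amp;'static str, Lifetime&gt; {
    let mut lifetimes = BTreeMap::new();
    lifetimes.insert(&quot;'v0&quot;, seed(&amp;[&quot;B/1&quot;, &quot;C/3&quot;, &quot;C/4&quot;, &quot;D/0&quot;]));
    lifetimes.insert(&quot;'m1&quot;, seed(&amp;[&quot;A/3&quot;, &quot;A/4&quot;]));
    lifetimes.insert(&quot;'v1&quot;, seed(&amp;[&quot;A/5&quot;, &quot;B/0&quot;]));
    lifetimes.insert(&quot;'m2&quot;, seed(&amp;[&quot;C/0&quot;, &quot;C/1&quot;]));
    lifetimes.insert(&quot;'v2&quot;, seed(&amp;[&quot;C/2&quot;]));
    let outlives = [
        (&quot;'m1&quot;, &quot;'v1&quot;), // from A/4
        (&quot;'v1&quot;, &quot;'v0&quot;), // from B/0
        (&quot;'m2&quot;, &quot;'v2&quot;), // from C/1
        (&quot;'v2&quot;, &quot;'v0&quot;), // from C/2
    ];
    solve(lifetimes, &amp;outlives)
}
</code></pre>
<p>Looking up <code>'m1</code> in the map returned by <code>two_map_example()</code> gives a set that contains <code>C/3</code>, the point of the <code>map1.insert</code> call.</p>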
<h3 id="location-area-outlives">Location-aware outlives</h3>
<p>What I propose as a solution is to have the outlives relationship take
into account the current position. As I sketched above, the rough idea
is that the <code>'a: 'b</code> relationship becomes <code>('a: 'b) at P</code> &ndash; meaning
that <code>'a</code> must outlive <code>'b</code> <em>at the point P</em>. We can define this
relation as follows:</p>
<ul>
<li>let S be the set of all points in <code>'b</code> reachable from P,
<ul>
<li>without passing through the entry point of <code>'b</code>
<ul>
<li>reminder: the entry point of <code>'b</code> is the mutual dominator of all points in <code>'b</code></li>
</ul>
</li>
</ul>
</li>
<li>if <code>'a</code> is a superset of <code>S</code>,</li>
<li>then <code>('a: 'b) at P</code></li>
</ul>
<p>Basically, the idea is that <code>('a: 'b) at P</code> means that, given that we
have arrived at point P, any points that we can reach from here that
are still in <code>'b</code> are also in <code>'a</code>.</p>
<p>If we apply this new definition to the outlives constraints from the
previous section, we see a key difference in the result. In
particular, the assignment <code>v0 = v1.unwrap()</code> in B/0 generates the
constraint <code>('v1: 'v0) at B/0</code>. <code>'v0</code> is <code>{B/1, C/3, C/4, D/0}</code>.
Before, this meant that <code>'v1</code> must include <code>C/3</code> and <code>C/4</code>, but now we
can screen those out because they are not reachable from <code>B/0</code> (at
least, not without calling the enclosing function again). Therefore,
the result is that <code>'v1</code> becomes <code>{A/5, B/0, B/1, D/0}</code>, and hence
<code>'m1</code> becomes <code>{A/3, A/4, A/5, B/0, B/1, D/0}</code> &ndash; notably, it no
longer includes C/3, and hence no error is reported.</p>
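<p>Here is a rough sketch of that computation on the two-map CFG. It rests on several assumptions: points are plain strings, every helper name (<code>chain</code>, <code>reachable</code>, <code>solve_at</code>, <code>seed</code>, <code>two_map_example_at</code>) is hypothetical, and the &ldquo;without passing through the entry point&rdquo; restriction is left out, on the assumption that it cannot matter in a CFG without cycles like this one.</p>
<pre><code class="language-rust">use std::collections::{BTreeMap, BTreeSet};

type Point = &amp;'static str;
type Lifetime = BTreeSet&lt;Point&gt;;
type Cfg = BTreeMap&lt;Point, Vec&lt;Point&gt;&gt;;

// Add edges p0 -&gt; p1 -&gt; p2 ... for a straight-line run of points.
fn chain(cfg: &amp;mut Cfg, points: &amp;[Point]) {
    for pair in points.windows(2) {
        cfg.entry(pair[0]).or_default().push(pair[1]);
    }
}

// Every point reachable from `start`, including `start` itself.
fn reachable(cfg: &amp;Cfg, start: Point) -&gt; BTreeSet&lt;Point&gt; {
    let mut seen = BTreeSet::new();
    let mut stack = vec![start];
    while let Some(p) = stack.pop() {
        if seen.insert(p) {
            if let Some(succs) = cfg.get(p) {
                stack.extend(succs.iter().copied());
            }
        }
    }
    seen
}

// Solve `('a: 'b) at P` constraints: `'a` only has to contain the points
// of `'b` that are reachable from `P`. Simplification: the entry-point
// restriction from the definition is not modeled here.
fn solve_at(
    cfg: &amp;Cfg,
    mut lifetimes: BTreeMap&lt;&amp;'static str, Lifetime&gt;,
    constraints: &amp;[(&amp;'static str, &amp;'static str, Point)],
) -&gt; BTreeMap&lt;&amp;'static str, Lifetime&gt; {
    loop {
        let mut changed = false;
        for &amp;(longer, shorter, at) in constraints {
            let reach = reachable(cfg, at);
            let to_add: Vec&lt;Point&gt; =
                lifetimes[shorter].intersection(&amp;reach).copied().collect();
            let target = lifetimes.get_mut(longer).expect(&quot;unknown lifetime&quot;);
            for p in to_add {
                changed |= target.insert(p);
            }
        }
        if !changed {
            return lifetimes;
        }
    }
}

// Build a lifetime from the points where a variable is live.
fn seed(points: &amp;[Point]) -&gt; Lifetime {
    points.iter().copied().collect()
}

fn two_map_example_at() -&gt; BTreeMap&lt;&amp;'static str, Lifetime&gt; {
    let mut cfg = Cfg::new();
    chain(&amp;mut cfg, &amp;[&quot;A/0&quot;, &quot;A/1&quot;, &quot;A/2&quot;, &quot;A/3&quot;, &quot;A/4&quot;, &quot;A/5&quot;, &quot;B/0&quot;, &quot;B/1&quot;, &quot;D/0&quot;]);
    chain(&amp;mut cfg, &amp;[&quot;A/5&quot;, &quot;C/0&quot;, &quot;C/1&quot;, &quot;C/2&quot;, &quot;C/3&quot;, &quot;C/4&quot;, &quot;D/0&quot;]);

    let mut lifetimes = BTreeMap::new();
    lifetimes.insert(&quot;'v0&quot;, seed(&amp;[&quot;B/1&quot;, &quot;C/3&quot;, &quot;C/4&quot;, &quot;D/0&quot;]));
    lifetimes.insert(&quot;'m1&quot;, seed(&amp;[&quot;A/3&quot;, &quot;A/4&quot;]));
    lifetimes.insert(&quot;'v1&quot;, seed(&amp;[&quot;A/5&quot;, &quot;B/0&quot;]));
    lifetimes.insert(&quot;'m2&quot;, seed(&amp;[&quot;C/0&quot;, &quot;C/1&quot;]));
    lifetimes.insert(&quot;'v2&quot;, seed(&amp;[&quot;C/2&quot;]));

    // Each constraint now carries the point where it arises.
    let constraints = [
        (&quot;'m1&quot;, &quot;'v1&quot;, &quot;A/4&quot;),
        (&quot;'v1&quot;, &quot;'v0&quot;, &quot;B/0&quot;),
        (&quot;'m2&quot;, &quot;'v2&quot;, &quot;C/1&quot;),
        (&quot;'v2&quot;, &quot;'v0&quot;, &quot;C/2&quot;),
    ];
    solve_at(&amp;cfg, lifetimes, &amp;constraints)
}
</code></pre>
<p>The only change from a plain superset solver is the intersection with <code>reachable(cfg, at)</code>; that is what keeps <code>C/3</code> and <code>C/4</code> out of <code>'v1</code>, and therefore out of <code>'m1</code>.</p>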
<h3 id="conclusion-and-some-discussion-of-alternative-approaches">Conclusion and some discussion of alternative approaches</h3>
<p>This post dug in some detail into how we can define the outlives
relationship between lifetimes. Interestingly, in order to support the
examples we want to support, when we move to NLL, we have to be able
to support <em>gaps</em> in lifetimes. In all the examples in this post, the
key idea was that we want to exit the lifetime when we enter one
branch of a conditional, but then &ldquo;re-enter&rdquo; it afterwards when we
join control-flow after the conditional. This works out ok because we
know that, when we exit the first time, all references with that
lifetime are dead (or else the lifetime would have to include that
exit point).</p>
<p>There is another way to view it: one can view a lifetime as a set of
<em>paths</em> through the control-flow graph, in which case the points after
the <code>match</code> or after the <code>if</code> would appear only on paths that
happened to pass through the right arm of the match. They are
&ldquo;conditionally included&rdquo;, in other words, depending on how
control-flow proceeded.</p>
<p>One downside of this approach is that it requires augmenting the
subtyping relationship with a location. I don&rsquo;t see this causing a
problem, but it&rsquo;s not something I&rsquo;ve seen before. We&rsquo;ll have to see as
we go. It might e.g. affect caching.</p>
<h3 id="comments">Comments</h3>
<p>Please comment on
<a href="http://internals.rust-lang.org/t/non-lexical-lifetimes-based-on-liveness/3428/">this internals thread</a>.</p>
<h3 id="appendix-a-an-alternative-variables-have-multiple-types">Appendix A: An alternative: variables have multiple types</h3>
<p>There is another alternative to lifetimes with gaps that we might
consider. We might also consider allowing variables to have multiple
types.  I explored this a bit by using an SSA-like renaming, where
each assignment to a variable yielded a fresh version with its own type. However, I
thought that in the end it felt more complicated than just allowing
lifetimes to have gaps; for one thing, it complicates determining
whether two paths overlap in the borrow checker (different versions of
the same variable are still stored in the same lvalue), and it doesn&rsquo;t
interact as well with the notion of <em>fragments</em> that I talked about in
<a href="http://smallcultfollowing.com/babysteps/blog/2016/05/04/non-lexical-lifetimes-based-on-liveness/">the previous post</a> (though one can use variants of SSA that
operate on fragments, I suppose). Still, it may be worth exploring &ndash;
and there is more precedent for that in the literature, to be sure.  One
advantage of that approach is that one can use &ldquo;continuous lifetimes&rdquo;,
I think, which may be easier to represent in a compact fashion &ndash; on
the other hand, you have a lot more lifetime variables, so that may
not be a win. (Also, I think you still need the outlives relationship
to be location-dependent.)</p>
<h3 id="appendix-b-how-subtyping-links-the-lifetime-of-arguments-and-the-return-value">Appendix B: How subtyping links the lifetime of arguments and the return value</h3>
<p>Given the definition of the <code>get_mut</code> method, the compiler is able to
see that the reference which gets returned is reborrowed from the
<code>self</code> argument. That is, the compiler can see that as long as you are using
the return value, you are (indirectly) using the <code>self</code> reference
as well. This is indicated by the named lifetime parameter <code>'v</code> that
appears in the definition of <code>get_mut</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_mut</span><span class="o">&lt;</span><span class="na">&#39;v</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;v</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">key</span>: <span class="kp">&amp;</span><span class="nc">Key</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="na">&#39;v</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">V</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>There are various ways to think of this signature, and in particular
the named lifetime parameter <code>'v</code>. The most intuitive explanation is that
this parameter indicates that the return value is &ldquo;borrowed from&rdquo; the
<code>self</code> argument (because they share the same lifetime <code>'v</code>). Hence we
could conclude that when we call <code>tmp = map_ref1.get_mut(&amp;key)</code>, the
lifetime of the input (<code>'m1</code>) must outlive the lifetime of the output
(<code>'v1</code>). Written using outlives notation, that would be that this call
requires that <code>'m1: 'v1</code>. This is the right conclusion, but it may be
worth digging a bit more into how the type system actually works
internally.</p>
<p>Specifically, the way the type system works is that when <code>get_mut</code> is
called, to find the signature at that particular callsite,
the lifetime parameter <code>'v</code> is replaced with a new inference variable
(let&rsquo;s call it <code>'0</code>). So at the point where <code>tmp = map_ref1.get_mut(&amp;key)</code>
is called, the signature of <code>get_mut</code> is effectively:</p>
<pre><code>fn(self: &amp;'0 mut HashMap&lt;K,V&gt;,
   key: &amp;'1 K)
   -&gt; Option&lt;&amp;'0 mut V&gt;
</code></pre>
<p>Here you can see that the <code>self</code> parameter is treated like any other
explicit argument, and that the lifetime of the key reference (now
made explicit as <code>'1</code>) is an independent variable from the lifetime of
the <code>self</code> reference. Next we would require that the type of each
supplied argument must be a subtype of what appears in the signature.
In particular, for the <code>self</code> argument, that results in this
requirement:</p>
<pre><code>&amp;'m1 mut HashMap&lt;K,V&gt; &lt;: &amp;'0 mut HashMap&lt;K,V&gt;
</code></pre>
<p>from which we can conclude that <code>'m1: '0</code> must hold. Finally, we
require that the declared return type of the function must be a
subtype of the type of the variable where the return value is stored,
and hence:</p>
<pre><code>         Option&lt;&amp;'0 mut V&gt; &lt;: Option&lt;&amp;'v1 mut V&gt;
implies: &amp;'0 mut V &lt;: &amp;'v1 mut V
implies: '0: 'v1
</code></pre>
<p>So the end result from all of these subtype operations is that we have
two outlives relations:</p>
<pre><code>'m1: '0
'0: 'v1
</code></pre>
<p>These in turn imply an indirect relationship between <code>'m1</code> and <code>'v1</code>:</p>
<pre><code>'m1: 'v1
</code></pre>
<p>This final relationship is, of course, precisely what our intuition led
us to in the first place: the lifetime of the reference to the map
must outlive the lifetime of the returned value.</p>
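<p>To make the link concrete, here is a minimal sketch using a made-up <code>Table</code> type in place of <code>HashMap</code>. Its <code>get_mut</code> has the same shape of signature as the one above, so the reference it returns shares the lifetime <code>'v</code> with the borrow of <code>self</code>. The names <code>Table</code>, <code>entries</code> and <code>bump</code> are assumptions made for this example only.</p>
<pre><code class="language-rust">// Hypothetical stand-in for a map: a list of (key, value) pairs.
struct Table {
    entries: Vec&lt;(String, i32)&gt;,
}

impl Table {
    // The returned reference carries the same lifetime `'v` as the
    // `self` borrow, so `self` stays borrowed while the result is in use.
    fn get_mut&lt;'v&gt;(&amp;'v mut self, key: &amp;str) -&gt; Option&lt;&amp;'v mut i32&gt; {
        self.entries
            .iter_mut()
            .find(|(k, _)| k.as_str() == key)
            .map(|(_, v)| v)
    }
}

// Increment the value stored under `key`, reporting whether it existed.
fn bump(table: &amp;mut Table, key: &amp;str) -&gt; bool {
    match table.get_mut(key) {
        Some(v) =&gt; {
            *v += 1;
            true
        }
        None =&gt; false,
    }
}
</code></pre>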
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Non-lexical lifetimes based on liveness</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/05/04/non-lexical-lifetimes-based-on-liveness/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/05/04/non-lexical-lifetimes-based-on-liveness/</id><published>2016-05-04T00:00:00+00:00</published><updated>2016-05-04T05:19:04-04:00</updated><content type="html"><![CDATA[<p>In my <a href="http://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/">previous post</a> I outlined several cases that we would like
to improve with Rust&rsquo;s current borrow checker. This post discusses one
possible scheme for solving those. The heart of the post is two key ideas:</p>
<ol>
<li>Define a <strong>lifetime</strong> as a <strong>set of points in the control-flow
graph</strong>, where a <strong>point</strong> here refers to some particular statement
in the control-flow graph (i.e., not a <a href="https://en.wikipedia.org/wiki/Control_flow_graph">basic block</a>, but some
statement within a basic block).</li>
<li>Use <strong>liveness</strong> as the basis for deciding where a variable&rsquo;s type
must be valid.</li>
</ol>
<p>The rest of this post expounds on these two ideas and shows how they
affect the various examples from the previous post.</p>
<!-- more -->
<h3 id="problem-case-1-references-assigned-into-a-variable">Problem case #1: references assigned into a variable</h3>
<p>To see better what these two ideas mean &ndash; and why we need both of
them &ndash; let&rsquo;s look at the initial example from <a href="http://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/">my previous post</a>.
Here we are storing a reference to <code>&amp;mut data[..]</code> into the variable
<code>slice</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="sc">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;b&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;c&#39;</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;-+ lifetime today
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">capitalize</span><span class="p">(</span><span class="n">slice</span><span class="p">);</span><span class="w">         </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;d&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR!  //   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;e&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR!  //   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;f&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR!  //   |
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;------------------------------+
</span></span></span></code></pre></div><p>As shown, the lifetime of this reference today winds up being the
subset of the block that starts at the <code>let</code> and stretches until the
ending <code>}</code>. This results in compilation errors when we attempt to push
to <code>data</code>.  The reason is that a borrow like <code>&amp;mut data[..]</code>
effectively &ldquo;locks&rdquo; the <code>data[..]</code> for the lifetime of the borrow,
meaning that <code>data</code> becomes off limits and can&rsquo;t be used (this
&ldquo;locking&rdquo; is just a metaphor for the type system rules; there is of
course nothing happening at runtime).</p>
<p>What we would like is to observe that <code>slice</code> is <em>dead</em> &ndash; which is
<a href="https://en.wikipedia.org/wiki/Live_variable_analysis">compiler-speak</a> for &ldquo;it won&rsquo;t ever be used again&rdquo; &ndash; after the call to
<code>capitalize</code>. Therefore, if we had a more flexible lifetime system, we
might compute the lifetime of the <code>slice</code> reference to be something that
ends right after the call to <code>capitalize</code>, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="sc">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;b&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;c&#39;</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;-+ lifetime under this proposal
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">capitalize</span><span class="p">(</span><span class="n">slice</span><span class="p">);</span><span class="w">         </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// &lt;----------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;d&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;e&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;f&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we had this shorter lifetime, then the calls to <code>data.push</code> would
be legal, since the &ldquo;lock&rdquo; is effectively released early.</p>
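<p>For readers who want to try this, below is a self-contained version of <code>bar</code>. The body of <code>capitalize</code> is an assumption (the post does not show it), and <code>bar</code> returns <code>data</code> only so the result can be inspected. Compilers that implement non-lexical lifetimes, which has been the default since the Rust 2018 edition, accept this code.</p>
<pre><code class="language-rust">// Hypothetical helper: upper-cases every character in place.
fn capitalize(data: &amp;mut [char]) {
    for c in data.iter_mut() {
        *c = c.to_ascii_uppercase();
    }
}

fn bar() -&gt; Vec&lt;char&gt; {
    let mut data = vec!['a', 'b', 'c'];
    let slice = &amp;mut data[..];
    capitalize(slice); // last use of `slice`; it is dead afterwards
    data.push('d');
    data.push('e');
    data.push('f');
    data
}
</code></pre>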
<p>At first it might seem like all we have to do to achieve this result
is to adjust the definition of what a lifetime can be to make it more
flexible. In particular, today, once a lifetime must extend beyond the
boundaries of a single statement (e.g., beyond the <code>let</code> statement
here), it must extend all the way till the end of the enclosing block.
So, by adopting a definition of lifetimes that is just &ldquo;a set of
points in the control-flow graph&rdquo;, we lift this constraint, and we can
now express the idea of a lifetime that starts at the <code>&amp;mut data[..]</code>
borrow and ends after the call to <code>capitalize</code>, which we couldn&rsquo;t even
express before.</p>
<p>But it turns out that is not quite enough. There is another rule in
the type system today that causes us a problem. This rule states that
the type of a variable must outlive the variable&rsquo;s scope. In other
words, if a variable contains a reference, that reference must be
valid for the entire scope of the variable. So, in our example above,
the reference created by the <code>&amp;mut data[..]</code> borrow winds up being
stored in the variable <code>slice</code>. This means that the lifetime of that
reference must include the scope of <code>slice</code> &ndash; which stretches from
the <code>let</code> until the closing <code>}</code>. In other words, even if we adopt more
flexible lifetimes, if we change nothing else, we wind up with the
same lifetime as before.</p>
<p>You might think we could just remove the rule altogether, and say that
the lifetime of a reference must include all the points where the
reference is used, with no special treatment for references stored into
variables. In this particular example we&rsquo;ve been looking at, that
would do the right thing: the lifetime of <code>slice</code> would only have to
outlive the call to <code>capitalize</code>. But it starts to go wrong if the
control-flow gets more complicated:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">baz</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="sc">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;b&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;c&#39;</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;-+ lifetime if we ignored
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">                     </span><span class="c1">//   | variables altogether
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">capitalize</span><span class="p">(</span><span class="n">slice</span><span class="p">);</span><span class="w">     </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// &lt;------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;d&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// Should be error, but would not be.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;e&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;f&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here again the reference <code>slice</code> would <em>still</em> only be required to live
until after the call to <code>capitalize</code>, since that is the only place it
is used. However, in this variation, that is not the correct behavior:
the reference <code>slice</code> is in fact still <a href="https://en.wikipedia.org/wiki/Live_variable_analysis">live</a> after the call to
<code>capitalize</code>, since it will be used again in the next iteration of the
loop. <strong>The problem here is that we are exiting the lifetime (after
the call to <code>capitalize</code>) and then re-entering it (on the loop
backedge) but without reinitializing <code>slice</code>.</strong></p>
<p>One way to address this problem would be to modify the definition of a
lifetime. The definition I gave earlier was very flexible and allowed
any set of points in the control-flow to be included. Perhaps we want
some special rules to ensure that control flow is continuous? This is
the approach that <a href="https://github.com/rust-lang/rfcs/pull/396">RFC 396</a> took, for example. I initially explored
this approach but found that it caused problems with more advanced
cases, such as a variation on problem case 3 we will examine in a
later post.</p>
<p>(<strong>EDITED:</strong> The paragraph above incorrectly suggested that
<a href="https://github.com/rust-lang/rfcs/pull/396">RFC 396</a> had special rules around backedges. Edited to clarify.)</p>
<p>Instead, I have opted to weaken &ndash; but not entirely remove &ndash; the
original rule.  The original rule was something like this (expressed
as an <a href="https://en.wikipedia.org/wiki/Rule_of_inference">inference rule</a>):</p>
<pre><code>scope(x) = 's
T: 's
------------------
let x: T OK
</code></pre>
<p>In other words, it&rsquo;s ok to declare a variable <code>x</code> with type <code>T</code>, as
long as <code>T</code> outlives the scope <code>'s</code> of that variable. My new version is more like
this:</p>
<pre><code>live-range(x) = 's
T: 's
------------------
let x: T OK
</code></pre>
<p>Here I have substituted <em>live-range</em> for <em>scope</em>. By <a href="https://en.wikipedia.org/wiki/Live_variable_analysis">live-range</a>
I mean &ldquo;the set of points in the CFG where <code>x</code> may be later used&rdquo;,
effectively. If we apply this to our two variations, we will see that,
in the first example, the variable <code>slice</code> is <em>dead</em> after the call to
capitalize: it will never be used again. But in the second variation,
the one with a loop, <code>slice</code> is <em>live</em>, because it may be used in the
next iteration. This accounts for the different behavior:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Variation #1: `slice` is dead after call to capitalize,
</span></span></span><span class="line"><span class="cl"><span class="c1">// so the lifetime ends
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="sc">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;b&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;c&#39;</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;-+ lifetime under this proposal
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">capitalize</span><span class="p">(</span><span class="n">slice</span><span class="p">);</span><span class="w">         </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// &lt;----------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;d&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;e&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;f&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Variation #2: `slice` is live after call to capitalize,
</span></span></span><span class="line"><span class="cl"><span class="c1">// so the lifetime encloses the entire loop.
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">baz</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="sc">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;b&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;c&#39;</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;---------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">loop</span><span class="w"> </span><span class="p">{</span><span class="w">                                               </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">capitalize</span><span class="p">(</span><span class="n">slice</span><span class="p">);</span><span class="w">                               </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;d&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR!                        //   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">                                                    </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// &lt;------------------------------------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// But note that `slice` is dead here, so the lifetime ends:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;e&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;f&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="refining-the-proposal-using-fragments">Refining the proposal using fragments</h3>
<p>One problem with the analysis as I presented it thus far is that it is
based on liveness of individual variables. This implies that we lose
precision when references are moved into structs or tuples. So, for
example, while this bit of code <em>will</em> type-check:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data1</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;--+ data1 is &#34;locked&#34; here
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data2</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;----+ data2 is &#34;locked&#34; here
</span></span></span><span class="line"><span class="cl"><span class="k">use</span><span class="p">(</span><span class="n">x</span><span class="p">);</span><span class="w">                 </span><span class="c1">//    | |
</span></span></span><span class="line"><span class="cl"><span class="c1">// &lt;--------------------------+ |
</span></span></span><span class="line"><span class="cl"><span class="n">data1</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span><span class="w">          </span><span class="c1">//      |
</span></span></span><span class="line"><span class="cl"><span class="k">use</span><span class="p">(</span><span class="n">y</span><span class="p">);</span><span class="w">                 </span><span class="c1">//      |
</span></span></span><span class="line"><span class="cl"><span class="c1">// &lt;----------------------------+
</span></span></span><span class="line"><span class="cl"><span class="n">data2</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>it would cause errors if we moved those two references into a tuple:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">tuple</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data1</span><span class="p">[</span><span class="o">..</span><span class="p">],</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data2</span><span class="p">[</span><span class="o">..</span><span class="p">]);</span><span class="w"> </span><span class="c1">// &lt;--+ data1 and data2
</span></span></span><span class="line"><span class="cl"><span class="k">use</span><span class="p">(</span><span class="n">tuple</span><span class="p">.</span><span class="mi">0</span><span class="p">);</span><span class="w">                                 </span><span class="c1">//    | are locked here
</span></span></span><span class="line"><span class="cl"><span class="n">data1</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span><span class="w">                                </span><span class="c1">//    |
</span></span></span><span class="line"><span class="cl"><span class="k">use</span><span class="p">(</span><span class="n">tuple</span><span class="p">.</span><span class="mi">1</span><span class="p">);</span><span class="w">                                 </span><span class="c1">//    |
</span></span></span><span class="line"><span class="cl"><span class="c1">// &lt;------------------------------------------------+
</span></span></span><span class="line"><span class="cl"><span class="n">data2</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>This is because the variable <code>tuple</code> is live until after the last
field access. <em>However,</em> the <a href="https://github.com/rust-lang/rfcs/blob/master/text/0320-nonzeroing-dynamic-drop.md">dynamic drop</a> analysis is
already computing a set of <em>fragments</em>, which are basically minimal
paths that it needs to retain full resolution around which subparts of
a struct or tuple have been moved. We could probably use similar logic
to determine that we ought to compute the liveness of <code>tuple.0</code> and
<code>tuple.1</code> independently, which would make this example type-check.
(If we did so, then any use of <code>tuple</code> would be considered a &ldquo;gen&rdquo; of
both <code>tuple.0</code> and <code>tuple.1</code>, and any write to <code>tuple</code> would be
considered a &ldquo;kill&rdquo; of both.) This would probably subsume and be
compatible with the fragment logic used for <a href="https://github.com/rust-lang/rfcs/blob/master/text/0320-nonzeroing-dynamic-drop.md">dynamic drop</a>, so it
could be a net simplification.</p>
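<p>To make the gen/kill idea concrete, here is a minimal sketch of a backward liveness analysis that tracks fragments rather than whole variables. Everything in it is an assumption made for illustration: <code>Point</code>, <code>fragments</code>, <code>live_out</code>, and <code>tuple_example</code> are hypothetical names, paths are plain strings, and the control-flow graph is hand-built. It is not how the compiler represents any of this.</p>

```rust
use std::collections::BTreeSet;

/// One point in a hand-built control-flow graph (hypothetical representation).
struct Point {
    /// Paths read at this point (each is a "gen" of its fragments).
    uses: Vec<&'static str>,
    /// Paths overwritten at this point (each is a "kill" of its fragments).
    defs: Vec<&'static str>,
    /// Indices of the successor points.
    succs: Vec<usize>,
}

/// Expands a path into the fragments it covers. For this sketch the only
/// aggregate is `tuple`, which covers `tuple.0` and `tuple.1`.
fn fragments(path: &'static str) -> Vec<&'static str> {
    match path {
        "tuple" => vec!["tuple.0", "tuple.1"],
        other => vec![other],
    }
}

/// Standard backward liveness, iterated to a fixed point. Returns, for each
/// point, the set of fragments that may still be used after that point.
fn live_out(cfg: &[Point]) -> Vec<BTreeSet<&'static str>> {
    let mut live_in: Vec<BTreeSet<&'static str>> = vec![BTreeSet::new(); cfg.len()];
    let mut live_out = live_in.clone();
    let mut changed = true;
    while changed {
        changed = false;
        for i in (0..cfg.len()).rev() {
            // live-out(i) is the union of live-in over all successors.
            let mut out = BTreeSet::new();
            for &s in &cfg[i].succs {
                out.extend(live_in[s].iter().copied());
            }
            // live-in(i) = (live-out(i) minus kills) plus gens.
            let mut inn = out.clone();
            for &d in &cfg[i].defs {
                for f in fragments(d) {
                    inn.remove(&f);
                }
            }
            for &u in &cfg[i].uses {
                inn.extend(fragments(u));
            }
            if out != live_out[i] || inn != live_in[i] {
                live_out[i] = out;
                live_in[i] = inn;
                changed = true;
            }
        }
    }
    live_out
}

/// The tuple example from above, one point per statement:
/// 0: `let tuple = ...`, 1: `use(tuple.0)`, 2: `data1.push(1)`,
/// 3: `use(tuple.1)`, 4: `data2.push(1)`.
fn tuple_example() -> Vec<Point> {
    vec![
        Point { uses: vec![], defs: vec!["tuple"], succs: vec![1] },
        Point { uses: vec!["tuple.0"], defs: vec![], succs: vec![2] },
        Point { uses: vec![], defs: vec![], succs: vec![3] },
        Point { uses: vec!["tuple.1"], defs: vec![], succs: vec![4] },
        Point { uses: vec![], defs: vec![], succs: vec![] },
    ]
}
```

<p>Calling <code>live_out(&amp;tuple_example())</code> reports that only <code>tuple.1</code> is live after <code>use(tuple.0)</code>, so under this sketch the borrow of <code>data1</code> would not need to cover <code>data1.push(1)</code>. Tracking <code>tuple</code> as a single variable would instead keep both borrows alive until the last field access.</p>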
<h3 id="destructors">Destructors</h3>
<p>One further wrinkle that I did not discuss is that any struct with a
destructor encounters special rules. This is because the destructor
may access the references in the struct. These rules were specified in
<a href="https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md">RFC 1238</a> but are colloquially called
<a href="https://github.com/rust-lang/rfcs/blob/master/text/1238-nonparametric-dropck.md">&ldquo;dropck&rdquo;</a>. They basically state that when we create some
variable <code>x</code> whose type <code>T</code> has a destructor, then <code>T</code> must outlive
the <em>parent</em> scope of <code>x</code>. That is, the references in <code>x</code> don&rsquo;t have
to just be valid for the scope of <code>x</code>, they have to be valid for
<em>longer</em> than the scope of <code>x</code>.</p>
<p>In some sense, the dropck rules remain unchanged by all I&rsquo;ve
discussed here. But in another sense dropck may stop being a special
case. The reason is that, in <a href="http://blog.rust-lang.org/2016/04/19/MIR.html">MIR</a>, all drops are made explicit in
the <a href="https://en.wikipedia.org/wiki/Control_flow_graph">control-flow graph</a>, and hence if a variable <code>x</code> has a
destructor, that should show up as &ldquo;just another use&rdquo; of <code>x</code>, and thus
cause the lifetime of any references within to be naturally extended
to cover that destructor. I admit I haven&rsquo;t had time to dig into a lot
of examples here: destructors are historically a very subtle case.</p>
<h3 id="implementation-ramifications">Implementation ramifications</h3>
<p>Those of you familiar with the compiler will realize that there is a
bit of a chicken-and-egg problem with what I have presented
here. Today, the compiler computes the lifetimes of all references in
the <code>typeck</code> pass, which is basically the main type-checking pass that
computes the types of all expressions. We then use the output of this
pass to construct MIR. But in this proposal I am defining lifetimes as
a set of points in the MIR control-flow-graph. What gives?</p>
<p>To make this work, we have to change how the compiler works
internally.  The rough idea is that the <code>typeck</code> pass will no longer
concern itself with regions: it will erase all regions, just as trans
does. This has a number of ancillary benefits, though it also carries
a few complications we have to resolve (maybe a good topic for another
blog post!). We&rsquo;ll then build MIR from this, and hence the initially
constructed MIR will also have no lifetime information (just erased
lifetimes).</p>
<p>Then, looking at each function in the program in turn, we&rsquo;ll do a
safety analysis. We&rsquo;ll start by computing lifetimes &ndash; at this point,
we have the MIR CFG in hand, so we can easily base them on the
CFG. We&rsquo;ll then run the borrowck.  When we are done, we can just
forget about the lifetimes entirely, since all later passes are just
doing optimization and code generation, and they don&rsquo;t care about
lifetimes.</p>
<p>Another interesting question is how to represent lifetimes in the
compiler. The most obvious representation is just to use a bit-set,
but since these lifetimes would require one bit for every statement
within a function, they could grow quite big. There are a number of
ways we could optimize the representation: for example, we could track
the mutual dominator, even promoting it &ldquo;upwards&rdquo; to the innermost
enclosing loop, and only store bits for that subportion of the
graph. This would require fewer bits but it&rsquo;d be a lot more
accounting. I&rsquo;m sure there are other far more clever options as well.
The first step I think would be to gather some statistics about the
size of functions, the number of inference variables per fn, and so
forth.</p>
<p>In any case, a key observation is that, since we only need to store
lifetimes for one function at a time, and only until the end of
borrowck, the precise size is not nearly as important as it would be
today.</p>
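<p>For reference, the &ldquo;most obvious representation&rdquo; could look something like the sketch below: one bit per point in the function&rsquo;s control-flow graph, packed into 64-bit words. The names <code>PointSet</code>, <code>insert</code>, <code>contains</code>, <code>union_with</code>, and <code>outlives</code> are hypothetical, and numbering points as plain <code>usize</code> indices is an assumption; none of this is taken from the compiler.</p>

```rust
/// A lifetime represented as a set of control-flow-graph points,
/// one bit per point (hypothetical sketch).
#[derive(Clone, Debug, PartialEq, Eq)]
struct PointSet {
    words: Vec<u64>,
}

impl PointSet {
    /// Creates an empty set able to hold points `0..num_points`.
    fn new(num_points: usize) -> Self {
        PointSet { words: vec![0; (num_points + 63) / 64] }
    }

    /// Adds `point` to the set; returns true if it was not already present.
    fn insert(&mut self, point: usize) -> bool {
        let (word, bit) = (point / 64, 1u64 << (point % 64));
        let was_absent = (self.words[word] & bit) == 0;
        self.words[word] |= bit;
        was_absent
    }

    /// Returns true if `point` is in the set.
    fn contains(&self, point: usize) -> bool {
        (self.words[point / 64] & (1u64 << (point % 64))) != 0
    }

    /// Grows `self` to include every point of `other`; returns true if
    /// `self` changed. Both sets are assumed to have the same capacity.
    fn union_with(&mut self, other: &PointSet) -> bool {
        let mut changed = false;
        for (a, b) in self.words.iter_mut().zip(&other.words) {
            let merged = *a | *b;
            changed |= merged != *a;
            *a = merged;
        }
        changed
    }

    /// Returns true if `self` includes every point of `other`, which is
    /// what `'self: 'other` means when lifetimes are sets of points.
    fn outlives(&self, other: &PointSet) -> bool {
        other.words.iter().zip(&self.words).all(|(o, s)| (*o & !*s) == 0)
    }
}
```

<p>With this representation, an outlives constraint is just a subset test, and an inference step that grows a lifetime is a bitwise or. The cost is the one mentioned above: every lifetime pays for every point in the function, whether or not it comes anywhere near it.</p>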
<h3 id="conclusion">Conclusion</h3>
<p>Here I presented the key ideas of my current thoughts around
non-lexical lifetimes: using flexible lifetimes coupled with
liveness. I motivated this by examining problem case #1 from
<a href="http://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/">my introduction</a>. I also covered some of the implementation
complications. In future posts, I plan to examine problem cases #2
and #3 &ndash; and in particular to describe how to extend the system to
cover named lifetime parameters, which I&rsquo;ve completely ignored
here. (Spoiler alert: problem cases #2 and #3 are also no longer
problems under this system.)</p>
<p>I also do want to emphasize that this plan is a
&ldquo;work-in-progress&rdquo;. Part of my hope in posting it is that people will
point out flaws or opportunities for improvement. So I wouldn&rsquo;t be
surprised if the final system we wind up with looks quite
different.</p>
<p>(As is my wont lately, I am disabling comments on this post. If you&rsquo;d
like to discuss the ideas in here, please do so in
<a href="http://internals.rust-lang.org/t/non-lexical-lifetimes-based-on-liveness/3428">this internals thread</a> instead.)</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Non-lexical lifetimes: introduction</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/04/27/non-lexical-lifetimes-introduction/</id><published>2016-04-27T00:00:00+00:00</published><updated>2016-04-27T07:52:05-07:00</updated><content type="html"><![CDATA[<p>Over the last few weeks, I&rsquo;ve been devoting my free time to fleshing
out the theory behind <strong>non-lexical lifetimes</strong> (NLL). I think I&rsquo;ve
arrived at a pretty good point and I plan to write various posts
talking about it. Before getting into the details, though, I wanted to
start out with a post that lays out roughly how today&rsquo;s <em>lexical
lifetimes</em> work and gives several examples of problem cases that we
would like to solve.</p>
<!-- more -->
<p>The basic idea of the borrow checker is that values may not be mutated
or moved while they are borrowed. But how do we know whether a value
is borrowed? The idea is quite simple: whenever you create a borrow,
the compiler assigns the resulting reference a <strong>lifetime</strong>. This
lifetime corresponds to the span of the code where the reference may
be used. The compiler will infer this lifetime to be the smallest
lifetime that it can that still encompasses all the uses of the
reference.</p>
<p>Note that Rust uses the term lifetime in a very particular way.  In
everyday speech, the word lifetime can be used in two distinct &ndash; but
similar &ndash; ways:</p>
<ol>
<li>The lifetime of a <strong>reference</strong>, corresponding to the span of time in
which that reference is <strong>used</strong>.</li>
<li>The lifetime of a <strong>value</strong>, corresponding to the span of time
before that value gets <strong>freed</strong> (or, put another way, before the
destructor for the value runs).</li>
</ol>
<p>This second span of time, which describes how long a value is valid,
is of course very important. We refer to that span of time as the
value&rsquo;s <strong>scope</strong>. Naturally, lifetimes and scopes are linked to one
another. Specifically, if you make a reference to a value, the
lifetime of that reference cannot outlive the scope of that value.
Otherwise your reference would be pointing into freed memory.</p>
<p>To better see the distinction between lifetime and scope, let&rsquo;s
consider a simple example. In this example, the vector <code>data</code> is
borrowed (mutably) and the resulting reference is passed to a function
<code>capitalize</code>. Since <code>capitalize</code> does not return the reference back,
the <em>lifetime</em> of this borrow will be confined to just that call. The
<em>scope</em> of data, in contrast, is much larger, and corresponds to a
suffix of the fn body, stretching from the <code>let</code> until the end of the
enclosing scope.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="sc">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;b&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;c&#39;</span><span class="p">];</span><span class="w"> </span><span class="c1">// --+ &#39;scope
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">capitalize</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="o">..</span><span class="p">]);</span><span class="w">          </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="c1">//  ^~~~~~~~~~~~~~~~~~~~~~~~~ &#39;lifetime //   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;d&#39;</span><span class="p">);</span><span class="w">                     </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;e&#39;</span><span class="p">);</span><span class="w">                     </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;f&#39;</span><span class="p">);</span><span class="w">                     </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;---------------------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">capitalize</span><span class="p">(</span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">char</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// do something
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This example also demonstrates something else. Lifetimes in Rust today
are quite a bit more flexible than scopes (if not as flexible as we
might like, hence this RFC):</p>
<ul>
<li>A scope generally corresponds to some block (or, more specifically,
a <em>suffix</em> of a block that stretches from the <code>let</code> until the end of
the enclosing block) [<a href="#temporaries">1</a>].</li>
<li>A lifetime, in contrast, can also span an individual expression, as
this example demonstrates. The lifetime of the borrow in the example
is confined to just the call to <code>capitalize</code>, and doesn&rsquo;t extend
into the rest of the block. This is why the calls to <code>data.push</code>
that come below are legal.</li>
</ul>
<p>So long as a reference is only used within one statement, today&rsquo;s
lifetimes are typically adequate. Problems arise, however, when you have
a reference that spans multiple statements. In that case, the compiler
requires the lifetime to be the innermost expression (which is often a
block) that encloses both statements, and that is typically much
bigger than is really necessary or desired. Let&rsquo;s look at some example
problem cases. Later on, we&rsquo;ll see how non-lexical lifetimes fix
these cases.</p>
<h4 id="problem-case-1-references-assigned-into-a-variable">Problem case #1: references assigned into a variable</h4>
<p>One common problem case is when a reference is assigned into a
variable. Consider this trivial variation of the previous example,
where the <code>&amp;mut data[..]</code> slice is not passed directly to
<code>capitalize</code>, but is instead stored into a local variable:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="sc">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;b&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;c&#39;</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;-+ &#39;lifetime
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">capitalize</span><span class="p">(</span><span class="n">slice</span><span class="p">);</span><span class="w">         </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;d&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR!  //   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;e&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR!  //   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;f&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// ERROR!  //   |
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;------------------------------+
</span></span></span></code></pre></div><p>The way that the compiler currently works, assigning a reference into
a variable means that its lifetime must be as large as the entire
scope of that variable. In this case, that means the lifetime is now
extended all the way until the end of the block. This in turn means
that the calls to <code>data.push</code> are now in error, because they occur
during the lifetime of <code>slice</code>. It&rsquo;s logical, but it&rsquo;s annoying.</p>
<p>In this particular case, you could resolve the problem by putting
<code>slice</code> into its own block:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">bar</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[</span><span class="sc">&#39;a&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;b&#39;</span><span class="p">,</span><span class="w"> </span><span class="sc">&#39;c&#39;</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">data</span><span class="p">[</span><span class="o">..</span><span class="p">];</span><span class="w"> </span><span class="c1">// &lt;-+ &#39;lifetime
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">capitalize</span><span class="p">(</span><span class="n">slice</span><span class="p">);</span><span class="w">         </span><span class="c1">//   |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;------------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;d&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;e&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">data</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="sc">&#39;f&#39;</span><span class="p">);</span><span class="w"> </span><span class="c1">// OK
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Since we introduced a new block, the scope of <code>slice</code> is now smaller,
and hence the resulting lifetime is smaller. Of course, introducing a
block like this is kind of artificial and also not an entirely obvious
solution.</p>
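<p>For reference, here is a minimal, self-contained sketch of the block workaround. The body of <code>capitalize</code> is an assumption (the original leaves it as &ldquo;do something&rdquo;), and <code>bar</code> is changed to return the vector so the result can be inspected:</p>

```rust
// Hypothetical body for `capitalize`; the original only says "do something".
fn capitalize(data: &mut [char]) {
    for c in data.iter_mut() {
        *c = c.to_ascii_uppercase();
    }
}

// Same shape as `bar` above, but returning `data` so callers can check it.
fn bar() -> Vec<char> {
    let mut data = vec!['a', 'b', 'c'];
    {
        // The mutable borrow of `data` is confined to this inner block.
        let slice = &mut data[..];
        capitalize(slice);
    }
    // The borrow ended with the block, so `data` can be mutated again.
    data.push('d');
    data.push('e');
    data.push('f');
    data
}
```

<p>Calling <code>bar()</code> capitalizes only the three elements that existed while the slice was live; the later pushes are untouched.</p>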
<h4 id="problem-case-2-conditional-control-flow">Problem case #2: conditional control flow</h4>
<p>Another common problem case is when references are used in only one
match arm. This most commonly arises around maps. Consider this function,
which, given some <code>key</code>, processes the value found in <code>map[key]</code> if it
exists, or else inserts a default value:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process_or_default</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span>:<span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="n">map</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                   </span><span class="n">key</span>: <span class="nc">K</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// -------------+ &#39;lifetime
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">process</span><span class="p">(</span><span class="n">value</span><span class="p">),</span><span class="w">     </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">                          </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">());</span><span class="w"> </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  ^~~~~~ ERROR.              // |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">                                  </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;------------------------------------+
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This code will not compile today. The reason is that the <code>map</code> is
borrowed as part of the call to <code>get_mut</code>, and that borrow must
encompass not only the call to <code>get_mut</code>, but also the <code>Some</code> branch
of the match. The innermost expression that encloses both of these
expressions is the match itself (as depicted above), and hence the
borrow is considered to extend until the end of the
match. Unfortunately, the match encloses not only the <code>Some</code> branch,
but also the <code>None</code> branch, and hence when we go to insert into the
map in the <code>None</code> branch, we get an error that the <code>map</code> is still
borrowed.</p>
<p>This <em>particular</em> example is relatively easy to work around. One can
(frequently) move the code for <code>None</code> out from the <code>match</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process_or_default1</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span>:<span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="n">map</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                    </span><span class="n">key</span>: <span class="nc">K</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// -------------+ &#39;lifetime
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">                   </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">process</span><span class="p">(</span><span class="n">value</span><span class="p">);</span><span class="w">                </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">return</span><span class="p">;</span><span class="w">                        </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">                                  </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">                          </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">                                  </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;------------------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When the code is adjusted this way, the call to <code>map.insert</code> is not
part of the match, and hence it is not part of the borrow.  While this
works, it is of course unfortunate to require these sorts of
manipulations, just as it was when we introduced an artificial block
in the previous example.</p>
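<p>To make the workaround concrete, here is a rough sketch that fills in the pieces the snippet above leaves out. It assumes an <code>i32</code> value type, a hypothetical <code>process</code> that just increments the value, and the <code>Hash + Eq</code> bounds that <code>HashMap</code> needs on its keys:</p>

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical stand-in for `process`: it simply increments the value.
fn process(value: &mut i32) {
    *value += 1;
}

// Assumes `i32` values for illustration; the original is generic over `V: Default`.
fn process_or_default1<K: Hash + Eq>(map: &mut HashMap<K, i32>, key: K) {
    match map.get_mut(&key) {
        Some(value) => {
            process(value);
            return;
        }
        None => {}
    }
    // The insert sits after the match, so it is outside the borrow from `get_mut`.
    map.insert(key, i32::default());
}
```

<p>The first call for a given key inserts the default value; any later call for that key finds the entry and processes it instead.</p>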
<h4 id="problem-case-3-conditional-control-flow-across-functions">Problem case #3: conditional control flow across functions</h4>
<p>While we were able to work around problem case #2 in a relatively
simple, if irritating, fashion, there are other variations of
conditional control flow that cannot be so easily resolved. This is
particularly true when you are returning a reference out of a
function. Consider the following function, which returns the value for
a key if it exists, and inserts a new value otherwise (for the
purposes of this section, assume that the <code>entry</code> API for maps does
not exist):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="n">K</span><span class="p">,</span><span class="n">V</span>:<span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="n">map</span>: <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                               </span><span class="n">key</span>: <span class="nc">K</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                               </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">V</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// -------------+ &#39;m
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w">              </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">                          </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">());</span><span class="w"> </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//  ^~~~~~ ERROR               // |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()</span><span class="w">     </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">                                  </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">                                      </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">                                          </span><span class="c1">// v
</span></span></span></code></pre></div><p>At first glance, this code appears quite similar to the code we saw
before. And indeed, just as before, it will not compile. But in fact
the lifetimes at play are quite different. The reason is that, in the
<code>Some</code> branch, the value is being <strong>returned out</strong> to the caller.
Since <code>value</code> is a reference into the map, this implies that the <code>map</code>
will remain borrowed <strong>until some point in the caller</strong> (the point
<code>'m</code>, to be exact). To get a better intuition for what this lifetime
parameter <code>'m</code> represents, consider some hypothetical caller of
<code>get_default</code>: the lifetime <code>'m</code> then represents the span of code in
which that caller will use the resulting reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">caller</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">map</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">HashMap</span>::<span class="n">new</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">get_default</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="n">map</span><span class="p">,</span><span class="w"> </span><span class="n">key</span><span class="p">);</span><span class="w"> </span><span class="c1">// -+ &#39;m
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="c1">// +-- get_default() -----------+ //  |
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="c1">// | match map.get_mut(&amp;key) {  | //  |
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="c1">// |   Some(value) =&gt; value,    | //  |
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="c1">// |   None =&gt; {                | //  |
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="c1">// |     ..                     | //  |
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="c1">// |   }                        | //  |
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="c1">// +----------------------------+ //  |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">process</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">                         </span><span class="c1">//  |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="c1">// &lt;--------------------------------------+
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we attempt the same workaround for this case that we tried
in the previous example, we will find that it does not work:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default1</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="n">K</span><span class="p">,</span><span class="n">V</span>:<span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="n">map</span>: <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span><span class="n">key</span>: <span class="nc">K</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">V</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// -------------+ &#39;m
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w">       </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">}</span><span class="w">                        </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">                                      </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">());</span><span class="w">         </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//  ^~~~~~ ERROR (still)                  |
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()</span><span class="w">             </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">                                          </span><span class="c1">// v
</span></span></span></code></pre></div><p>Whereas before the lifetime of <code>value</code> was confined to the match, this
new lifetime extends out into the caller, and therefore the borrow
does not end just because we exited the match. Hence it is still in
scope when we attempt to call <code>insert</code> after the match.</p>
<p>The workaround for this problem is a bit more involved. It relies on
the fact that the borrow checker uses the precise control-flow of the
function to determine what borrows are in scope.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default2</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="n">K</span><span class="p">,</span><span class="n">V</span>:<span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="n">map</span>: <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span><span class="n">key</span>: <span class="nc">K</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">V</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">contains_key</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// ^~~~~~~~~~~~~~~~~~~~~ &#39;n
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">return</span><span class="w"> </span><span class="k">match</span><span class="w"> </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// + &#39;m
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">value</span><span class="p">,</span><span class="w">        </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="fm">unreachable!</span><span class="p">()</span><span class="w">       </span><span class="c1">// |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">};</span><span class="w">                               </span><span class="c1">// v
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// At this point, `map.get_mut` was never
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// called! (As opposed to having been called,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// but its result no longer being in use.)
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="n">key</span><span class="p">,</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">());</span><span class="w"> </span><span class="c1">// OK now.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span><span class="p">.</span><span class="n">get_mut</span><span class="p">(</span><span class="o">&amp;</span><span class="n">key</span><span class="p">).</span><span class="n">unwrap</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What has changed here is that we moved the call to <code>map.get_mut</code>
inside of an <code>if</code>, and we have set things up so that the if body
unconditionally returns. What this means is that a borrow begins at
the point of <code>get_mut</code>, and that borrow lasts until the point <code>'m</code> in
the caller, but the borrow checker can see that this borrow <em>will not
have even started</em> outside of the <code>if</code>. So it does not consider the
borrow in scope at the point where we call <code>map.insert</code>.</p>
<p>This workaround is more troublesome than the others, because the
resulting code is actually less efficient at runtime, since it must do
multiple lookups.</p>
<p>It&rsquo;s worth noting that Rust&rsquo;s hashmaps include an <code>entry</code> API that
one could use to implement this function today. The resulting code is
both nicer to read and more efficient even than the original version,
since it avoids extra lookups on the &ldquo;not present&rdquo; path as well:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">get_default3</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="n">K</span><span class="p">,</span><span class="n">V</span>:<span class="nb">Default</span><span class="o">&gt;</span><span class="p">(</span><span class="n">map</span>: <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">HashMap</span><span class="o">&lt;</span><span class="n">K</span><span class="p">,</span><span class="n">V</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span><span class="n">key</span>: <span class="nc">K</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                </span>-&gt; <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">mut</span><span class="w"> </span><span class="n">V</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map</span><span class="p">.</span><span class="n">entry</span><span class="p">(</span><span class="n">key</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">       </span><span class="p">.</span><span class="n">or_insert_with</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">V</span>::<span class="n">default</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Regardless, the problem exists for other data structures besides
<code>HashMap</code>, so it would be nice if the original code passed the borrow
checker, even if in practice using the <code>entry</code> API would be
preferable. (Interestingly, the limitation of the borrow checker here
was one of the motivations for developing the <code>entry</code> API in the first
place!)</p>
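<p>As a rough illustration of the <code>entry</code> approach in use, here is a minimal sketch. The helper names <code>get_or_default</code> and <code>count_words</code> are hypothetical, and the <code>K: Eq + Hash</code> bound is an assumption added so that the snippet compiles on its own:</p>

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical helper: return a mutable reference to the value for
/// `key`, inserting `V::default()` first if the key is absent.
/// The `Eq + Hash` bound is what `HashMap::entry` requires of its keys.
fn get_or_default<K: Eq + Hash, V: Default>(map: &mut HashMap<K, V>, key: K) -> &mut V {
    // `entry` looks the key up once and yields an occupied or vacant slot;
    // `or_default` fills a vacant slot and returns `&mut V` either way.
    map.entry(key).or_default()
}

/// Hypothetical usage: count how often each word occurs in `text`.
fn count_words(text: &str) -> HashMap<&str, u32> {
    let mut counts: HashMap<&str, u32> = HashMap::new();
    for word in text.split_whitespace() {
        // No separate "contains" check and no second lookup is needed here.
        *get_or_default(&mut counts, word) += 1;
    }
    counts
}
```

<p>Calling <code>count_words("a b a")</code> should give a map with two keys, where <code>"a"</code> maps to 2 and <code>"b"</code> maps to 1.</p>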
<h3 id="conclusion">Conclusion</h3>
<p>This post looked at various examples of Rust code that do not compile
today, and showed how they can be fixed using today&rsquo;s system. While
it&rsquo;s good that workarounds exist, it&rsquo;d be better if the code just
compiled as is. In an upcoming post, I will outline my plan for how to
modify the compiler to achieve just that.</p>
<h2 id="endnotes">Endnotes</h2>
<p><a name="temporaries"></a></p>
<p><strong>1.</strong> Scopes always correspond to blocks with one exception: the
scope of a temporary value is sometimes the enclosing
statement.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/nll" term="nll" label="NLL"/></entry><entry><title type="html">Nice errors in LALRPOP</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/03/02/nice-errors-in-lalrpop/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/03/02/nice-errors-in-lalrpop/</id><published>2016-03-02T00:00:00+00:00</published><updated>2016-03-02T12:58:49-05:00</updated><content type="html"><![CDATA[<p>For the last couple of weeks, my mornings have been occupied with a
pretty serious revamping of <a href="http://smallcultfollowing.com/babysteps/blog/2015/09/14/lalrpop/">LALRPOP&rsquo;s</a> error message output. I will
probably wind up doing a series of blog posts about the internal
details of how it works, but I wanted to write a little post to
advertise this work.</p>
<p>Typically when you use an LR(1) parser generator, error messages tend
to be written in terms of the LR(1) state generation algorithm.  They
use phrases like &ldquo;shift/reduce conflict&rdquo; and talk about LR(1)
items. Ultimately, you have to do some clever thinking to relate the
error to your grammar, and then a bit more clever thinking to figure
out how you should adjust your grammar to make the problem go away.
While working on <a href="https://github.com/nikomatsakis/rustypop">adapting the Rust grammar to LALRPOP</a>, I
found I was wasting a lot of time trying to decrypt the error
messages, and I wanted to do something about it. This work
is the result.</p>
<p><strong>An aside:</strong> It&rsquo;s definitely worth citing <a href="http://gallium.inria.fr/~fpottier/menhir/">Menhir</a> as an inspiration,
which is an awesome parser generator for OCaml. Menhir offers a lot of
the same features that LALRPOP does, and in particular generates
errors very similar to those I am talking about here.</p>
<p>What I&rsquo;ve tried to do now in LALRPOP is to do that clever thinking for
you, and instead present the error message in terms of your
grammar. Perhaps even more importantly, I&rsquo;ve also tried to <strong>identify
common beginner problems and suggest solutions</strong>. Naturally this is a
work-in-progress, but I&rsquo;m already pretty excited with the current
status, so I wanted to write up some examples of it in action.</p>
<!-- more -->
<h3 id="diagnosing-ambiguous-grammars">Diagnosing ambiguous grammars</h3>
<p>Let&rsquo;s start with an example of a truly ambiguous grammar. Imagine that
I have this grammar for a simple calculator (in LALRPOP syntax, which
I hope will be mostly self-explanatory):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">use</span><span class="w"> </span><span class="n">std</span>::<span class="kt">str</span>::<span class="n">FromStr</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">grammar</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="n">Expr</span>: <span class="kt">i32</span> <span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&lt;</span><span class="n">n</span>:<span class="nc">r</span><span class="s">&#34;[0-9]+&#34;</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="kt">i32</span>::<span class="n">from_str</span><span class="p">(</span><span class="n">n</span><span class="p">).</span><span class="n">unwrap</span><span class="p">(),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&lt;</span><span class="n">l</span>:<span class="nc">Expr</span><span class="o">&gt;</span><span class="w"> </span><span class="s">&#34;+&#34;</span><span class="w"> </span><span class="o">&lt;</span><span class="n">r</span>:<span class="nc">Expr</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">r</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&lt;</span><span class="n">l</span>:<span class="nc">Expr</span><span class="o">&gt;</span><span class="w"> </span><span class="s">&#34;-&#34;</span><span class="w"> </span><span class="o">&lt;</span><span class="n">r</span>:<span class="nc">Expr</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">r</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&lt;</span><span class="n">l</span>:<span class="nc">Expr</span><span class="o">&gt;</span><span class="w"> </span><span class="s">&#34;*&#34;</span><span class="w"> </span><span class="o">&lt;</span><span class="n">r</span>:<span class="nc">Expr</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">r</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">&lt;</span><span class="n">l</span>:<span class="nc">Expr</span><span class="o">&gt;</span><span class="w"> </span><span class="s">&#34;/&#34;</span><span class="w"> </span><span class="o">&lt;</span><span class="n">r</span>:<span class="nc">Expr</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="n">l</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">r</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>This grammar evaluates expressions like <code>1 + 2 * 3</code> and yields a
32-bit integer as the result. The problem is that this grammar is
quite ambiguous: it does not encode the precedence of the various
operators in any particular way. The older versions of LALRPOP gave
you a rather opaque error concerning shift/reduce conflicts. As of version
0.10, though, you get this (the actual output even <a href="http://imgur.com/nHdMXt5">uses ANSI colors</a>,
if available):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">calc.lalrpop:6:5: 6:34: Ambiguous grammar detected
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  The following symbols can be reduced in two ways:
</span></span><span class="line"><span class="cl">    Expr &#34;*&#34; Expr &#34;*&#34; Expr
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  They could be reduced like so:
</span></span><span class="line"><span class="cl">    Expr &#34;*&#34; Expr &#34;*&#34; Expr
</span></span><span class="line"><span class="cl">    ├─Expr──────┘        │
</span></span><span class="line"><span class="cl">    └─Expr───────────────┘
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  Alternatively, they could be reduced like so:
</span></span><span class="line"><span class="cl">    Expr &#34;*&#34; Expr &#34;*&#34; Expr
</span></span><span class="line"><span class="cl">    │        └─Expr──────┤
</span></span><span class="line"><span class="cl">    └─Expr───────────────┘
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  Hint: This looks like a precedence error related to `Expr`. See the LALRPOP
</span></span><span class="line"><span class="cl">  manual for advice on encoding precedence.
</span></span></code></pre></div><p>Much clearer, I&rsquo;d say! And note, if you look at the last sentence,
that LALRPOP is even able to diagnose that this is an ambiguity specifically
about <strong>precedence</strong> and refer you to the manual &ndash; now, if only I&rsquo;d
<strong>written</strong> the LALRPOP manual, we&rsquo;d be all set.</p>
<p>I should mention that LALRPOP also reports several other errors, all
of which are related to the precedence. For example, it will also
report:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">/Users/nmatsakis/tmp/prec-calc.lalrpop:6:5: 6:34: Ambiguous grammar detected
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  The following symbols can be reduced in two ways:
</span></span><span class="line"><span class="cl">    Expr &#34;*&#34; Expr &#34;+&#34; Expr
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  They could be reduced like so:
</span></span><span class="line"><span class="cl">    Expr &#34;*&#34; Expr &#34;+&#34; Expr
</span></span><span class="line"><span class="cl">    ├─Expr──────┘        │
</span></span><span class="line"><span class="cl">    └─Expr───────────────┘
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  Alternatively, they could be reduced like so:
</span></span><span class="line"><span class="cl">    Expr &#34;*&#34; Expr &#34;+&#34; Expr
</span></span><span class="line"><span class="cl">    │        └─Expr──────┤
</span></span><span class="line"><span class="cl">    └─Expr───────────────┘
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  LALRPOP does not yet support ambiguous grammars. See the LALRPOP manual for
</span></span><span class="line"><span class="cl">  advice on making your grammar unambiguous.
</span></span></code></pre></div><p>The code for detecting precedence errors however doesn&rsquo;t consider
errors between two distinct tokens (here, <code>*</code> and <code>+</code>), so you don&rsquo;t
get a specific message, just a general note about ambiguity. This
seems like an area that would be nice to improve.</p>
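<p>To see why this kind of ambiguity matters in practice, here is a small sketch that builds both parse trees by hand for the concrete input <code>2 * 3 + 4</code> and evaluates each one. The <code>Expr</code> enum and the helper functions are illustrative names of my own, not anything generated by LALRPOP:</p>

```rust
/// A tiny expression tree, just large enough to show the two reductions.
enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Evaluate a tree bottom-up, the same way the grammar's action code would.
fn eval(e: &Expr) -> i32 {
    match e {
        Expr::Num(n) => *n,
        Expr::Add(l, r) => eval(l) + eval(r),
        Expr::Mul(l, r) => eval(l) * eval(r),
    }
}

fn num(n: i32) -> Box<Expr> {
    Box::new(Expr::Num(n))
}

/// First reduction in the error report: `Expr "*" Expr` is grouped first,
/// giving `(2 * 3) + 4`.
fn reduce_left_first() -> Expr {
    Expr::Add(Box::new(Expr::Mul(num(2), num(3))), num(4))
}

/// Alternative reduction: `Expr "+" Expr` is grouped first,
/// giving `2 * (3 + 4)`.
fn reduce_right_first() -> Expr {
    Expr::Mul(num(2), Box::new(Expr::Add(num(3), num(4))))
}
```

<p>Evaluating the first tree gives 10 and the second gives 14, so the parser really does have to pick one, and the grammar as written gives it no basis for doing so.</p>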
<h3 id="diagnosing-lr1-limitations-and-suggesting-inlining">Diagnosing LR(1) limitations and suggesting inlining</h3>
<p>That last example was a case where the grammar was fundamentally
ambiguous. But sometimes there are problems that have to do with how
LR(1) parsing works; diagnosing these nicely is even more important,
because they are less intuitive to the end user. Also, LALRPOP has
several tools that can help make dealing with these problems easier,
so where possible we&rsquo;d really like to suggest these tools to users.</p>
<p>Let&rsquo;s start with a grammar for parsing Java import declarations.
Java&rsquo;s import declarations have this form:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="kn">import</span><span class="w"> </span><span class="nn">java.util.*</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kn">import</span><span class="w"> </span><span class="nn">java.lang.String</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>A first attempt at writing a grammar for them might look like this (in
this grammar, I gave all of the nonterminals the type <code>()</code>, so there
is no need for action code; this means that this grammar does not
build a parse tree, and so it can only be used to decide if the input
is legal Java or not):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">grammar</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="n">ImportDecl</span>: <span class="p">()</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;import&#34;</span><span class="w"> </span><span class="n">Path</span><span class="w"> </span><span class="s">&#34;;&#34;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;import&#34;</span><span class="w"> </span><span class="n">Path</span><span class="w"> </span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="s">&#34;*&#34;</span><span class="w"> </span><span class="s">&#34;;&#34;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">Path</span>: <span class="p">()</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Ident</span><span class="w"> </span><span class="p">(</span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="p">)</span><span class="o">*</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">Ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="sa">r</span><span class="s">#&#34;[a-zA-Z][a-zA-Z0-9]*&#34;#</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>Now, unlike before, this grammar is unambiguous. Nonetheless, if we
try to run it through LALRPOP, we will get the following error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">java.lalrpop:8:12: 8:29: Local ambiguity detected
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  The problem arises after having observed the following symbols in the input:
</span></span><span class="line"><span class="cl">    &#34;import&#34; Ident
</span></span><span class="line"><span class="cl">  At that point, if the next token is a `&#34;.&#34;`, then the parser can proceed in
</span></span><span class="line"><span class="cl">  two different ways.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  First, the parser could execute the production at java.lalrpop:8:12: 8:29,
</span></span><span class="line"><span class="cl">  which would consume the top 1 token(s) from the stack and produce a `Path`.
</span></span><span class="line"><span class="cl">  This might then yield a parse tree like
</span></span><span class="line"><span class="cl">    &#34;import&#34; Ident  ╷ &#34;.&#34; &#34;*&#34; &#34;;&#34;
</span></span><span class="line"><span class="cl">    │        └─Path─┘           │
</span></span><span class="line"><span class="cl">    └─ImportDecl────────────────┘
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  Alternatively, the parser could shift the `&#34;.&#34;` token and later use it to
</span></span><span class="line"><span class="cl">  construct a `(&#34;.&#34; Ident)+`. This might then yield a parse tree like
</span></span><span class="line"><span class="cl">    Ident &#34;.&#34;        Ident
</span></span><span class="line"><span class="cl">    │     └─(&#34;.&#34; Ident)+─┤
</span></span><span class="line"><span class="cl">    └─Path───────────────┘
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  Hint: It appears you could resolve this problem by adding the annotation
</span></span><span class="line"><span class="cl">  `#[inline]` to the definition of `Path`. For more information, see the section
</span></span><span class="line"><span class="cl">  on inlining in the LALROP manual.
</span></span></code></pre></div><p>What&rsquo;s interesting is that, in this case, the grammar is not actually
ambiguous. For any given string, there is only one possible parse. The
problem though is that the grammar <strong>as it is written</strong> requires more
than one token of lookahead. To understand why, you have to think like
an LR(1) parser &ndash; which really isn&rsquo;t as complicated as it sounds. As
usually happens with computers, the hard part is not understanding how
<strong>wicked smart</strong> the LR(1) algorithm is, it&rsquo;s understanding just how
<strong>plain dumb</strong> it is.</p>
<p>Basically, the way an LR(1) parser works is that it takes one token at
a time from your input and tries to match up what it has seen so far
against the productions in your grammar. If it finds a match, it can
<strong>reduce</strong>, which basically means that it can &ldquo;recognize&rdquo; the last few
tokens as something larger. But, and this is the key point, it can
only do a reduction when it is at exactly the right point in the
input. So, for example, consider the definition of <code>ImportDecl</code>:</p>
<pre tabindex="0"><code>pub ImportDecl: () = {
    &#34;import&#34; Path &#34;;&#34;,
    &#34;import&#34; Path &#34;.&#34; &#34;*&#34; &#34;;&#34;,
};
</code></pre><p>Imagine that we are parsing an input like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-java" data-lang="java"><span class="line"><span class="cl"><span class="kn">import</span><span class="w"> </span><span class="nn">foo.bar.*</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>The first thing that would happen then is that we would see an
<code>&quot;import&quot;</code> token. An <code>&quot;import&quot;</code> is the <em>start</em> of an <code>ImportDecl</code>, but
it alone is not enough to say for sure if we have a valid <code>ImportDecl</code>
yet. So we would push it on the stack. The next token is an identifier
(<code>&quot;foo&quot;</code>). We don&rsquo;t see any identifiers listed in the definition of <code>ImportDecl</code>,
but we <em>do</em> see a <code>Path</code>, and a <code>Path</code> is defined like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">Path</span>: <span class="p">()</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Ident</span><span class="w"> </span><span class="p">(</span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="p">)</span><span class="o">*</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>So maybe this identifier is the start of a <code>Path</code>. Still, too early to
say for sure. We would then push the identifier onto the stack and
look at the next token. The next token will be a <code>&quot;.&quot;</code>. This is
promising, since to make a <code>Path</code>, we have to first see an identifier
(which we did) and then zero or more <code>(&quot;.&quot; Ident)</code> pairs. So this
<code>&quot;.&quot;</code> could be the start of such a pair. So we might imagine that we should
push it on the stack and keep going, expecting to see a <code>Path</code>. Then
we&rsquo;d have a stack like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">&#34;import&#34; Ident &#34;.&#34; 
</span></span></code></pre></div><p>Now, for the input <code>import foo.bar.*</code>, in fact, pushing the <code>.</code> onto
the stack <em>would</em> be the right thing to do. But for other inputs, it
would not be. Imagine that our input was <code>import foo.*;</code>. If we pushed
the <code>.</code> onto the stack, then we would eventually wind up with a stack
that looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">&#34;import&#34; Ident &#34;.&#34; &#34;*&#34; &#34;;&#34;
</span></span></code></pre></div><p>Now we have a real problem. To a human, this is clearly an <code>ImportDecl</code>;
in particular, it matches this production:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">ImportDecl = &#34;import&#34; Path &#34;.&#34; &#34;*&#34; &#34;;&#34;
</span></span></code></pre></div><p>But to the computer, this is not a match at all. The second thing
listed after <code>&quot;import&quot;</code> should be a <em>path</em> not an <em>identifier</em>. Now of
course there is a rule that lets us convert an <code>Ident</code> to a <code>Path</code>, but
it&rsquo;s too late to use it. We can only do a conversion when the thing we
are converting is the last thing we have seen. In particular here we&rsquo;d
need to ignore the last three tokens (<code>&quot;.&quot; &quot;*&quot; &quot;;&quot;</code>) and just convert
the <code>Ident</code> that lies beneath them on the stack. The LR(1) parser is not smart enough
to do that (which is why it can parse in linear time).</p>
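<p>The &ldquo;reduce only at the top of the stack&rdquo; rule can be sketched in a few lines of Rust. This is a deliberately simplified model under my own assumptions: there are no parser states and no lookahead, and <code>Sym</code>, <code>reduce_top</code>, <code>shift_everything</code>, and <code>reduce_early</code> are hypothetical names, not LALRPOP internals. It only shows what happens to the stack for <code>import foo.*;</code> under the two choices:</p>

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sym { Import, Ident, Dot, Star, Semi, Path, ImportDecl }

// Right-hand sides of the two productions involved (the `Path` one is the
// zero-pairs case of `Ident ("." Ident)*`).
const PATH_RHS: [Sym; 1] = [Sym::Ident];
const STAR_IMPORT_RHS: [Sym; 5] =
    [Sym::Import, Sym::Path, Sym::Dot, Sym::Star, Sym::Semi];

/// Reduce only if `rhs` matches the *top* of the stack; otherwise do nothing.
fn reduce_top(stack: &mut Vec<Sym>, rhs: &[Sym], lhs: Sym) -> bool {
    if !stack.ends_with(rhs) {
        return false;
    }
    let new_len = stack.len() - rhs.len();
    stack.truncate(new_len);
    stack.push(lhs);
    true
}

/// Shift every token of `import foo.*;` without ever reducing.
/// Afterwards the `Ident` is buried, so no production applies.
fn shift_everything() -> Vec<Sym> {
    let mut stack = vec![Sym::Import, Sym::Ident, Sym::Dot, Sym::Star, Sym::Semi];
    let reduced_path = reduce_top(&mut stack, &PATH_RHS, Sym::Path);
    let reduced_decl = reduce_top(&mut stack, &STAR_IMPORT_RHS, Sym::ImportDecl);
    assert!(!reduced_path && !reduced_decl); // stuck
    stack
}

/// Reduce `Ident` to `Path` while it is still on top, then keep shifting.
fn reduce_early() -> Vec<Sym> {
    let mut stack = vec![Sym::Import, Sym::Ident];
    reduce_top(&mut stack, &PATH_RHS, Sym::Path);
    stack.extend_from_slice(&[Sym::Dot, Sym::Star, Sym::Semi]);
    reduce_top(&mut stack, &STAR_IMPORT_RHS, Sym::ImportDecl);
    stack
}
```

<p>In this model <code>shift_everything()</code> ends with the five unreduced symbols still on the stack, while <code>reduce_early()</code> ends with a single <code>ImportDecl</code>. The catch, of course, is that a real parser has to choose between the two at the moment it sees the <code>&quot;.&quot;</code>, with only that one token of lookahead.</p>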
<p>The way I described things, this conflict arises at parse time &ndash; but
in fact the LR(1) generation algorithm can detect ahead of time that
this could happen, which is why you are getting an error.</p>
<p>So how can we solve this? The answer is that we can rearrange our
grammar.  What&rsquo;s kind of surprising about LR(1) is that seemingly
&ldquo;no-op&rdquo; rearrangements can make a big difference. <strong>This is precisely
because in order for the parser to recognize a nonterminal, it must do
so at the very moment when those symbols are seen &ndash; it can&rsquo;t do it
after the fact.</strong> This has some significance to the semantics of a
grammar.  That is, normally, you can rely on the fact that your action
code will execute <strong>precisely</strong> when the tokens that you list are
seen, no later and no earlier. This may matter if your action code has
side-effects. (In the case of this grammar, we have no action code, so
there are clearly no side-effects.)</p>
<p>This also means that we can solve LR(1) conflicts by rearranging
things so that the parser doesn&rsquo;t have to make a decision as soon. So
imagine that we transformed our grammar by &ldquo;inlining&rdquo; the <code>Path</code>
nonterminal into the <code>ImportDecl</code>, and by further converting the <code>(&quot;.&quot; Ident)*</code>
entries into <code>(&quot;.&quot; Ident)+</code> (as well as another option where there are no pairs at all).
Then we would have:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">grammar</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="n">ImportDecl</span>: <span class="p">()</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;import&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="w"> </span><span class="s">&#34;;&#34;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;import&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="w"> </span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="s">&#34;*&#34;</span><span class="w"> </span><span class="s">&#34;;&#34;</span><span class="p">,</span><span class="w"> </span><span class="c1">// (*)
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;import&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="w"> </span><span class="p">(</span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="p">)</span><span class="o">+</span><span class="w"> </span><span class="s">&#34;;&#34;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;import&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="w"> </span><span class="p">(</span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="p">)</span><span class="o">+</span><span class="w"> </span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="s">&#34;*&#34;</span><span class="w"> </span><span class="s">&#34;;&#34;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">Ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="sa">r</span><span class="s">#&#34;[a-zA-Z][a-zA-Z0-9]*&#34;#</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>Now, this version is equivalent to what we had before, in that it
parses the same inputs. But to the parser, it looks very different. In
particular, we no longer have to first recognize that an identifier is
a <code>Path</code> to produce an <code>ImportDecl</code>. As you can see in the second
production (indicated with a <code>(*)</code> comment) we can now directly
recognize <code>&quot;import&quot; Ident &quot;.&quot; &quot;*&quot; &quot;;&quot;</code> as an <code>ImportDecl</code>.  In other
words, the parse which got stuck before now works just fine.</p>
<p>This technique of inlining one nonterminal into another is very common
and very effective for making grammars compatible with
LR(1). Therefore, it&rsquo;s actually automated in LALRPOP. All you have to
do is annotate a nonterminal with <code>#[inline]</code> and the preprocessor
will handle it for you (moreover, the preprocessor automatically
converts <code>Foo*</code> into two options, one without <code>Foo</code> at all, and one
with <code>Foo+</code>). In fact, if we go back to the original error report, we
can see that LALRPOP recognized what was happening and even advised us
that we may want to add a <code>#[inline]</code> attribute:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">  Hint: It appears you could resolve this problem by adding the annotation
</span></span><span class="line"><span class="cl">  `#[inline]` to the definition of `Path`. For more information, see the section
</span></span><span class="line"><span class="cl">  on inlining in the LALROP manual.
</span></span></code></pre></div><p>You may be wondering why LALRPOP doesn&rsquo;t just inline
automatically. There are a couple of reasons:</p>
<ol>
<li>It&rsquo;s hard to tell for sure when inlining will help. I have some
heuristics to detect some situations, but I can&rsquo;t detect them all,
and sometimes the suggestion may be inappropriate.</li>
<li>Inlining makes your grammar bigger.</li>
<li>Inlining changes when your action code runs, so it effectively alters
your program semantics.</li>
<li>Even if we could detect when to inline, it would happen relatively late
in the cycle, and so we would have to start from the beginning. By having
the user add an attribute, we know from the beginning when to inline,
and so subsequent LALRPOP instantiations are faster.</li>
</ol>
<p>Finally, inlining may just not be the best fix. For example, the
change I would <em>actually</em> make to that grammar would probably be to
convert it as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">grammar</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="n">ImportDecl</span>: <span class="p">()</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;import&#34;</span><span class="w"> </span><span class="n">Path</span><span class="w"> </span><span class="s">&#34;;&#34;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="s">&#34;import&#34;</span><span class="w"> </span><span class="n">Path</span><span class="w"> </span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="s">&#34;*&#34;</span><span class="w"> </span><span class="s">&#34;;&#34;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">Path</span>: <span class="p">()</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Ident</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Path</span><span class="w"> </span><span class="s">&#34;.&#34;</span><span class="w"> </span><span class="n">Ident</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">Ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="sa">r</span><span class="s">#&#34;[a-zA-Z][a-zA-Z0-9]*&#34;#</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>If you work it through, you will find that this grammar IS <code>LR(1)</code>,
and it doesn&rsquo;t use any inlining at all. That means it will have fewer
states.  I also find it more readable. But YMMV.</p>
<h3 id="where-to-from-here">Where to from here?</h3>
<p>First off, I really want to rework the phrasings of those error
messages. They should not (I think) talk about &ldquo;popping states&rdquo; and so
forth. But I&rsquo;ve got to spend some time thinking about how best to
explain the LR(1) algorithm. This blog post is kind of a first stab,
but it proved much harder than I expected, and I think I could
certainly make it much clearer than what I&rsquo;ve achieved thus far! :)
There are also a host of other smaller improvements that can be made.</p>
<p>All of that said, I am currently hard at work on exploring the
<a href="http://cssauh.com/xc/pub/LaneTable_APPLC12.pdf">lane table</a> generation algorithm and other variations on LR(1). This
may lead to some insights into how to present errors, though I&rsquo;m not sure.
This may also lead to some ideas for how to automate inlining further,
or other scenarios where I can make tailored suggestions.  We&rsquo;ll just
have to see!</p>
<p>I&rsquo;ve got a few parsing-related blog posts I hope to write over the
next few weeks (or months, more likely):</p>
<ul>
<li>the &ldquo;ascii art&rdquo; library that I wrote to format the error messages
is itself kind of interesting;</li>
<li>how the error report generation works under the hood;</li>
<li>an explanation of the <a href="http://cssauh.com/xc/pub/LaneTable_APPLC12.pdf">lane table</a> algorithm, which is rather
underdocumented (but I&rsquo;m still figuring it out myself);</li>
<li><a href="https://github.com/nikomatsakis/rustypop">rustypop</a>, my Rust grammar in LALRPOP, is coming along, and I want
to use it as a springboard to talk about some of LALRPOP&rsquo;s macro features.</li>
</ul>
<p>So, if parsing interests you, then stay tuned.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Parallel Iterators Part 2: Producers</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/02/25/parallel-iterators-part-2-producers/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/02/25/parallel-iterators-part-2-producers/</id><published>2016-02-25T00:00:00+00:00</published><updated>2016-02-25T11:02:34-05:00</updated><content type="html"><![CDATA[<p>This post is the second post in my series on Rayon&rsquo;s parallel
iterators. The goal of this series is to explain how parallel
iterators are implemented internally, so I&rsquo;m going to be going over a
lot of details and giving a lot of little code examples in Rust. If
all you want to do is <em>use</em> parallel iterators, you don&rsquo;t really have
to understand any of this stuff.</p>
<p>I&rsquo;ve had a lot of fun designing this system, and I learned a few
lessons about how best to use Rust (some of which I cover in the
conclusions). I hope you enjoy reading about it!</p>
<p>In the <a href="http://smallcultfollowing.com/babysteps/blog/2016/02/19/parallel-iterators-part-1-foundations/">initial post</a>, I covered
sequential iterators, using this dot-product as my running example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>In this post, we are going to take a first stab at extending
sequential iterators to parallel computation, using something I call
<strong>parallel producers</strong>. At the end of the post, we&rsquo;ll have a system
that can execute that same dot-product computation, but in parallel:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">par_iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>Parallel producers are very cool, but they are not the end of the
story! In the next post, we&rsquo;ll cover <strong>parallel consumers</strong>, which
build on parallel producers and add support for combinators which
produce a variable number of items, like <code>filter</code> or <code>flat_map</code>.</p>
<!-- more -->
<h3 id="parallel-iteration">Parallel Iteration</h3>
<p>When I explained sequential iterators in the
<a href="http://smallcultfollowing.com/babysteps/blog/2016/02/19/parallel-iterators-part-1-foundations/">previous post</a>, I sort of did it bottom-up: I started
with how to get an iterator from a slice, then showed each combinator
we were going to use in turn (<code>zip</code>, <code>map</code>), and finally showed how
the <code>sum</code> operation at the end works.</p>
<p>To explain parallel iterators, I&rsquo;m going to work in the opposite
direction. I&rsquo;ll start with the high-level view, explaining the
<code>ParallelIterator</code> trait and how <code>sum</code> works, and then go look at how
we implement the combinators. This is because the biggest difference
in parallel iterators is actually the &ldquo;end&rdquo; operations, like <code>sum</code>,
and not as much the combinators (or at least that is true for the
combinators we&rsquo;ll cover in this post).</p>
<p>In Rayon, the <code>ParallelIterator</code> traits are divided into a hierarchy:</p>
<ul>
<li><code>ParallelIterator</code>: any sort of parallel iterator.</li>
<li><code>BoundedParallelIterator: ParallelIterator</code>: a parallel iterator that can
give an upper-bound on how many items it will produce, such as <code>filter</code>.</li>
<li><code>ExactParallelIterator: BoundedParallelIterator</code>: a parallel iterator that
knows precisely how many items will be produced.</li>
<li><code>IndexedParallelIterator: ExactParallelIterator</code>: a parallel
iterator that can produce the item for a given index <strong>without
producing all the previous items</strong>. A parallel iterator over a
vector has this property, since you can just index into the vector.
<ul>
<li>(In this post, we&rsquo;ll be focusing on parallel iterators in this
category.  The next post will discuss how to handle things like
<code>filter</code> and <code>flat_map</code>, where the number of items being iterated
over cannot be known in advance.)</li>
</ul>
</li>
</ul>
<p>Like sequential iterators, parallel iterators represent a set of
operations to be performed (but in parallel). You can use combinators
like <code>map</code> and <code>filter</code> to build them up &ndash; doing so does not trigger
any computation, but simply produces a new, extended parallel
iterator. Finally, once you have constructed a parallel iterator that
produces the values you want, you can use various &ldquo;operation&rdquo; methods
like <code>sum</code>, <code>reduce</code>, and <code>for_each</code> to actually kick off execution.</p>
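<p>The laziness described here is the same one sequential iterators have, so it can be observed with the standard library alone. The following is a minimal sketch, not Rayon code: <code>lazy_demo</code> and the <code>calls</code> counter are names made up for illustration. It counts how often the closure passed to <code>map</code> has run before and after the <code>sum</code> operation is invoked:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::cell::Cell;

// Hypothetical helper: returns (closure calls before `sum`,
// closure calls after `sum`, the sum itself).
fn lazy_demo() -&gt; (u32, u32, i32) {
    let calls = Cell::new(0u32);
    let data = vec![1, 2, 3];

    // Building the iterator runs nothing; it only records what to do.
    let doubled = data.iter().map(|x| {
        calls.set(calls.get() + 1);
        x * 2
    });
    let before = calls.get();

    // The operation at the end is what actually drives the closure.
    let total: i32 = doubled.sum();
    let after = calls.get();

    (before, after, total)
}

fn main() {
    let (before, after, total) = lazy_demo();
    println!("before = {}, after = {}, total = {}", before, after, total);
}
</code></pre></div>
<p>The closure has not run at all by the time the combinator chain is built; it runs once per item only when <code>sum</code> is called.</p>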
<p>This is roughly how the parallel iterator traits are defined:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">ParallelIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Combinators that produce new iterators:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">map</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">filter</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">   </span><span class="c1">// we&#39;ll be discussing these...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">flat_map</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w"> </span><span class="c1">// ...in the next blog post
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Operations that process the items being iterated over:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">sum</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">reduce</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">for_each</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">BoundedParallelIterator</span>: <span class="nc">ParallelIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">ExactParallelIterator</span>: <span class="nc">BoundedParallelIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">len</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span><span class="p">;</span><span class="w"> </span><span class="c1">// how many items will be produced
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">IndexedParallelIterator</span>: <span class="nc">ExactParallelIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Combinators:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">zip</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">enumerate</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Operations:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">collect</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">collect_into</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="o">..</span><span class="p">.);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// I&#39;ll come to this one shortly :)
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">with_producer</span><span class="o">&lt;</span><span class="no">CB</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">callback</span>: <span class="nc">CB</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="no">CB</span>: <span class="nc">ProducerCallback</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>These look superficially similar to the sequential iterator traits,
but you&rsquo;ll notice some differences:</p>
<ul>
<li>Perhaps most importantly, <strong>there is no <code>next</code> method!</strong> If you
think about it, drawing the &ldquo;next&rdquo; item from an iterator is an
inherently sequential notion. Instead, parallel iterators emphasize
high-level <strong>operations</strong> like <code>sum</code>, <code>reduce</code>, <code>collect</code>, and
<code>for_each</code>, which are then automatically distributed to worker
threads.</li>
<li>Parallel iterators are much more sensitive to being indexable than
sequential ones, so some combinators like <code>zip</code> and <code>enumerate</code> are
only possible when the underlying iterator is indexed. We&rsquo;ll discuss
this in detail when covering the <code>zip</code> combinator.</li>
</ul>
<h3 id="implementing-sum-with-producers">Implementing <code>sum</code> with producers</h3>
<p>One thing you may have noticed with the <code>ParallelIterator</code> traits is
that, lacking a <code>next</code> method, there is no way to get data out of
them!  That is, we can build up a nice parallel iterator, and we can
call <code>sum</code> (or some other high-level method), but how do we
<em>implement</em> <code>sum</code>?</p>
<p>The answer lies in the <code>with_producer</code> method, which provides a way to
convert the iterator into a producer. A <em>producer</em> is kind of like a
splittable iterator: it is something that you can divide up into
little pieces and, eventually, convert into a sequential iterator to
get the data out. The trait definition looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Producer</span>: <span class="nb">IntoIterator</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Divide into two producers, one of which produces data
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// with indices `0..index` and the other with indices `index..`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Using producers, we can implement a parallel version of <code>sum</code> based on
a divide-and-conquer strategy. The idea is that we start out with some
producer P and a count <code>len</code> indicating how many items it will
produce.  If that count is too big, then we divide P into two
producers by calling <code>split_at</code> and then recursively sum those up (in
parallel). Otherwise, if the count is small, then we convert P into an
iterator and sum it up sequentially. We can convert to an iterator by
using the <code>into_iter</code> method from the <code>IntoIterator</code> trait, which
<code>Producer</code> extends. Here is a parallel version of <code>sum</code> that works for
any producer (as with the sequential <code>sum</code> we saw, we simplify things
by making it only work for <code>i32</code> values):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">sum_producer</span><span class="o">&lt;</span><span class="n">P</span><span class="o">&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">producer</span>: <span class="nc">P</span><span class="p">,</span><span class="w"> </span><span class="n">len</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span>
</span></span><span class="line"><span class="cl">    <span class="nc">where</span><span class="w"> </span><span class="n">P</span>: <span class="nc">Producer</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="no">THRESHOLD</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Input too large: divide it up
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">mid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="mi">2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left_producer</span><span class="p">,</span><span class="w"> </span><span class="n">right_producer</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">producer</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left_sum</span><span class="p">,</span><span class="w"> </span><span class="n">right_sum</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">rayon</span>::<span class="n">join</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="o">||</span><span class="w"> </span><span class="n">sum_producer</span><span class="p">(</span><span class="n">left_producer</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="o">||</span><span class="w"> </span><span class="n">sum_producer</span><span class="p">(</span><span class="n">right_producer</span><span class="p">,</span><span class="w"> </span><span class="n">len</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mid</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">left_sum</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">right_sum</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Input small enough: sum sequentially
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">sum</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">for</span><span class="w"> </span><span class="n">value</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">producer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">sum</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">value</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">sum</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>(The actual code in Rayon most comparable to this is called
<a href="https://github.com/nikomatsakis/rayon/blob/bed0da76215aef1a0d852339fd79cedba9ec4c40/src/par_iter/internal.rs#L100-L124"><code>bridge_producer_consumer</code></a>; it uses the same basic divide-and-conquer
strategy, but it&rsquo;s generic with respect to the operation being
performed.)</p>
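<p>To try the divide-and-conquer strategy without pulling in Rayon, here is a self-contained sketch that uses only the standard library. It rests on several assumptions that are not part of Rayon: <code>std::thread::scope</code> stands in for <code>rayon::join</code> (so each split spawns a real OS thread instead of going through a work-stealing pool), the <code>Producer</code> trait gains <code>Send</code> and <code>Sized</code> bounds so that the halves can move across threads, and <code>RangeProducer</code> and the <code>THRESHOLD</code> value are invented for the example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::ops::Range;
use std::thread;

// Same shape as the `Producer` trait above. The `Send` and `Sized`
// bounds are additions for this sketch, so that each half can be
// moved to another thread.
trait Producer: IntoIterator + Send + Sized {
    fn split_at(self, index: usize) -&gt; (Self, Self);
}

// Hypothetical producer over a half-open range of integers.
struct RangeProducer {
    range: Range&lt;i32&gt;,
}

impl IntoIterator for RangeProducer {
    type Item = i32;
    type IntoIter = Range&lt;i32&gt;;

    fn into_iter(self) -&gt; Range&lt;i32&gt; {
        self.range
    }
}

impl Producer for RangeProducer {
    fn split_at(self, index: usize) -&gt; (Self, Self) {
        let mid = self.range.start + index as i32;
        (
            RangeProducer { range: self.range.start..mid },
            RangeProducer { range: mid..self.range.end },
        )
    }
}

// Arbitrary cutoff chosen for the example.
const THRESHOLD: usize = 16;

fn sum_producer&lt;P&gt;(producer: P, len: usize) -&gt; i32
where
    P: Producer&lt;Item = i32&gt;,
{
    if len &gt; THRESHOLD {
        // Input too large: divide it up.
        let mid = len / 2;
        let (left, right) = producer.split_at(mid);
        // Stand-in for `rayon::join`: run the left half on a scoped
        // thread while the current thread handles the right half.
        thread::scope(|s| {
            let left_handle = s.spawn(move || sum_producer(left, mid));
            let right_sum = sum_producer(right, len - mid);
            left_handle.join().unwrap() + right_sum
        })
    } else {
        // Input small enough: sum sequentially.
        producer.into_iter().sum()
    }
}

fn main() {
    let total = sum_producer(RangeProducer { range: 0..100 }, 100);
    println!("{}", total);
}
</code></pre></div>
<p>Summing the range <code>0..100</code> this way gives 4950, the same answer as the sequential sum. Spawning a thread per split is far more expensive than <code>rayon::join</code>, so this is only meant to show the shape of the recursion.</p>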
<h5 id="ownership-producers-and-iterators">Ownership, producers, and iterators</h5>
<p>You may be wondering why I introduced a separate <code>Producer</code> trait
rather than just adding <code>split_at</code> directly to one of the
<code>ParallelIterator</code> traits. After all, with a sequential iterator, you
just have one trait, <code>Iterator</code>, which has both &ldquo;composition&rdquo; methods
like <code>map</code> and <code>filter</code> as well as <code>next</code>.</p>
<p>The reason has to do with ownership. It is very common to have shared
resources that will be used by many threads at once during the
parallel computation and which, after the computation is done, can be
freed. We can model this easily by having those resources be <em>owned</em>
by the parallel iterator but <em>borrowed</em> by the producers, since the
producers only exist for the duration of the parallel
computation. We&rsquo;ll see an example of this later with the closure in
the <code>map</code> combinator.</p>
<h4 id="implementing-producers">Implementing producers</h4>
<p>When we looked at sequential iterators, we saw three impls: one for
slices, one for zip, and one for map. Now we&rsquo;ll look at how to
implement the <code>Producer</code> trait for each of those same three cases.</p>
<h5 id="slice-producers">Slice producers</h5>
<p>Here is the code to implement <code>Producer</code> for slices. Since slices
already support the <code>split_at</code> method, it is really very simple.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">SliceProducer</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>: <span class="na">&#39;iter</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">slice</span>: <span class="kp">&amp;</span><span class="na">&#39;iter</span> <span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Producer</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SliceProducer</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Split-at can just piggy-back on the existing `split_at`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// method for slices.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">slice</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">(</span><span class="n">SliceProducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">slice</span>: <span class="nc">left</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">         </span><span class="n">SliceProducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">slice</span>: <span class="nc">right</span><span class="w"> </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>We also have to implement <code>IntoIterator</code> for <code>SliceProducer</code>, so that
we can convert to sequential execution. This just builds on the slice
iterator type <code>SliceIter</code> that we saw in the <a href="http://smallcultfollowing.com/babysteps/blog/2016/02/19/parallel-iterators-part-1-foundations/">initial post</a> (in
fact, for the next two examples, I&rsquo;ll just skip over the
<code>IntoIterator</code> implementations, because they&rsquo;re really quite
straightforward):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">IntoIterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SliceProducer</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">IntoIter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">SliceIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">into_iter</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">SliceIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">slice</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h5 id="zip-producers">Zip producers</h5>
<p>Here is the code to implement the <code>zip</code> producer:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">ZipProducer</span><span class="o">&lt;</span><span class="n">A</span>: <span class="nc">Producer</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>: <span class="nc">Producer</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span>: <span class="nc">A</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">b</span>: <span class="nc">B</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="n">B</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Producer</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ZipProducer</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="n">B</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">A</span>: <span class="nc">Producer</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>: <span class="nc">Producer</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">A</span>::<span class="n">Item</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>::<span class="n">Item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">a_left</span><span class="p">,</span><span class="w"> </span><span class="n">a_right</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">a</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">b_left</span><span class="p">,</span><span class="w"> </span><span class="n">b_right</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">(</span><span class="n">ZipProducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">a</span>: <span class="nc">a_left</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="nc">b_left</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">         </span><span class="n">ZipProducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">a</span>: <span class="nc">a_right</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="nc">b_right</span><span class="w"> </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What makes zip interesting is <code>split_at</code> &ndash; and I don&rsquo;t mean the code
itself, which is kind of obvious, but rather the implications of it.
In particular, if we&rsquo;re going to walk two iterators in lock-step and
we want to be able to split them into two parts, then those two parts
need to split at <strong>the same point</strong>, so that the items we&rsquo;re walking
stay lined up. This is exactly why the <code>split_at</code> method in the
<code>Producer</code> trait takes the precise point at which to perform the split.</p>
<p>If it weren&rsquo;t for <code>zip</code>, you might imagine that instead of <code>split_at</code>
you would just have a function like <code>split</code>, where the producer gets
to pick the mid point:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">split</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>But if we did this, then the two producers we are zipping might pick
different points to split, and we wouldn&rsquo;t get the right result.</p>
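<p>To make the alignment requirement concrete, here is a minimal sketch that uses plain slices rather than the <code>Producer</code> trait. The helper names <code>zip_in_halves</code> and <code>zip_misaligned</code> are hypothetical and exist only for this illustration; the second one lets the two sides split at different points to show what goes wrong.</p>

```rust
/// Splits the two slices at independent points, zips each pair of halves,
/// and concatenates the results. Panics if a split point exceeds the
/// length of its slice, just like the standard `split_at`.
pub fn zip_misaligned(a: &[i32], b: &[char], mid_a: usize, mid_b: usize) -> Vec<(i32, char)> {
    let (a_left, a_right) = a.split_at(mid_a);
    let (b_left, b_right) = b.split_at(mid_b);
    let left = a_left.iter().copied().zip(b_left.iter().copied());
    let right = a_right.iter().copied().zip(b_right.iter().copied());
    left.chain(right).collect()
}

/// Splits both slices at the same `mid`, which keeps the pairs lined up.
pub fn zip_in_halves(a: &[i32], b: &[char], mid: usize) -> Vec<(i32, char)> {
    zip_misaligned(a, b, mid, mid)
}
```

<p>With <code>a = [1, 2, 3, 4]</code> and <code>b = ['a', 'b', 'c', 'd']</code>, any shared split point reproduces the ordinary zip, whereas splitting at 1 and 3 respectively yields only <code>(1, 'a')</code> and <code>(2, 'd')</code>: items are both mispaired and dropped.</p>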
<p>The requirement that a producer be able to split itself at an
arbitrary point means that some iterator combinators cannot be
accommodated. For example, you can&rsquo;t make a producer that implements
the <code>filter</code> operation. After all, to produce the next item from a
filtered iterator, we may have to consume any number of items from the
base iterator before the filter function returns true &ndash; we just can&rsquo;t
know in advance. So we can&rsquo;t expect to split a filter into two
independent halves at any precise point. But don&rsquo;t worry: we&rsquo;ll get to
<code>filter</code> (as well as the more interesting case of <code>flat_map</code>) later on
in this blog post series.</p>
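<p>One way to see the problem with <code>filter</code> is to ask how many base items must be consumed to obtain the first <code>k</code> filtered items. The sketch below (the function name <code>base_items_consumed</code> is hypothetical) shows that the answer depends entirely on the data and the predicate, so there is no way to compute a split point up front.</p>

```rust
/// Returns how many items of `base` must be consumed before the filter
/// has yielded `k` items, or `None` if it never yields that many.
pub fn base_items_consumed<F>(base: &[u32], keep: F, k: usize) -> Option<usize>
where
    F: Fn(&u32) -> bool,
{
    if k == 0 {
        return Some(0);
    }
    let mut produced = 0;
    for (index, item) in base.iter().enumerate() {
        if keep(item) {
            produced += 1;
            if produced == k {
                // `index` is zero-based, so `index + 1` items were consumed.
                return Some(index + 1);
            }
        }
    }
    None
}
```

<p>For the same six-element base and the same <code>k</code>, two different predicates can require consuming four items or all six, which is exactly the information <code>split_at</code> would need in advance.</p>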
<h5 id="map-producers">Map producers</h5>
<p>Here is the type for map producers.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">MapProducer</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="w"> </span><span class="no">PROD</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">PROD</span>: <span class="nc">Producer</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">PROD</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;m</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">PROD</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map_op</span>: <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">MAP_OP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This type definition is pretty close to the sequential case, but there
are a few crucial differences. Let&rsquo;s look at the sequential case again
for reference:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Review: the sequential map iterator
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">MapIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">FnMut</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">ITER</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map_op</span>: <span class="nc">MAP_OP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>All of the differences between the (parallel) producer and the
(sequential) iterator are due to the fact that the map closure is now
something that we plan to share between threads, rather than using it
only on a single thread. Let&rsquo;s go over the differences one by one to
see what I mean:</p>
<ul>
<li><code>MAP_OP</code> implements <code>Fn</code>, not <code>FnMut</code>:
<ul>
<li>The <code>FnMut</code> trait indicates a closure that receives unique,
mutable access to its environment. That makes sense in a
sequential setting, but in a parallel setting there could be many
threads executing map at once. So we switch to the <code>Fn</code> trait,
which only gives shared access to the environment. This is part of
the way that Rayon can statically prevent data races; I&rsquo;ll show
some examples of that later on.</li>
</ul>
</li>
<li><code>MAP_OP</code> must be <code>Sync</code>:
<ul>
<li><a href="http://doc.rust-lang.org/std/marker/trait.Sync.html">The <code>Sync</code> trait</a>
indicates data that can be safely shared between threads. Since we
plan to be sharing the map closure across many threads, it must be
<code>Sync</code>.</li>
</ul>
</li>
<li>the field <code>map_op</code> contains a reference <code>&amp;MAP_OP</code>:
<ul>
<li>The sequential map iterator owned the closure <code>MAP_OP</code>, but the
producer only has a shared reference. The reason for this is that
the producer needs to be something we can split into two &ndash; and
those two copies can&rsquo;t <em>both</em> own the <code>map_op</code>, they need to share
it.</li>
</ul>
</li>
</ul>
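<p>The three differences above can be seen in miniature without any of the Rayon machinery. The following is only a sketch: <code>map_on_two_threads</code> is a hypothetical helper, and it uses <code>std::thread::scope</code> from the standard library as a stand-in for Rayon&rsquo;s scheduling. What matters is the signature: the closure is taken by shared reference and must be <code>Fn + Sync</code>.</p>

```rust
use std::thread;

/// Applies one shared closure to two slices, each on its own thread.
/// The bound `Fn + Sync` is what allows a reference to `map_op` to be
/// used from both threads at once.
pub fn map_on_two_threads<F>(left: &[i32], right: &[i32], map_op: &F) -> (Vec<i32>, Vec<i32>)
where
    F: Fn(i32) -> i32 + Sync,
{
    thread::scope(|s| {
        // The spawned thread only borrows `map_op`; it does not own it.
        let handle = s.spawn(|| left.iter().map(|&x| map_op(x)).collect::<Vec<i32>>());
        // The current thread uses the very same closure concurrently.
        let right_out: Vec<i32> = right.iter().map(|&x| map_op(x)).collect();
        (handle.join().unwrap(), right_out)
    })
}

// A closure that mutates captured state, for example
//     let mut count = 0;
//     let op = |x: i32| { count += 1; x };
// implements only `FnMut`, so passing `&op` to `map_on_two_threads`
// is rejected at compile time.
```

<p>Calling <code>map_on_two_threads(&amp;[1, 2], &amp;[3], &amp;|x| x * 2)</code> returns <code>([2, 4], [6])</code>.</p>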
<p>Actually implementing the <code>Producer</code> trait is pretty straightforward.
It looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="w"> </span><span class="no">PROD</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Producer</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MapProducer</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="w"> </span><span class="no">PROD</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">PROD</span>: <span class="nc">Producer</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">PROD</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="na">&#39;m</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="no">RET</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">mid</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">base</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">(</span><span class="n">MapProducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">base</span>: <span class="nc">left</span><span class="p">,</span><span class="w"> </span><span class="n">map_op</span>: <span class="nc">self</span><span class="p">.</span><span class="n">map_op</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">         </span><span class="n">MapProducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">base</span>: <span class="nc">right</span><span class="p">,</span><span class="w"> </span><span class="n">map_op</span>: <span class="nc">self</span><span class="p">.</span><span class="n">map_op</span><span class="w"> </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="whence-it-all-comes">Whence it all comes</h3>
<p>At this point we&rsquo;ve seen most of how parallel iterators work:</p>
<ol>
<li>You create a parallel iterator by using the various combinator
methods and so forth.</li>
<li>When you invoke a high-level method like <code>sum</code>, <code>sum</code> will
convert the parallel iterator into a producer.</li>
<li><code>sum</code> then recursively splits this producer into sub-producers
until they represent a reasonably small (but not too small)
unit of work. Each sub-producer is processed in parallel using
<code>rayon::join</code>.</li>
<li>Eventually, <code>sum</code> converts the producer into an iterator and performs
that work sequentially.</li>
</ol>
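<p>Steps 3 and 4 can be sketched for the simplest possible case, a slice of floats. This is not Rayon&rsquo;s code: <code>sum_slice</code> and <code>SEQUENTIAL_THRESHOLD</code> are hypothetical names, and <code>std::thread::scope</code> stands in for <code>rayon::join</code>. Unlike <code>rayon::join</code>, it spawns a real OS thread at every split, so treat it purely as an illustration of the recursion.</p>

```rust
use std::thread;

/// Hypothetical cut-off below which the work is done sequentially.
const SEQUENTIAL_THRESHOLD: usize = 4;

/// Recursively splits `data` in half until the pieces are small enough,
/// then sums each piece sequentially and adds the partial results.
pub fn sum_slice(data: &[f64]) -> f64 {
    if data.len() <= SEQUENTIAL_THRESHOLD {
        // Step 4: fall back to an ordinary sequential iterator.
        return data.iter().sum();
    }
    // Step 3: split at the midpoint and process both halves in parallel.
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let left_handle = s.spawn(|| sum_slice(left));
        let right_sum = sum_slice(right);
        left_handle.join().unwrap() + right_sum
    })
}
```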
<p>In particular, we&rsquo;ve looked in detail at the last two steps. But we&rsquo;ve
only given the first two a cursory glance. Before I finish, I want to
cover how one constructs a parallel iterator and converts it to a
producer &ndash; it seems simple, but the setup here is something that took
me a long time to get right. Let&rsquo;s look at the map combinator in
detail, because it exposes the most interesting issues.</p>
<h4 id="defining-the-parallel-iterator-type-for-map">Defining the parallel iterator type for map</h4>
<p>Let&rsquo;s start by looking at how we define and create the parallel
iterator type for map, <code>MapParIter</code>. The next section will dive into
how we convert this type into the <code>MapProducer</code> we saw before.</p>
<p>Instances of the map combinator are created when you call <code>map</code> on
some other, pre-existing parallel iterator. The <code>map</code> method
itself simply creates an instance of <code>MapParIter</code>, which wraps
up the base iterator <code>self</code> along with the mapping operation <code>map_op</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">ParallelIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">map</span><span class="o">&lt;</span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">map_op</span>: <span class="nc">MAP_OP</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                       </span>-&gt; <span class="nc">MapParIter</span><span class="o">&lt;</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="bp">Self</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">MapParIter</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">base</span>: <span class="nc">self</span><span class="p">,</span><span class="w"> </span><span class="n">map_op</span>: <span class="nc">map_op</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>MapParIter</code> struct is defined like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">MapParIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nc">ParallelIterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">ITER</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map_op</span>: <span class="nc">MAP_OP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The parallel iterator struct bears a strong resemblance to the
producer struct (<code>MapProducer</code>) that we saw earlier, but there are
some important differences:</p>
<ol>
<li>The <code>base</code> is another parallel iterator of type <code>ITER</code>, not a producer.</li>
<li>The closure <code>map_op</code> is <em>owned</em> by the parallel iterator.</li>
</ol>
<p>During the time when the producer is active, the parallel iterator
will be the one that owns the shared resources (in this case, the
closure) that the various threads need to make use of. Therefore, the
iterator must outlive the entire high-level parallel operation, so
that the data that those threads are sharing remains valid.</p>
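<p>To make that ownership split concrete, here is a minimal, self-contained sketch. It is not Rayon&rsquo;s code: the names <code>OwningMap</code>, <code>BorrowingProducer</code>, and <code>run</code> are hypothetical, the element type is fixed to <code>i32</code>, and <code>std::thread::scope</code> stands in for Rayon&rsquo;s thread pool. The point is only that the iterator owns the closure while the producers hold plain references to it, so splitting is just a matter of copying two references.</p>

```rust
use std::thread;

// Hypothetical stand-in for the parallel iterator: it *owns* the data
// and the closure for the whole parallel operation.
struct OwningMap<F> {
    data: Vec<i32>,
    map_op: F,
}

// Hypothetical stand-in for the producer: it only *borrows* from the iterator.
struct BorrowingProducer<'a, F> {
    slice: &'a [i32],
    map_op: &'a F,
}

impl<'a, F> BorrowingProducer<'a, F>
where
    F: Fn(i32) -> i32 + Sync,
{
    // Splitting copies two references; no reference counting is involved.
    fn split_at(self, mid: usize) -> (Self, Self) {
        let (left, right) = self.slice.split_at(mid);
        (
            BorrowingProducer { slice: left, map_op: self.map_op },
            BorrowingProducer { slice: right, map_op: self.map_op },
        )
    }

    // Sequentially map and sum this producer's share of the items.
    fn sum(self) -> i32 {
        self.slice.iter().map(|&x| (self.map_op)(x)).sum()
    }
}

impl<F> OwningMap<F>
where
    F: Fn(i32) -> i32 + Sync,
{
    // Consumes the iterator, which stays alive until both halves are done.
    fn run(self) -> i32 {
        let producer = BorrowingProducer { slice: &self.data, map_op: &self.map_op };
        let mid = producer.slice.len() / 2;
        let (left, right) = producer.split_at(mid);
        thread::scope(|s| {
            // One half runs on a scoped thread, the other on this thread.
            let handle = s.spawn(move || left.sum());
            let right_sum = right.sum();
            handle.join().unwrap() + right_sum
        })
        // `self` (and with it the closure) is dropped only here.
    }
}

fn main() {
    let iter = OwningMap { data: vec![1, 2, 3, 4], map_op: |x: i32| x * 2 };
    println!("{}", iter.run());
}
```

<p>Because the scoped thread is joined before <code>run</code> returns, the borrow checker can see that the references held by both producers never outlive the iterator that owns the data.</p>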
<p>Of course, we must also implement the various <code>ParallelIterator</code>
traits for <code>MapParIter</code>. For the basic <code>ParallelIterator</code> this
is straightforward:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="n">ParallelIterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MapParIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nc">ParallelIterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>When it comes to the more advanced classifications, such as
<code>BoundedParallelIterator</code> or <code>IndexedParallelIterator</code>, we can&rsquo;t say
unilaterally whether maps qualify or not. Since maps produce one item
for each item of the base iterator, they inherit their bounds from the
base iterator. If the base iterator is bounded, then a mapped version
is also bounded, and so forth. We can reflect this by tweaking the
where-clauses so that instead of requiring that <code>ITER: ParallelIterator</code>, we require that <code>ITER: BoundedParallelIterator</code> and
so forth:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="n">BoundedParallelIterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MapParIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nc">BoundedParallelIterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="n">IndexedParallelIterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MapParIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nc">IndexedParallelIterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h4 id="converting-a-parallel-iterator-into-a-producer">Converting a parallel iterator into a producer</h4>
<p>So this brings us to the question: how do we convert a <code>MapParIter</code>
into a <code>MapProducer</code>? My first thought was to have a method like
<code>into_producer</code> as part of the <code>IndexedParallelIterator</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// Initial, incorrect approach:
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">IndexedParallelIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Producer</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">into_producer</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Producer</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This would then be called by the <code>sum</code> method to get a producer, which
we could pass to the <code>sum_producer</code> method we wrote
earlier. Unfortunately, while this setup is nice and simple, it
doesn&rsquo;t actually get the ownership structure right. What happens is
that ownership of the iterator passes to the <code>into_producer</code> method,
which then returns a producer &ndash; so all the resources owned by the
iterator must either be transferred to the producer, or else they will
be freed when <code>into_producer</code> returns. But it often happens that we
have shared resources that the producer just wants to borrow, so that
it can cheaply split itself without having to track ref counts or
otherwise figure out when those resources can be freed.</p>
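<p>To see the cost being described, here is a rough sketch of what an owning producer tends to look like when <code>into_producer</code> consumes the iterator. Everything in it is hypothetical (<code>ArcMap</code>, <code>ArcProducer</code>, the fixed <code>i32</code> item type); it is meant only to show that, with nothing left alive to borrow from, the shared state ends up behind an <code>Arc</code> and every split has to touch the reference counts.</p>

```rust
use std::ops::Range;
use std::sync::Arc;
use std::thread;

// Hypothetical iterator that owns its data and closure.
struct ArcMap<F> {
    data: Vec<i32>,
    map_op: F,
}

// Hypothetical *owning* producer: it cannot borrow from the iterator,
// because the iterator is gone once `into_producer` returns.
struct ArcProducer<F> {
    data: Arc<Vec<i32>>,
    range: Range<usize>,
    map_op: Arc<F>,
}

impl<F> ArcMap<F>
where
    F: Fn(i32) -> i32 + Send + Sync + 'static,
{
    // Ownership of everything is transferred into the producer.
    fn into_producer(self) -> ArcProducer<F> {
        let len = self.data.len();
        ArcProducer {
            data: Arc::new(self.data),
            range: 0..len,
            map_op: Arc::new(self.map_op),
        }
    }
}

impl<F> ArcProducer<F>
where
    F: Fn(i32) -> i32 + Send + Sync + 'static,
{
    // `mid` is relative to the start of this producer's range.
    fn split_at(self, mid: usize) -> (Self, Self) {
        let split = self.range.start + mid;
        let left = ArcProducer {
            data: Arc::clone(&self.data),     // ref count bump
            range: self.range.start..split,
            map_op: Arc::clone(&self.map_op), // ref count bump
        };
        let right = ArcProducer {
            data: self.data,
            range: split..self.range.end,
            map_op: self.map_op,
        };
        (left, right)
    }

    fn sum(self) -> i32 {
        let items = &self.data[self.range.clone()];
        items.iter().map(|&x| (*self.map_op)(x)).sum()
    }
}

fn main() {
    let producer = ArcMap { data: vec![1, 2, 3, 4], map_op: |x: i32| x + 1 }.into_producer();
    let (left, right) = producer.split_at(2);
    // Both halves now co-own the closure.
    let owners = Arc::strong_count(&left.map_op);
    let handle = thread::spawn(move || left.sum());
    let total = handle.join().unwrap() + right.sum();
    println!("owners = {owners}, total = {total}");
}
```

<p>This works, but every split pays for atomic increments and decrements that a borrowing producer would not need.</p>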
<p>Really the problem here is that <code>into_producer</code> puts the caller in
charge of deciding how long the producer lives. What we want is a way
to get a producer that can only be used for a limited duration. The
best way to do that is with a <em>callback</em>. The idea is that instead of
calling <code>into_producer</code>, and then having a producer returned to us, we
will call <code>with_producer</code> and pass in a closure as argument. This
closure will then get called with the producer. This producer may have
borrowed references into shared state. Once the closure returns, the
parallel operation is done, and so that shared state can be freed.</p>
<p>The signature looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">IndexedParallelIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">with_producer</span><span class="o">&lt;</span><span class="no">CB</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">callback</span>: <span class="nc">CB</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="no">CB</span>: <span class="nc">ProducerCallback</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now, if you know Rust well, you might be surprised here. I said that
<code>with_producer</code> takes a closure as argument, but typically in Rust
a closure is some type that implements one of the closure traits
(probably <code>FnOnce</code>, in this case, since we only plan to do a single
callback). Instead, I have chosen to use a custom trait, <code>ProducerCallback</code>,
defined as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">ProducerCallback</span><span class="o">&lt;</span><span class="no">ITEM</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">callback</span><span class="o">&lt;</span><span class="n">P</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">producer</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="n">P</span>: <span class="nc">Producer</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="no">ITEM</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Before I get into the reason to use a custom trait, let me just show
you how one would implement <code>with_producer</code> for our map iterator type
(actually, this is a simplified version, I&rsquo;ll revisit this example in
a bit to show the gory details):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="n">IndexedParallelIterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MapParIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nc">IndexedParallelIterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">with_producer</span><span class="o">&lt;</span><span class="no">CB</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">callback</span>: <span class="nc">CB</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="no">CB</span>: <span class="nc">ProducerCallback</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">base_producer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="cm">/* convert base iterator into a
</span></span></span><span class="line"><span class="cl"><span class="cm">                               producer; more on this below */</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">map_producer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MapProducer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">base</span>: <span class="nc">base_producer</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map_op</span>: <span class="kp">&amp;</span><span class="nc">self</span><span class="p">.</span><span class="n">map_op</span><span class="p">,</span><span class="w"> </span><span class="c1">// borrow the map op!
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">callback</span><span class="p">.</span><span class="n">callback</span><span class="p">(</span><span class="n">map_producer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So why did I choose to define a <code>ProducerCallback</code> trait instead of
using <code>FnOnce</code>? The reason is that, by using a custom trait, we can
make the <code>callback</code> method <em>generic</em> over the kind of producer that
will be provided. As you can see below, the <code>callback</code> method just
says it takes some producer type <code>P</code>, but it doesn&rsquo;t get more specific
than that:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">callback</span><span class="o">&lt;</span><span class="n">P</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">producer</span>: <span class="nc">P</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">Output</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="n">P</span>: <span class="nc">Producer</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="no">ITEM</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//    ^~~~~~~~~~~~~~~~~~~~~~
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// It can be called back with *any* producer type `P`.
</span></span></span></code></pre></div><p>In contrast, if I were to use a <code>FnOnce</code> trait, I would have to write
a bound that specifies the producer&rsquo;s type (even if it does so through
an associated type). For example, to use <code>FnOnce</code>, we might change the
<code>IndexedParallelIterator</code> trait as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">IndexedParallelIteratorUsingFnOnce</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Producer</span>: <span class="nc">Producer</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//   ^~~~~~~~
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// The type of producer this iterator creates.
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">with_producer</span><span class="o">&lt;</span><span class="no">CB</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">callback</span>: <span class="nc">CB</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="no">CB</span>: <span class="nb">FnOnce</span><span class="p">(</span><span class="bp">Self</span>::<span class="n">Producer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//               ^~~~~~~~~~~~~~
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// The callback can expect a producer of this type.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>(As an aside, it&rsquo;s conceivable that we could add the ability to write
where clauses like <code>CB: for&lt;P: Producer&gt; FnOnce(P)</code>, which would be
the equivalent of the custom trait, but we don&rsquo;t have that. If you&rsquo;re
not familiar with that <code>for</code> notation, that&rsquo;s fine.)</p>
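<p>For contrast with the <code>FnOnce</code> variant, here is a small, runnable sketch of the generic-callback shape. It uses simplified, hypothetical stand-ins: <code>Producer</code> is reduced to a single <code>into_vec</code> method, <code>VecParIter</code>, <code>SliceProducer</code>, and <code>SumCallback</code> are made-up names, and <code>with_producer</code> is an inherent method that returns <code>CB::Output</code> so that the example can produce a result. What it shows is that the callback is handed a producer whose type mentions a lifetime local to <code>with_producer</code>, and never has to name that type.</p>

```rust
// Simplified, hypothetical versions of the traits discussed in this post.
trait Producer {
    type Item;
    fn into_vec(self) -> Vec<Self::Item>;
}

trait ProducerCallback<ITEM> {
    type Output;
    // Generic over the producer type: the caller of `callback` picks `P`.
    fn callback<P>(self, producer: P) -> Self::Output
    where
        P: Producer<Item = ITEM>;
}

// A producer that borrows from its iterator for some lifetime `'a`.
struct SliceProducer<'a> {
    slice: &'a [i32],
}

impl<'a> Producer for SliceProducer<'a> {
    type Item = i32;
    fn into_vec(self) -> Vec<i32> {
        self.slice.to_vec()
    }
}

// Hypothetical iterator that owns its data.
struct VecParIter {
    data: Vec<i32>,
}

impl VecParIter {
    // Assumption for this sketch: return the callback's output.
    fn with_producer<CB>(self, callback: CB) -> CB::Output
    where
        CB: ProducerCallback<i32>,
    {
        // This borrow's lifetime exists only inside this call, yet the
        // callback accepts the producer because it is generic over `P`.
        let producer = SliceProducer { slice: &self.data };
        callback.callback(producer)
        // `self.data` is freed here, after the callback has returned.
    }
}

// The "closure", written by hand as a struct plus a trait impl.
struct SumCallback;

impl ProducerCallback<i32> for SumCallback {
    type Output = i32;
    fn callback<P>(self, producer: P) -> i32
    where
        P: Producer<Item = i32>,
    {
        producer.into_vec().into_iter().sum()
    }
}

fn main() {
    let total = VecParIter { data: vec![1, 2, 3] }.with_producer(SumCallback);
    println!("{total}");
}
```

<p>Note that <code>SumCallback</code> never mentions <code>SliceProducer</code> or its lifetime; it only relies on the <code>Producer</code> bound.</p>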
<p>You may be wondering what is so bad about adding a <code>Producer</code>
associated type. The answer is that, in order for the <code>Producer</code> to be
able to contain borrowed references into the iterator, its type will
have to name lifetimes that are internal to the <code>with_producer</code>
method. This is because the iterator is owned by the
<code>with_producer</code> method. But you can&rsquo;t write those lifetime names
as the value for an associated type. To see what I mean,
imagine how we would write an <code>impl</code> for our modified
<code>IndexedParallelIteratorUsingFnOnce</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="n">IndexedParallelIteratorUsingFnOnce</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">MapParIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nc">IndexedParallelIteratorUsingFnOnce</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Producer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MapProducer</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="w"> </span><span class="no">ITER</span>::<span class="n">Producer</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//                          ^~
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Wait, what is this lifetime `&#39;m`? This is the lifetime for
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// which the `map_op` is borrowed -- but that is some lifetime
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// internal to `with_producer` (depicted below). We can&#39;t
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// name lifetimes from inside of a method from outside of that
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// method, since those names are not in scope here (and for good
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// reason: the method hasn&#39;t &#34;been called&#34; here, so it&#39;s not
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// clear what we are naming).
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">with_producer</span><span class="o">&lt;</span><span class="no">CB</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">callback</span>: <span class="nc">CB</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="no">CB</span>: <span class="nb">FnOnce</span><span class="p">(</span><span class="bp">Self</span>::<span class="n">Producer</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">base</span><span class="p">.</span><span class="n">with_producer</span><span class="p">(</span><span class="o">|</span><span class="n">base_producer</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="kd">let</span><span class="w"> </span><span class="n">map_producer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MapProducer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// +----+ &#39;m
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">base</span>: <span class="nc">base_producer</span><span class="p">,</span><span class="w">         </span><span class="c1">//      |
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="n">map_op</span>: <span class="kp">&amp;</span><span class="nc">self</span><span class="p">.</span><span class="n">map_op</span><span class="p">,</span><span class="w">        </span><span class="c1">//      |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">};</span><span class="w">                               </span><span class="c1">//      |
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">callback</span><span class="p">(</span><span class="n">map_producer</span><span class="p">);</span><span class="w">          </span><span class="c1">//      |
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">})</span><span class="w">                                   </span><span class="c1">// &lt;----+
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Using the generic <code>ProducerCallback</code> trait totally solves this
problem, but it does mean that writing code which calls
<code>with_producer</code> is kind of awkward. This is because we can&rsquo;t take
advantage of Rust&rsquo;s built-in closure notation, as I was able to do in
the previous, incorrect example. This means we have to &ldquo;desugar&rdquo; the
closure manually, creating a struct that will store our environment.
So if we want to see the full gory details, implementing
<code>with_producer</code> for the map combinator looks like this (btw, here is
the <a href="https://github.com/nikomatsakis/rayon/blob/312fc8ccd7a28289138d2b0d3ce16dfec6269b04/src/par_iter/map.rs#L59-L90">actual code</a> from Rayon):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="n">IndexedParallelIterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MapParIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nc">IndexedParallelIterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">Fn</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Sync</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">with_producer</span><span class="o">&lt;</span><span class="no">CB</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">callback</span>: <span class="nc">CB</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="no">CB</span>: <span class="nc">ProducerCallback</span><span class="o">&lt;</span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">my_callback</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MyCallback</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// defined below
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">callback</span>: <span class="nc">callback</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map_op</span>: <span class="kp">&amp;</span><span class="nc">self</span><span class="p">.</span><span class="n">map_op</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">base</span><span class="p">.</span><span class="n">with_producer</span><span class="p">(</span><span class="n">my_callback</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">struct</span> <span class="nc">MyCallback</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">CB</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//          ^~
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">//
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// This is that same lifetime `&#39;m` we had trouble with
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// in the previous example: but now it only has to be
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// named from *inside* `with_producer`, so we have no
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// problems.
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">callback</span>: <span class="nc">CB</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">map_op</span>: <span class="kp">&amp;</span><span class="na">&#39;m</span> <span class="nc">MAP_OP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="w"> </span><span class="no">ITEM</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">CB</span><span class="o">&gt;</span><span class="w"> </span><span class="n">ProducerCallback</span><span class="o">&lt;</span><span class="no">ITEM</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyCallback</span><span class="o">&lt;</span><span class="na">&#39;m</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">CB</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">where</span><span class="w"> </span><span class="cm">/* omitted for &#34;brevity&#34; :) */</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">type</span> <span class="nc">Output</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">();</span><span class="w"> </span><span class="c1">// return type of `callback`
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// The method that `self.base` will call with the
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// base producer:
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">fn</span> <span class="nf">callback</span><span class="o">&lt;</span><span class="n">P</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">base_producer</span>: <span class="nc">P</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">where</span><span class="w"> </span><span class="n">P</span>: <span class="nc">Producer</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="no">ITEM</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// Wrap the base producer in a MapProducer.
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="kd">let</span><span class="w"> </span><span class="n">map_producer</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">MapProducer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                   </span><span class="n">base</span>: <span class="nc">base_producer</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                   </span><span class="n">map_op</span>: <span class="nc">self</span><span class="p">.</span><span class="n">map_op</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="p">};</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// Finally, callback the original callback,
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// giving them our `map_producer`.
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="bp">self</span><span class="p">.</span><span class="n">callback</span><span class="p">.</span><span class="n">callback</span><span class="p">(</span><span class="n">map_producer</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="conclusions">Conclusions</h3>
<p>OK, whew! We&rsquo;ve now covered <strong>parallel producers</strong> from start to
finish. The design you see here did not emerge fully formed: it is the
result of a lot of iteration. This design has some nice features, many
of which are shared with sequential iterators:</p>
<ul>
<li><strong>Efficient fallback to sequential processing.</strong> If you are
processing a small amount of data, we will never bother with
&ldquo;splitting&rdquo; the producer, and we&rsquo;ll just fall back to using the same
old sequential iterators you were using before, so you should have
very little performance loss. When processing larger amounts of
data, we will divide into threads &ndash; which you want &ndash; but when the
chunks get small enough, we&rsquo;ll use the same sequential processing to
handle the leaves.</li>
<li><strong>Lazy, no allocation, etc.</strong> You&rsquo;ll note that nowhere in any of the
above code did we do any allocation or eager computation.</li>
<li><strong>Straightforward, no unsafe code.</strong> Something else that you didn&rsquo;t
see in this blog post: unsafe code. All the unsafety is packaged up
in Rayon&rsquo;s join method, and most of the parallel iterator code just
leverages that. Overall, apart from the manual closure &ldquo;desugaring&rdquo;
in the last section, writing producers is really pretty
straightforward.</li>
</ul>
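<p>To make the first bullet concrete, here is a minimal sketch of the &ldquo;split until small, then go sequential&rdquo; shape. It is not Rayon&rsquo;s code: it uses <code>std::thread::scope</code> as a stand-in for Rayon&rsquo;s join, and the names <code>split_sum</code> and <code>SEQUENTIAL_CUTOFF</code> (and the cutoff value itself) are made up for illustration.</p>

```rust
use std::thread;

// Hypothetical cutoff below which we stop splitting. Rayon picks its own
// splitting strategy; this constant exists only for the sketch.
const SEQUENTIAL_CUTOFF: usize = 1024;

/// Sum a slice by splitting it in half until the pieces are small,
/// then handling each leaf with an ordinary sequential iterator.
fn split_sum(data: &[i64]) -> i64 {
    if data.len() <= SEQUENTIAL_CUTOFF {
        // Leaf: the same old sequential iterator.
        return data.iter().sum();
    }

    let (left, right) = data.split_at(data.len() / 2);

    // Stand-in for a fork-join primitive: run the left half on a scoped
    // thread and the right half on the current thread.
    thread::scope(|s| {
        let left_handle = s.spawn(|| split_sum(left));
        let right_sum = split_sum(right);
        left_handle.join().expect("left half panicked") + right_sum
    })
}
```

<p>A slice at or below the cutoff never spawns anything, so small inputs pay only for a length check on top of the sequential sum.</p>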
<h4 id="things-i-learned">Things I learned</h4>
<p>My last point above &ndash; that writing producers is fairly
straightforward &ndash; was certainly not always the case: the initial
designs required a lot more &ldquo;stuff&rdquo; &ndash; phantom types, crazy
lifetimes, etc. But I found that these are often signs that your
traits could be adjusted to make things go more smoothly. Some of the
primary lessons follow.</p>
<p><strong>Align input/output type parameters on traits to go with dataflow.</strong>
One of the biggest sources of problems for me was that I was overusing
associated types, which wound up requiring a lot of phantom types and
other things. At least in these cases, what worked well as a rule of
thumb was this: if data is &ldquo;flowing in&rdquo; to the trait, it should be an
input type parameter. If data is &ldquo;flowing out&rdquo;, it should be an
associated type. So, for example, producers have an associated type
<code>Item</code>, which indicates the kind of data a <code>Producer</code> or iterator will
produce. But the <code>ProducerCallback&lt;T&gt;</code> trait is
parameterized over <code>T</code>, the type of item that the base producer will create.</p>
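<p>Here is a stripped-down sketch of that rule of thumb. The trait names mirror the ones in this post, but the bodies are simplified stand-ins (the real traits have more methods and bounds), and <code>Range</code>, <code>SumCallback</code>, <code>produce</code>, and <code>with_range</code> are hypothetical names invented for the example.</p>

```rust
// Simplified, hypothetical versions of the traits discussed above.

/// Data flows *out* of a producer, so the item type is an associated type.
trait Producer {
    type Item;
    fn produce(self) -> Vec<Self::Item>;
}

/// Data flows *in* to a callback, so the item type is an input type parameter.
trait ProducerCallback<T> {
    type Output;
    fn callback<P>(self, producer: P) -> Self::Output
    where
        P: Producer<Item = T>;
}

/// A toy producer that yields the integers in `0..n`.
struct Range(u32);

impl Producer for Range {
    type Item = u32;
    fn produce(self) -> Vec<u32> {
        (0..self.0).collect()
    }
}

/// A toy callback that sums whatever `u32` producer it is handed.
struct SumCallback;

impl ProducerCallback<u32> for SumCallback {
    type Output = u32;
    fn callback<P>(self, producer: P) -> u32
    where
        P: Producer<Item = u32>,
    {
        producer.produce().into_iter().sum()
    }
}

/// Hand a `Range` producer to any callback that accepts `u32` items.
fn with_range<CB: ProducerCallback<u32>>(n: u32, cb: CB) -> CB::Output {
    cb.callback(Range(n))
}
```

<p>Because <code>T</code> is an input, one callback type could implement <code>ProducerCallback</code> for several item types, while each producer commits to exactly one <code>Item</code>.</p>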
<p><strong>Choose RAII vs callbacks based on who needs control.</strong> When
designing APIs, we often tend to prefer RAII over callbacks. The
immediate reason is often superficial: callbacks lead to rightward
drift. But there is also a deeper reason: RAII can be more flexible.</p>
<p>Effectively, whether you use the RAII pattern or a callback, there is
always some kind of dynamic &ldquo;scope&rdquo; associated with the thing you are
doing.  If you are using a callback, that scope is quite explicit: you
will invoke the callback, and the scope corresponds to the time while
that callback is executing. Once the callback returns, the scope is
over, and you are back in control.</p>
<p>With RAII, the scope is open-ended. You are returning a value to your
caller that has a destructor &ndash; this means that the scope lasts until
your caller chooses to dispose of that value, which may well be
<strong>never</strong> (particularly since they could leak it). That is why I say
RAII is more flexible: it gives the caller control over the scope of
the operation.  Concretely, this means that the caller can return the
RAII value up to their caller, store it in a hashmap, whatever.</p>
<p>But that control also comes at a cost to you. For example, if you have
resources that have to live for the entire &ldquo;scope&rdquo; of the operation
you are performing, and you are using a callback, you can easily
leverage the stack to achieve this. Those resources just live on your
stack frame &ndash; and so naturally they are live when you call the
callback, and remain live until the callback returns. But if you are
using RAII, you have to push ownership of those resources into the
value that you will return. This in turn can make borrowing and
sharing harder.</p>
<p>So, in short, if you can align the scopes of your program with
callbacks and the stack frame, everything works out more easily, but
you lose some flexibility on the part of your callers (and you incur
some rightward drift). Whether that is ok will depend on the context
&ndash; in the case of Rayon, it&rsquo;s perfectly fine. The real user is just
calling <code>sum</code>, and they have to block until <code>sum</code> returns anyway to
get the result. So it&rsquo;s no problem if <code>sum</code> internally uses a callback
to phase the parallel operation. But in other contexts the
requirements may be different.</p>
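<p>The trade-off can be sketched with a toy scratch buffer. Nothing here comes from Rayon: <code>with_scratch</code> and <code>ScratchGuard</code> are hypothetical names, and the buffer stands in for whatever resource has to live for the whole operation.</p>

```rust
// Hypothetical types for illustration only.

/// Callback style: the scratch buffer lives on this function's stack frame
/// for exactly as long as `body` runs.
fn with_scratch<R>(len: usize, body: impl FnOnce(&mut [u8]) -> R) -> R {
    let mut scratch = vec![0u8; len];
    body(scratch.as_mut_slice())
    // `scratch` is dropped here, once the callback has returned.
}

/// RAII style: the guard has to own the buffer, because the caller
/// decides when (or whether) the scope ends.
struct ScratchGuard {
    scratch: Vec<u8>,
}

impl ScratchGuard {
    fn new(len: usize) -> ScratchGuard {
        ScratchGuard { scratch: vec![0u8; len] }
    }

    /// Borrow the buffer for as long as the guard itself is borrowed.
    fn buffer(&mut self) -> &mut [u8] {
        &mut self.scratch
    }
}
```

<p>With <code>with_scratch</code>, the callee keeps control and the buffer never escapes. With <code>ScratchGuard</code>, the caller can stash the guard in a collection or return it further up, which is exactly why ownership of the buffer had to move into the guard.</p>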
<h4 id="whats-to-come">What&rsquo;s to come</h4>
<p>I plan to write up a third blog post, about parallel consumers, in the
not too distant future. But I might take a break for a bit, because I
have a bunch of other half-finished posts I want to write up, covering
topics like specialization, the borrow checker, and a nascent grammar
for Rust using LALRPOP.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/rayon" term="rayon" label="Rayon"/></entry><entry><title type="html">Parallel Iterators Part 1: Foundations</title><link href="https://smallcultfollowing.com/babysteps/blog/2016/02/19/parallel-iterators-part-1-foundations/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2016/02/19/parallel-iterators-part-1-foundations/</id><published>2016-02-19T00:00:00+00:00</published><updated>2016-02-19T06:32:44-05:00</updated><content type="html"><![CDATA[<p>Since <a href="https://air.mozilla.org/bay-area-rust-meetup-january-2016/">giving a talk about Rayon at the Bay Area Rust meetup</a>,
I&rsquo;ve been working, off and on, on support for <em>parallel
iterators</em>. The basic idea of a parallel iterator is that I should be
able to take an existing iterator chain, which operates sequentially,
and easily convert it to work in parallel. As a simple example,
consider this bit of code that computes the dot-product of two
vectors:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>Using parallel iterators, all I have to do to make this run in
parallel is change the <code>iter</code> calls into <code>par_iter</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">par_iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>This new iterator chain is now using Rayon&rsquo;s parallel iterators
instead of the standard Rust ones. Of course, implementing this simple
idea turns out to be rather complicated in practice. I&rsquo;ve had to
iterate on the design many times as I tried to add new combinators. I
wanted to document the design, but it&rsquo;s too much for just one blog
post. Therefore, I&rsquo;m writing up a little series of blog posts that
cover the design in pieces:</p>
<ul>
<li><strong>This post: sequential iterators.</strong> I realized while writing the
other two posts that it would make sense to first describe
sequential iterators in detail, so that I could better highlight
where parallel iterators differ. This post therefore covers the
iterator chain above and shows how it is implemented.</li>
<li>Next post: parallel producers.</li>
<li>Final post: parallel consumers.</li>
</ul>
<!-- more -->
<h4 id="review-sequential-iterators">Review: sequential iterators</h4>
<p>Before we get to parallel iterators, let&rsquo;s start by covering how
Rust&rsquo;s <em>sequential</em> iterators work. The basic idea is that iterators
are lazy, in the sense that constructing an iterator chain does not
actually <em>do</em> anything until you &ldquo;execute&rdquo; that iterator, either with
a <code>for</code> loop or with a method like <code>sum</code>. In the example above, that
means that the calls in the chain <code>vec1.iter().zip(...).map(...)</code> are all
operations that just build up an iterator, without actually <em>doing</em>
anything. Only when we call <code>sum</code> do we start actually doing work.</p>
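<p>One way to see the laziness is to count how often the <code>map</code> closure runs. This is a small sketch using only the standard library; <code>count_map_calls</code> is a hypothetical helper written for this example.</p>

```rust
use std::cell::Cell;

/// Returns (closure calls before `sum`, closure calls after `sum`, the sum).
fn count_map_calls(values: &[i32]) -> (usize, usize, i32) {
    let calls = Cell::new(0);

    // Building the chain only constructs iterator values; the closure
    // has not been invoked yet.
    let chain = values.iter().map(|v| {
        calls.set(calls.get() + 1);
        v * 2
    });
    let before = calls.get();

    // `sum` drives the chain by calling `next` until it returns `None`.
    let total: i32 = chain.sum();
    let after = calls.get();

    (before, after, total)
}
```

<p>For a three-element slice, the closure has run zero times before <code>sum</code> and three times after it.</p>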
<p>In sequential iterators, the key to this is
<a href="http://doc.rust-lang.org/std/iter/trait.Iterator.html">the <code>Iterator</code> trait</a>.  This trait is actually very simple; it
basically contains two members of interest:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w"> </span><span class="c1">// The type of item we will produce
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w"> </span><span class="c1">// Request the next item
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The idea is that, for each collection, we have a method that will
return some kind of iterator type which implements this <code>Iterator</code>
trait. So let&rsquo;s walk through all the pieces of our example iterator
chain one by one (I&rsquo;ve highlighted the steps in comments below):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">              </span><span class="c1">// Slice iterator (over `vec1`)
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">iter</span><span class="p">())</span><span class="w">    </span><span class="c1">// Zip iterator (over two slice iterators)
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w"> </span><span class="c1">// Map iterator
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">               </span><span class="c1">// Sum executor
</span></span></span></code></pre></div><h5 id="slice-iterators">Slice iterators</h5>
<p>The very start of our iterator chain was a call to <code>vec1.iter()</code>. Here
<code>vec1</code> is a slice of integers, so it has a type like <code>&amp;[i32]</code>. (A
<em>slice</em> is a subportion of a vector or array.) But the <code>iter()</code> method
(and the iterator it returns) is defined generically for slices of any
type <code>T</code>. The method looks something like this (because this method
applies to all slices in every crate, you can only write an impl like
this in the standard library):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">iter</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">SliceIter</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">SliceIter</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">slice</span>: <span class="nc">self</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>It creates and returns a value of the struct <code>SliceIter</code>, which is the
type of the slice iterator (in the standard library, this type is
<a href="http://doc.rust-lang.org/std/slice/struct.Iter.html"><code>std::slice::Iter</code></a>, though it&rsquo;s implemented somewhat
differently). The definition of <code>SliceIter</code> looks something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">SliceIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>: <span class="na">&#39;iter</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">slice</span>: <span class="kp">&amp;</span><span class="na">&#39;iter</span> <span class="p">[</span><span class="n">T</span><span class="p">],</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>SliceIter</code> type has only one field, <code>slice</code>, which stores the
slice we are iterating over. Each time we produce a new item, we will
update this field to contain a subslice with just the remaining items.</p>
<p>If you&rsquo;re wondering what the <code>'iter</code> notation means, it represents the
<em>lifetime</em> of the slice, meaning the span of the code where that
reference is in use. In general, lifetimes can be elided within
function signatures and bodies, but they must be made explicit in type
definitions. In any case, without going into too much detail here, the
net effect of this annotation is to ensure that the iterator does not
outlive the slice that it is iterating over.</p>
<p>Now, to use <code>SliceIter</code> as an iterator, we must implement the
<code>Iterator</code> trait. We want to yield up a reference <code>&amp;T</code> to each item in
the slice in turn. The idea is that each time we call <code>next</code>, we will
peel off a reference to the first item in <code>self.slice</code>, and then
adjust <code>self.slice</code> to contain only the remaining items. That looks
something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">SliceIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Each round, we will yield up a reference to `T`. This reference
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// is valid for as long as the iterator is valid.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// `split_first` gives us the first item (`head`) and
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// a slice with the remaining items (`tail`),
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// returning None if the slice is empty.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">((</span><span class="n">head</span><span class="p">,</span><span class="w"> </span><span class="n">tail</span><span class="p">))</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">slice</span><span class="p">.</span><span class="n">split_first</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="bp">self</span><span class="p">.</span><span class="n">slice</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tail</span><span class="p">;</span><span class="w"> </span><span class="c1">// update slice w/ the remaining items
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="n">head</span><span class="p">)</span><span class="w"> </span><span class="c1">// return the first item
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w"> </span><span class="c1">// no more items to yield up
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h5 id="zip-iterators">Zip iterators</h5>
<p>Ok, so let&rsquo;s return to our example iterator chain:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>We&rsquo;ve now seen how <code>vec1.iter()</code> and <code>vec2.iter()</code> work, but what
about <code>zip</code>? The <a href="http://doc.rust-lang.org/std/iter/trait.Iterator.html#method.zip">zip iterator</a> is an adapter that takes two
other iterators and walks over them in lockstep. The return type
of <code>zip</code> then is going to be a type <code>ZipIter</code> that just stores
two other iterators:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">ZipIter</span><span class="o">&lt;</span><span class="n">A</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>: <span class="nb">Iterator</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">a</span>: <span class="nc">A</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">b</span>: <span class="nc">B</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here the generic types <code>A</code> and <code>B</code> represent the types of the
iterators being zipped up.  Each iterator chain has its own type that
determines exactly how it works. In this example we are going to zip
up two slice iterators, so the full type of our zip iterator will be
<code>ZipIter&lt;SliceIter&lt;'a, i32&gt;, SliceIter&lt;'b, i32&gt;&gt;</code> (but we never have
to write that down, it&rsquo;s all fully inferred by the compiler).</p>
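<p>To make the &ldquo;fully inferred&rdquo; point concrete, here is a small sketch that writes the type out by hand. It assumes the standard library's <code>std::iter::Zip</code> and <code>std::slice::Iter</code>, which play the roles of the <code>ZipIter</code> and <code>SliceIter</code> types in this post; the helper function <code>zipped</code> is a made-up name for illustration only.</p>

```rust
use std::iter::Zip;
use std::slice::Iter;

// Hypothetical helper: spells out the zip iterator's type explicitly.
// `Zip` and `Iter` are std's counterparts to `ZipIter` and `SliceIter`.
fn zipped<'a, 'b>(v1: &'a [i32], v2: &'b [i32]) -> Zip<Iter<'a, i32>, Iter<'b, i32>> {
    v1.iter().zip(v2.iter())
}

fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6, 7];

    // Normally we would write `let pairs = ...` and let the compiler infer this.
    let pairs: Zip<Iter<'_, i32>, Iter<'_, i32>> = zipped(&vec1, &vec2);

    // Zipping stops as soon as either side runs out, so we get three pairs.
    let collected: Vec<(&i32, &i32)> = pairs.collect();
    assert_eq!(collected.len(), 3);
}
```

<p>The annotation on <code>pairs</code> is only there to show what the compiler works out on its own; deleting it changes nothing about the program.</p>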
<p>When implementing the <code>Iterator</code> trait for <code>ZipIter</code>, we just want the
<code>next</code> method to draw the next item from <code>a</code> and <code>b</code> and pair them up,
stopping when either is empty:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">A</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>: <span class="nb">Iterator</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">ZipIter</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span><span class="n">B</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="n">A</span>::<span class="n">Item</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>::<span class="n">Item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="p">(</span><span class="n">A</span>::<span class="n">Item</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>::<span class="n">Item</span><span class="p">)</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">a_item</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">a</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="k">if</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">b_item</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">b</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// If both iterators have another item to
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// give, pair them up and return it to
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// the user.
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="k">return</span><span class="w"> </span><span class="nb">Some</span><span class="p">((</span><span class="n">a_item</span><span class="p">,</span><span class="w"> </span><span class="n">b_item</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nb">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h5 id="map-iterators">Map iterators</h5>
<p>The next step in our example iterator chain is the call to <code>map</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>Map is another iterator adapter, this time one that applies a function
to each item we are iterating, and then yields the result of that
function call. The <code>MapIter</code> type winds up with three generic types:</p>
<ul>
<li><code>ITER</code>, the type of the base iterator;</li>
<li><code>MAP_OP</code>, the type of the closure that we will apply at each step (in
Rust, closures each have their own unique type);</li>
<li><code>RET</code>, the return type of that closure, which will be the type of the
items that we yield on each step.</li>
</ul>
<p>The definition looks like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">MapIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">FnMut</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">base</span>: <span class="nc">ITER</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">map_op</span>: <span class="nc">MAP_OP</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>(As an aside, here I&rsquo;ve switched to using a where clause to write out
the constraints on the various parameters. This is just a stylistic
choice: I find it easier to read if they are separated out.)</p>
<p>In any case, I want to focus on the second where clause for a moment:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">where</span><span class="w"> </span><span class="no">MAP_OP</span>: <span class="nb">FnMut</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w">
</span></span></span></code></pre></div><p>There&rsquo;s a lot packed in here. First, we said that <code>MAP_OP</code> was the
type of the closure that we are going to be mapping over: <code>FnMut</code> is
<a href="http://doc.rust-lang.org/std/ops/trait.FnMut.html">one of Rust&rsquo;s standard closure traits</a>;
it indicates a function that will be called repeatedly in a sequential
fashion (notice I said <em>sequential</em>; we&rsquo;ll have to adjust this later
when we want to generalize to parallel execution). It&rsquo;s called <code>FnMut</code>
because it takes an <code>&amp;mut self</code> reference to its environment, and thus
it can mutate data from the enclosing scope.</p>
<p>The where clause also indicates the argument and return type of the
closure. <code>MAP_OP</code> will take one argument, <code>ITER::Item</code> &ndash; this is the
type of item that our base iterator produces &ndash; and it will return
values of type <code>RET</code>.</p>
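<p>As a small illustration of what an <code>FnMut</code> bound permits, here is a sketch in which the closure mutates a counter from its enclosing scope. The function <code>apply_to_all</code> is a hypothetical stand-in that carries the same bound as <code>MAP_OP</code>; it is not part of the iterator design itself.</p>

```rust
// Hypothetical helper with the same shape of bound as `MAP_OP`:
// it draws each item from `iter` and feeds it to `map_op`.
fn apply_to_all<ITER, MAP_OP, RET>(iter: ITER, mut map_op: MAP_OP) -> Vec<RET>
where
    ITER: Iterator,
    MAP_OP: FnMut(ITER::Item) -> RET,
{
    let mut results = Vec::new();
    for item in iter {
        // Calling an `FnMut` closure requires mutable access to it,
        // which is why the parameter is declared `mut`.
        results.push(map_op(item));
    }
    results
}

fn main() {
    let items: [i32; 3] = [1, 2, 3];
    let mut calls = 0;

    let doubled = apply_to_all(items.iter(), |x| {
        calls += 1; // mutates a variable from the enclosing scope
        x * 2
    });

    assert_eq!(doubled, vec![2, 4, 6]);
    assert_eq!(calls, 3);
}
```

<p>Because the closure is invoked once per item, one call after another, the mutable borrow of <code>calls</code> never needs to be shared; that is the sequential assumption mentioned above.</p>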
<p>OK, now let&rsquo;s write the iterator itself:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MapIter</span><span class="o">&lt;</span><span class="no">ITER</span><span class="p">,</span><span class="w"> </span><span class="no">MAP_OP</span><span class="p">,</span><span class="w"> </span><span class="no">RET</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="no">ITER</span>: <span class="nb">Iterator</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="no">MAP_OP</span>: <span class="nb">FnMut</span><span class="p">(</span><span class="no">ITER</span>::<span class="n">Item</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">RET</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// We yield up whatever type `MAP_OP` returns:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="no">RET</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="no">RET</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">base</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// No more items in base iterator:
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">None</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="c1">// If there is an item...
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="n">item</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="c1">// ...apply `map_op` and return the result:
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nb">Some</span><span class="p">((</span><span class="bp">self</span><span class="p">.</span><span class="n">map_op</span><span class="p">)(</span><span class="n">item</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h5 id="pulling-it-all-together-the-sum-operation">Pulling it all together: the sum operation</h5>
<p>The final step is the actual summation. This turns out to be fairly
straightforward. The <a href="http://doc.rust-lang.org/std/iter/trait.Iterator.html#method.sum">actual <code>sum</code> method</a> is designed to work over any
kind of type that can be added in a generic way, but in the interest
of simplicity, let me just give you a version of <code>sum</code> that works on
integers (I&rsquo;ll also write it as a free function rather than a method):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">sum</span><span class="o">&lt;</span><span class="no">ITER</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="kt">i32</span><span class="o">&gt;&gt;</span><span class="p">(</span><span class="k">mut</span><span class="w"> </span><span class="n">iter</span>: <span class="nc">ITER</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">i32</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">while</span><span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">iter</span><span class="p">.</span><span class="n">next</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">result</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">v</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">result</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Here we take in some iterator of type <code>ITER</code>. We don&rsquo;t care what kind
of iterator it is, but it must produce integers, which is what the
<code>Iterator&lt;Item=i32&gt;</code> bound means. Next we repeatedly call <code>next</code> to
draw all the items out of the iterator; at each step, we add them up.</p>
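<p>For readers who want to run the whole chain, here is a compact sketch that wires the pieces from this section together by hand. It makes a few assumptions of its own: the <code>next</code> methods use the <code>?</code> operator instead of <code>if let</code> to save space, the <code>MapIter</code> struct carries only two type parameters (as the standard library's <code>Map</code> does) so that it compiles without a <code>PhantomData</code> field, and <code>dot_product</code> is a hypothetical wrapper name.</p>

```rust
// Slice iterator: yields a reference to each item in turn.
pub struct SliceIter<'iter, T> {
    slice: &'iter [T],
}

impl<'iter, T> Iterator for SliceIter<'iter, T> {
    type Item = &'iter T;

    fn next(&mut self) -> Option<&'iter T> {
        // `?` returns None early when the slice is empty.
        let (head, tail) = self.slice.split_first()?;
        self.slice = tail;
        Some(head)
    }
}

// Zip adapter: walks two iterators in lockstep.
pub struct ZipIter<A: Iterator, B: Iterator> {
    a: A,
    b: B,
}

impl<A: Iterator, B: Iterator> Iterator for ZipIter<A, B> {
    type Item = (A::Item, B::Item);

    fn next(&mut self) -> Option<(A::Item, B::Item)> {
        let a_item = self.a.next()?;
        let b_item = self.b.next()?;
        Some((a_item, b_item))
    }
}

// Map adapter. Assumption: `RET` appears only on the impl, not the struct.
#[allow(non_camel_case_types)]
pub struct MapIter<ITER, MAP_OP> {
    base: ITER,
    map_op: MAP_OP,
}

#[allow(non_camel_case_types)]
impl<ITER, MAP_OP, RET> Iterator for MapIter<ITER, MAP_OP>
where
    ITER: Iterator,
    MAP_OP: FnMut(ITER::Item) -> RET,
{
    type Item = RET;

    fn next(&mut self) -> Option<RET> {
        let item = self.base.next()?;
        Some((self.map_op)(item))
    }
}

// Integer-only `sum`, written as a free function.
fn sum<ITER: Iterator<Item = i32>>(mut iter: ITER) -> i32 {
    let mut result = 0;
    while let Some(v) = iter.next() {
        result += v;
    }
    result
}

// Hypothetical wrapper: builds the chain by hand instead of via methods.
fn dot_product(vec1: &[i32], vec2: &[i32]) -> i32 {
    let zipped = ZipIter {
        a: SliceIter { slice: vec1 },
        b: SliceIter { slice: vec2 },
    };
    let mapped = MapIter {
        base: zipped,
        map_op: |(i, j): (&i32, &i32)| i * j,
    };
    sum(mapped)
}

fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];
    // 1*4 + 2*5 + 3*6 = 32
    assert_eq!(dot_product(&vec1, &vec2), 32);
}
```

<p>Nothing runs until <code>sum</code> starts calling <code>next</code>: building <code>zipped</code> and <code>mapped</code> only nests structs inside one another, and each call to <code>next</code> pulls one item through the whole stack.</p>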
<h5 id="one-last-little-detail">One last little detail</h5>
<p>There is one last piece of the iterator puzzle that I would like to
cover, because I make use of it in the parallel iterator design. In my
example, I created iterators explicitly by calling <code>iter</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">.</span><span class="n">iter</span><span class="p">())</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>But you may have noticed that in idiomatic Rust code, this explicit call to
<code>iter</code> can sometimes be elided. For example, if I were actually writing
that iterator chain, I wouldn&rsquo;t call <code>iter()</code> from within the call to <code>zip</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">vec1</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">zip</span><span class="p">(</span><span class="n">vec2</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="o">|</span><span class="w"> </span><span class="n">i</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">.</span><span class="n">sum</span><span class="p">()</span><span class="w">
</span></span></span></code></pre></div><p>Similarly, if you are writing a simple for loop that just goes over a
container or slice, you can often elide the call to <code>iter</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">for</span><span class="w"> </span><span class="n">item</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">vec2</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">process</span><span class="p">(</span><span class="n">item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So what is going on here? The answer is that we have another trait
called <code>IntoIterator</code>, which defines what types can be converted
into iterators:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">IntoIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// the type of item our iterator will produce
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// the iterator type we will become
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">IntoIter</span>: <span class="nb">Iterator</span><span class="o">&lt;</span><span class="n">Item</span><span class="o">=</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// convert this value into an iterator
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">into_iter</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Self</span>::<span class="n">IntoIter</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Naturally, anything which is itself an iterator implements
<code>IntoIterator</code> automatically &ndash; it just gets &ldquo;converted&rdquo; into itself,
since it is already an iterator. Container types also implement
<code>IntoIterator</code>. The usual convention is that the container type itself
implements <code>IntoIterator</code> so as to give ownership of its contents:
e.g., converting <code>Vec&lt;T&gt;</code> into an iterator takes ownership of the
vector and gives back an iterator yielding ownership of its <code>T</code>
elements.  However, converting a <em>reference</em> to a vector (e.g.,
<code>&amp;Vec&lt;T&gt;</code>) gives back <em>references</em> to the elements (<code>&amp;T</code>). Similarly, converting a borrowed slice like <code>&amp;[T]</code> into an iterator also gives back references to the elements (<code>&amp;T</code>). We can implement <code>IntoIterator</code> for <code>&amp;[T]</code> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">IntoIterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// as we saw before, iterating over a slice gives back references
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// to the items within
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;iter</span><span class="w"> </span><span class="n">T</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// the iterator type we defined earlier
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">IntoIter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">SliceIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">into_iter</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">SliceIter</span><span class="o">&lt;</span><span class="na">&#39;iter</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Finally, the <code>zip</code> helper method uses <code>IntoIterator</code> to convert its
argument into an iterator:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">zip</span><span class="o">&lt;</span><span class="n">B</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">other</span>: <span class="nc">B</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">ZipIter</span><span class="o">&lt;</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>::<span class="n">IntoIter</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="n">B</span>: <span class="nb">IntoIterator</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">ZipIter</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">a</span>: <span class="nc">self</span><span class="p">,</span><span class="w"> </span><span class="n">b</span>: <span class="nc">other</span><span class="p">.</span><span class="n">into_iter</span><span class="p">()</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h5 id="taking-a-step-back">Taking a step back</h5>
<p>Now that we&rsquo;ve covered the whole iterator chain, let&rsquo;s take a moment
to reflect on some interesting properties of this whole setup. First,
notice that as we create our iterator chain, nothing actually
<em>happens</em> until we call <code>sum</code>. That is, you might expect that calling
<code>vec1.iter().zip(vec2.iter())</code> would go and allocate a new vector that
contains pairs from both slices, but, as we&rsquo;ve seen, it does not. It
just creates a <code>ZipIter</code> that holds references to both slices. In
fact, no vector of pairs is <em>ever</em> created (unless you ask for one by
calling <code>collect</code>). Thus iteration can be described as <em>lazy</em>, since
the various effects described by an iterator take place at the last
possible time.</p>
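<p>To see the laziness concretely, here is a minimal sketch using only the standard library. The function name <code>lazy_dot</code> and the call counter are illustrative additions, not part of the iterator code above; the counter simply records how often the <code>map</code> closure has run.</p>

```rust
use std::cell::Cell;

/// Returns (closure calls before `sum`, closure calls after `sum`, dot product).
/// `lazy_dot` is a hypothetical helper written for this illustration.
fn lazy_dot(v1: &[i32], v2: &[i32]) -> (u32, u32, i32) {
    // Counts how many times the `map` closure actually runs.
    let calls = Cell::new(0u32);

    // Building the chain only constructs nested iterator structs.
    let chain = v1.iter().zip(v2.iter()).map(|(a, b)| {
        calls.set(calls.get() + 1);
        a * b
    });
    let before = calls.get();

    // `sum` drives the chain by calling `next` until it is exhausted.
    let dot: i32 = chain.sum();
    (before, calls.get(), dot)
}

fn main() {
    let vec1 = vec![1, 2, 3];
    let vec2 = vec![4, 5, 6];
    // No multiplication happens until `sum` runs; then one per pair.
    assert_eq!(lazy_dot(&vec1, &vec2), (0, 3, 32));
}
```

<p>The counter stays at zero after the chain is built and only advances once <code>sum</code> starts pulling items through it.</p>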
<p>The other neat thing is that while all of this code looks very
abstract, it actually optimizes to something very efficient. This is a
side effect of all those generic types that we saw before. They
basically ensure that the resulting iterator has a type that describes
<em>precisely</em> what it is going to do. The compiler will then generate a
custom copy of each iterator function tailored to that particular
type. So, for example, we wind up with a custom copy of <code>ZipIter</code> that
is specific to iterating over slices, and a custom copy of <code>MapIter</code>
that is specific to multiplying the results of that particular
<code>ZipIter</code>. These copies can then be optimized independently. The end
result is that our dot-product iteration chain winds up being
optimized into some very tight assembly; in fact, it even gets
vectorized. You can verify this yourself by
<a href="http://is.gd/auN5SL">looking at this example on play</a> and clicking
the &ldquo;ASM&rdquo; button (but don&rsquo;t forget to select &ldquo;Release&rdquo; mode). Here is
the inner loop you will see:</p>
<pre tabindex="0"><code>.LBB0_8:
	movdqu	(%rdi,%rbx,4), %xmm1
	movdqu	(%rdx,%rbx,4), %xmm2
	pshufd	$245, %xmm2, %xmm3
	pmuludq	%xmm1, %xmm2
	pshufd	$232, %xmm2, %xmm2
	pshufd	$245, %xmm1, %xmm1
	pmuludq	%xmm3, %xmm1
	pshufd	$232, %xmm1, %xmm1
	punpckldq	%xmm1, %xmm2
	paddd	%xmm2, %xmm0
	addq	$4, %rbx
	incq	%rax
	jne	.LBB0_8
</code></pre><p>Neat.</p>
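<p>As a rough illustration of why the generic types matter, the sketch below writes the same dot product twice: once with generic iterator parameters, and once with boxed trait objects. Both function names are made up for this example. The first form gives the compiler a concrete type to specialize on; the second routes every <code>next</code> call through a vtable, which is closer in spirit to interface-based iterators and generally harder to optimize.</p>

```rust
/// Generic version: `I` and `J` are concrete iterator types, so each call
/// site gets its own monomorphized copy with statically dispatched calls.
fn dot_static<I, J>(a: I, b: J) -> i32
where
    I: Iterator<Item = i32>,
    J: Iterator<Item = i32>,
{
    a.zip(b).map(|(x, y)| x * y).sum()
}

/// Trait-object version: the concrete iterator types are erased, so the
/// calls to `next` on `a` and `b` are dynamically dispatched.
fn dot_dynamic<'a>(
    a: Box<dyn Iterator<Item = i32> + 'a>,
    b: Box<dyn Iterator<Item = i32> + 'a>,
) -> i32 {
    a.zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = vec![4, 5, 6];

    let s = dot_static(v1.iter().copied(), v2.iter().copied());
    let d = dot_dynamic(Box::new(v1.iter().copied()), Box::new(v2.iter().copied()));

    // Same answer either way; only the dispatch strategy differs.
    assert_eq!(s, 32);
    assert_eq!(d, 32);
}
```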
<h3 id="recap">Recap</h3>
<p>So let&rsquo;s review the critical points of sequential iterators:</p>
<ul>
<li>They are <strong>lazy</strong>. No work is done until you call <code>next</code>, and then the iterator
does the minimal amount of work it can to produce a result.</li>
<li>They <strong>do not allocate</strong> (unless you ask them to). None of the code
we wrote here requires allocating any memory or builds up any
intermediate data structures. Of course, if you use an operation
like <code>collect</code>, which accumulates the iterator&rsquo;s items into a vector
or other data structure, building that data structure will require
allocating memory.</li>
<li>They are <strong>generic and highly optimizable</strong>. Each iterator
combinator uses generic type parameters to represent the types of
the prior iterator that it builds on, as well as any closures that
it references. This means that the compiler will make a custom copy
of the iterator specialized to that particular task, which is very
amenable to optimization.
<ul>
<li>This is in sharp contrast to iterators in languages like Java,
which are based on virtual dispatch and generic interfaces.  The
design is similar, but the resulting code is very different.</li>
</ul>
</li>
</ul>
<p>So in summary, you get to write really <strong>high-level, convenient</strong> code
with really <strong>low-level, efficient</strong> performance.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/><category scheme="https://smallcultfollowing.com/babysteps/categories/rayon" term="rayon" label="Rayon"/></entry><entry><title type="html">Rayon: data parallelism in Rust</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/12/18/rayon-data-parallelism-in-rust/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/12/18/rayon-data-parallelism-in-rust/</id><published>2015-12-18T00:00:00+00:00</published><updated>2015-12-18T09:52:00-05:00</updated><content type="html"><![CDATA[<p>Over the last week or so, I&rsquo;ve been working on an update to
<a href="https://github.com/nikomatsakis/rayon/">Rayon</a>, my experimental library for <strong>data parallelism</strong> in
Rust. I&rsquo;m pretty happy with the way it&rsquo;s been going, so I wanted to
write a blog post to explain what I&rsquo;ve got so far.</p>
<p><strong>Rayon&rsquo;s goal is to make it easy to add parallelism to your
sequential code</strong> &ndash; so basically to take existing for loops or
iterators and make them run in parallel. For example, if you have an
existing iterator chain like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">total_price</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">stores</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">store</span><span class="o">|</span><span class="w"> </span><span class="n">store</span><span class="p">.</span><span class="n">compute_price</span><span class="p">(</span><span class="o">&amp;</span><span class="n">list</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">.</span><span class="n">sum</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>then you could convert that to run in parallel just by changing from
the standard &ldquo;sequential iterator&rdquo; to Rayon&rsquo;s &ldquo;parallel iterator&rdquo;:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">total_price</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">stores</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">store</span><span class="o">|</span><span class="w"> </span><span class="n">store</span><span class="p">.</span><span class="n">compute_price</span><span class="p">(</span><span class="o">&amp;</span><span class="n">list</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">.</span><span class="n">sum</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>Of course, part of making parallelism easy is making it safe. <strong>Rayon
guarantees you that using Rayon APIs will not introduce data races</strong>.</p>
<p>This blog post explains how Rayon works. It starts by describing the
core Rayon primitive (<code>join</code>) and explains how that is implemented.  I
look in particular at how many of Rust&rsquo;s features come together to let
us implement <code>join</code> with very low runtime overhead and with strong
safety guarantees. I then explain briefly how the parallel iterator
abstraction is built on top of <code>join</code>.</p>
<p>I do want to emphasize, though, that Rayon is very much &ldquo;work in
progress&rdquo;. I expect the design of the parallel iterator code in
particular to see a lot of, well, iteration (no pun intended), since
the current setup is not as flexible as I would like. There are also
various corner cases that are not correctly handled, notably around
panic propagation and cleanup. Still, Rayon is definitely usable today
for certain tasks. I&rsquo;m pretty excited about it, and I hope you will be
too!</p>
<!-- more -->
<h3 id="rayons-core-primitive-join">Rayon&rsquo;s core primitive: join</h3>
<p>In the beginning of this post, I showed an example of using a parallel
iterator to do a map-reduce operation:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">total_price</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">stores</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">store</span><span class="o">|</span><span class="w"> </span><span class="n">store</span><span class="p">.</span><span class="n">compute_price</span><span class="p">(</span><span class="o">&amp;</span><span class="n">list</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">.</span><span class="n">sum</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>In fact, though, parallel iterators are just a small utility library
built atop a more fundamental primitive: <code>join</code>. The usage of <code>join</code>
is very simple. You invoke it with two closures, as shown below, and
it will <em>potentially</em> execute them in parallel. Once they have both
finished, it will return:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// `do_something` and `do_something_else` *may* run in parallel
</span></span></span><span class="line"><span class="cl"><span class="n">join</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">do_something</span><span class="p">(),</span><span class="w"> </span><span class="o">||</span><span class="w"> </span><span class="n">do_something_else</span><span class="p">())</span><span class="w">
</span></span></span></code></pre></div><p>The fact that the two closures <em>potentially</em> run in parallel is key:
<strong>the decision of whether or not to use parallel threads is made
dynamically, based on whether idle cores are available</strong>. The idea is
that you can basically annotate your programs with calls to <code>join</code> to
indicate where parallelism might be a good idea, and let the runtime
decide when to take advantage of that.</p>
<p>This approach of &ldquo;potential parallelism&rdquo; is, in fact, the key point of
difference between Rayon&rsquo;s approach and
<a href="https://github.com/aturon/crossbeam/blob/master/src/scoped.rs">crossbeam&rsquo;s scoped threads</a>. Whereas in crossbeam,
when you put two bits of work onto scoped threads, they will always
execute concurrently with one another, calling <code>join</code> in Rayon does
not necessarily imply that the code will execute in parallel. This not
only makes for a simpler API, it can make for more efficient
execution. This is because it is difficult to predict in advance when
parallelism is profitable, and doing so always requires a certain amount
of global context: for example, does the computer have idle cores?
What other parallel operations are happening right now? <strong>In fact, one
of the main points of this post is to advocate for <em>potential
parallelism</em> as the basis for Rust data parallelism libraries</strong>, in
contrast to the <em>guaranteed concurrency</em> that we have seen thus far.</p>
<p>This is not to say that there is no role for guaranteed concurrency
like what crossbeam offers. &ldquo;Potential parallelism&rdquo; semantics also
imply some limits on what your parallel closures can do. For example,
if I try to use a channel to communicate between the two closures in
<code>join</code>, that will likely deadlock. The right way to think about <code>join</code>
is that it is a parallelization hint for an otherwise sequential
algorithm. Sometimes that&rsquo;s not what you want &ndash; some algorithms are
inherently <em>parallel</em>. (Note though that it is perfectly reasonable to
use types like <code>Mutex</code>, <code>AtomicU32</code>, etc. from within a <code>join</code> call &ndash;
you just don&rsquo;t want one closure to <em>block</em> waiting for the other.)</p>
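<p>Here is a small sketch of the &ldquo;fine&rdquo; case: two closures that update a shared <code>AtomicU32</code> without ever waiting on each other. To keep it runnable with only the standard library, it defines its own stand-in <code>join</code> on top of <code>std::thread::scope</code>. That stand-in is an assumption made for illustration and is <em>not</em> how Rayon works: it always spawns a thread, whereas Rayon only runs the second closure elsewhere if an idle worker steals it. The helper <code>count_multiples_of_three</code> is likewise made up.</p>

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Stand-in for `rayon::join` (assumption: std scoped threads).
/// Unlike Rayon, this always runs `oper_b` on a fresh thread.
fn join<A, B>(oper_a: A, oper_b: B)
where
    A: FnOnce() + Send,
    B: FnOnce() + Send,
{
    thread::scope(|s| {
        s.spawn(oper_b);
        oper_a();
        // The scope waits for the spawned thread before returning.
    });
}

/// Each half is scanned by one closure; both bump the same atomic counter.
/// Neither closure blocks waiting for the other, so this would stay
/// correct even if the two closures ran one after the other.
fn count_multiples_of_three(data: &[u32]) -> u32 {
    let hits = AtomicU32::new(0);
    let (left, right) = data.split_at(data.len() / 2);

    let count = |half: &[u32]| {
        for &x in half {
            if x % 3 == 0 {
                hits.fetch_add(1, Ordering::Relaxed);
            }
        }
    };

    join(|| count(left), || count(right));
    hits.load(Ordering::Relaxed)
}

fn main() {
    let data = [3, 4, 6, 7, 9, 10, 12];
    assert_eq!(count_multiples_of_three(&data), 4);
}
```

<p>The property that matters is that the result does not depend on whether the closures overlap in time. A version in which one closure sends on a channel and the other blocks receiving from it would not have that property.</p>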
<h3 id="example-of-using-join-parallel-quicksort">Example of using join: parallel quicksort</h3>
<p><code>join</code> is a great primitive for &ldquo;divide-and-conquer&rdquo; algorithms. These
algorithms tend to divide up the work into two roughly equal parts and
then recursively process those parts. For example, we can implement a
<a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/test.rs#L6-L28">parallel version of quicksort</a> like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">quick_sort</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">PartialOrd</span><span class="o">+</span><span class="nb">Send</span><span class="o">&gt;</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">mid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">partition</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">lo</span><span class="p">,</span><span class="w"> </span><span class="n">hi</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">split_at_mut</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">rayon</span>::<span class="n">join</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quick_sort</span><span class="p">(</span><span class="n">lo</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="o">||</span><span class="w"> </span><span class="n">quick_sort</span><span class="p">(</span><span class="n">hi</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">partition</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">PartialOrd</span><span class="o">+</span><span class="nb">Send</span><span class="o">&gt;</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="kt">usize</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// see https://en.wikipedia.org/wiki/Quicksort#Lomuto_partition_scheme
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>In fact, the only difference between this version of quicksort and a
sequential one is that we call <code>rayon::join</code> at the end!</p>
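<p>For readers who want something they can compile as-is, here is a self-contained sketch with the partition step filled in using the Lomuto scheme (last element as pivot). Two things in it are assumptions made for illustration: the body of <code>partition</code>, and the local <code>join</code>, which is a sequential stand-in that simply runs one closure after the other so that the example needs no external crate. Swapping that stand-in for <code>rayon::join</code> would be the only change needed to make it parallel.</p>

```rust
/// Sequential stand-in for `rayon::join` (assumption, for illustration):
/// runs the closures back to back, i.e. `a(); b();`.
fn join<A, B>(oper_a: A, oper_b: B)
where
    A: FnOnce() + Send,
    B: FnOnce() + Send,
{
    oper_a();
    oper_b();
}

fn quick_sort<T: PartialOrd + Send>(v: &mut [T]) {
    if v.len() > 1 {
        let mid = partition(v);
        // `lo` and `hi` are disjoint, so each closure gets exclusive access.
        let (lo, hi) = v.split_at_mut(mid);
        join(|| quick_sort(lo), || quick_sort(hi));
    }
}

/// Lomuto partition: uses the last element as the pivot, moves everything
/// `<=` the pivot to the front, and returns the pivot's final index.
/// Requires a non-empty slice (guaranteed by the caller above).
fn partition<T: PartialOrd + Send>(v: &mut [T]) -> usize {
    let pivot = v.len() - 1;
    let mut i = 0;
    for j in 0..pivot {
        if v[j] <= v[pivot] {
            v.swap(i, j);
            i += 1;
        }
    }
    v.swap(i, pivot);
    i
}

fn main() {
    let mut data = vec![5, 2, 9, 1, 5, 6, 0];
    quick_sort(&mut data);
    assert_eq!(data, [0, 1, 2, 5, 5, 6, 9]);
}
```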
<h3 id="how-join-is-implemented-work-stealing">How join is implemented: work-stealing</h3>
<p>Behind the scenes, <code>join</code> is implemented using a technique called
<strong>work-stealing</strong>. As far as I know, work stealing was first
introduced as part of the Cilk project, and it has since become a
fairly standard technique (in fact, the name Rayon is an homage to
Cilk).</p>
<p>The basic idea is that, on each call to <code>join(a, b)</code>, we have
identified two tasks <code>a</code> and <code>b</code> that could safely run in parallel,
but we don&rsquo;t know yet whether there are idle threads. All that the
current thread does is to add <code>b</code> into a local queue of &ldquo;pending work&rdquo;
and then go and immediately start executing <code>a</code>. Meanwhile, there is a
pool of other active threads (typically one per CPU, or something like
that). Whenever one of these threads is idle, it goes off to scour the &ldquo;pending
work&rdquo; queues of other threads: if it finds an item there, then it
will steal it and execute it itself. So, in this case, while the
first thread is busy executing <code>a</code>, another thread might come along
and start executing <code>b</code>.</p>
<p>Once the first thread finishes with <code>a</code>, it then checks: did somebody
else start executing <code>b</code> already? If not, we can execute it
ourselves. If so, we should wait for them to finish: but while we
wait, we can go off and steal from other processors, and thus try to
help drive the overall process towards completion.</p>
<p>In Rust-y pseudocode, <code>join</code> thus looks something like this (the
<a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/api.rs#L23-L64">actual code</a> works somewhat differently; for example, it allows
for each operation to have a result):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">join</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span><span class="n">B</span><span class="o">&gt;</span><span class="p">(</span><span class="n">oper_a</span>: <span class="nc">A</span><span class="p">,</span><span class="w"> </span><span class="n">oper_b</span>: <span class="nc">B</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">where</span><span class="w"> </span><span class="n">A</span>: <span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="n">B</span>: <span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Advertise `oper_b` to other threads as something
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// they might steal:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">job</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">push_onto_local_queue</span><span class="p">(</span><span class="n">oper_b</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Execute `oper_a` ourselves:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">oper_a</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Check whether anybody stole `oper_b`:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">pop_from_local_queue</span><span class="p">(</span><span class="n">oper_b</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Not stolen, do it ourselves.
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">oper_b</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Stolen, wait for them to finish. In the
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// meantime, try to steal from others:
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">while</span><span class="w"> </span><span class="n">not_yet_complete</span><span class="p">(</span><span class="n">job</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">steal_from_others</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">result_b</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">job</span><span class="p">.</span><span class="n">result</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What makes work stealing so elegant is that it adapts naturally to the
CPU&rsquo;s load. That is, if all the workers are busy, then <code>join(a, b)</code>
basically devolves into executing each closure sequentially (i.e.,
<code>a(); b();</code>). This is no worse than the sequential code. But if there
<em>are</em> idle threads available, then we get parallelism.</p>
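<p>The pseudocode above leans on helpers that are not shown. The sketch below is a deliberately tiny, runnable approximation of the same control flow, and it differs from Rayon in several ways that are assumptions of this example: the &ldquo;local queue&rdquo; is a single slot behind a <code>Mutex</code>, the only potential thief is one helper thread spawned per call with <code>std::thread::scope</code> (a real pool is long-lived), waiting just yields instead of stealing other work, and panics are ignored. The name <code>join_sketch</code> is made up.</p>

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;
use std::thread;

/// Toy version of the steal-or-run-it-yourself protocol (not Rayon's code).
fn join_sketch<A, B>(oper_a: A, oper_b: B)
where
    A: FnOnce() + Send,
    B: FnOnce() + Send,
{
    // Advertise `oper_b`: a one-entry "queue" that a thief may empty.
    let slot = Mutex::new(Some(oper_b));
    // Set by the thief once a stolen job has finished.
    let done = AtomicBool::new(false);

    thread::scope(|s| {
        // The lone potential thief: takes the job if it is still there.
        s.spawn(|| {
            let stolen = slot.lock().unwrap().take();
            if let Some(job) = stolen {
                job();
                done.store(true, Ordering::Release);
            }
        });

        // Execute `oper_a` ourselves.
        oper_a();

        // Check whether anybody stole `oper_b`.
        let reclaimed = slot.lock().unwrap().take();
        match reclaimed {
            // Not stolen, do it ourselves.
            Some(job) => job(),
            // Stolen: wait for the thief to finish. A real implementation
            // would try to steal other work here instead of yielding.
            None => {
                while !done.load(Ordering::Acquire) {
                    thread::yield_now();
                }
            }
        }
    });
}

fn main() {
    let mut left = 0u64;
    let mut right = 0u64;
    join_sketch(|| left = (1..=1000).sum(), || right = (1001..=2000).sum());
    assert_eq!(left + right, 2_001_000);
}
```

<p>Whichever thread wins the race for the slot, <code>oper_b</code> runs exactly once, and <code>join_sketch</code> does not return until both closures are done.</p>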
<h3 id="performance-measurements">Performance measurements</h3>
<p>Rayon is still fairly young, and I don&rsquo;t have a lot of sample programs
to test (nor have I spent a lot of time tuning it). Nonetheless, you
can get pretty decent speedups even today, though it does take a <em>bit</em>
more tuning than I would like. For example, with a
<a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/demo/quicksort/src/main.rs#L47-L60">tweaked version of quicksort</a>, I see the following
<a href="https://en.wikipedia.org/wiki/Speedup#Speedup_in_latency">parallel speedups</a> on my 4-core MacBook Pro (hence, 4x is
basically the best you could expect):</p>
<table class="ndm">
<tr> <th>Array Length</th> <th>Speedup</th> </tr>
<tr> <td> 1K         </td> <td>0.95x   </td> </tr>
<tr> <td> 32K        </td> <td>2.19x   </td> </tr>
<tr> <td> 64K        </td> <td>3.09x   </td> </tr>
<tr> <td> 128K       </td> <td>3.52x   </td> </tr>
<tr> <td> 512K       </td> <td>3.84x   </td> </tr>
<tr> <td> 1024K      </td> <td>4.01x   </td> </tr>
</table>
<p></p>
<p>The change that I made from the original version is to introduce
<em>sequential fallback</em>. Basically, we just check if we have a small
array (in my code, fewer than 5K elements). If so, we fall back to a
sequential version of the code that never calls <code>join</code>. This can
actually be done without any code duplication using traits, as you can
see from <a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/demo/quicksort/src/main.rs#L47-L60">the demo code</a>. (If you&rsquo;re curious, I explain the idea
in an appendix below.)</p>
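<p>One possible shape for the trait-based approach is sketched below. Everything here is an assumption made for illustration rather than a copy of the demo code: the trait name <code>Joiner</code>, the <code>Parallel</code> and <code>Sequential</code> marker types, the threshold constant, and the use of a recursive sum instead of quicksort (chosen to keep the example short). The parallel variant also uses <code>std::thread::scope</code> as a stand-in for <code>rayon::join</code>, so it always spawns a thread.</p>

```rust
use std::thread;

/// Abstracts over "how to run two closures" (hypothetical trait).
trait Joiner {
    fn is_parallel() -> bool;
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send;
}

struct Parallel;
struct Sequential;

impl Joiner for Parallel {
    fn is_parallel() -> bool {
        true
    }
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        // Stand-in for `rayon::join`: always spawns a scoped thread.
        thread::scope(|s| {
            let handle = s.spawn(oper_b);
            let ra = oper_a();
            (ra, handle.join().unwrap())
        })
    }
}

impl Joiner for Sequential {
    fn is_parallel() -> bool {
        false
    }
    fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        let ra = oper_a();
        let rb = oper_b();
        (ra, rb)
    }
}

/// Below this many elements, stop creating parallel tasks.
const SEQUENTIAL_THRESHOLD: usize = 5 * 1024;

/// One body serves both modes; only the `Joiner` type parameter changes.
fn sum<J: Joiner>(v: &[u64]) -> u64 {
    if J::is_parallel() && v.len() <= SEQUENTIAL_THRESHOLD {
        return sum::<Sequential>(v);
    }
    match v.len() {
        0 => 0,
        1 => v[0],
        _ => {
            let (lo, hi) = v.split_at(v.len() / 2);
            let (a, b) = J::join(|| sum::<J>(lo), || sum::<J>(hi));
            a + b
        }
    }
}

fn main() {
    let data: Vec<u64> = (1..=20_000).collect();
    assert_eq!(sum::<Parallel>(&data), 200_010_000);
}
```

<p>Because the mode is a type parameter, the compiler generates a separate copy of <code>sum</code> for each mode, and the sequential copy contains no task-pushing code at all.</p>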
<p>Hopefully, further optimizations will mean that sequential fallback is
less necessary &ndash; but it&rsquo;s worth pointing out that higher-level APIs
like the parallel iterator I alluded to earlier can also handle the
sequential fallback for you, so that you don&rsquo;t have to actively think
about it.</p>
<p>In any case, if you <strong>don&rsquo;t</strong> do sequential fallback, then the results
you see are not as good, though they could be a lot worse:</p>
<table class="ndm">
<tr> <th>Array Length</th> <th>Speedup</th> </tr>
<tr> <td> 1K         </td> <td>0.41x   </td> </tr>
<tr> <td> 32K        </td> <td>2.05x   </td> </tr>
<tr> <td> 64K        </td> <td>2.42x   </td> </tr>
<tr> <td> 128K       </td> <td>2.75x   </td> </tr>
<tr> <td> 512K       </td> <td>3.02x   </td> </tr>
<tr> <td> 1024K      </td> <td>3.10x   </td> </tr>
</table>
<p></p>
<p>In particular, keep in mind that this version of the code is <strong>pushing
a parallel task for all subarrays down to length 1</strong>. If the array is
512K or 1024K, that&rsquo;s a lot of subarrays and hence a lot of task
pushing, but we still get speedups of 3.02x and 3.10x, respectively. I think the reason that
the code does as well as it does is that it gets the &ldquo;big things&rdquo;
right &ndash; that is, Rayon avoids memory allocation and virtual dispatch,
as described in the next section. Still, I would like to do better
than 0.41x for a 1K array (and I think we can).</p>
<h3 id="taking-advantage-of-rust-features-to-minimize-overhead">Taking advantage of Rust features to minimize overhead</h3>
<p>As you can see above, to make this scheme work, you really want to
drive down the overhead of pushing a task onto the local queue. After
all, the expectation is that most tasks will <em>never</em> be stolen,
because there are far fewer processors than there are tasks. Rayon&rsquo;s
API is designed to leverage several Rust features and drive this
overhead down:</p>
<ul>
<li><code>join</code> is defined generically with respect to the closure types of
its arguments. This means that monomorphization will generate a
distinct copy of <code>join</code> <strong>specialized to each callsite</strong>. This in turn
means that when <code>join</code> invokes <code>oper_a()</code> and <code>oper_b()</code> (as opposed
to the relatively rare case where they are stolen), those calls are
statically dispatched, which means that they can be inlined.
It also means that creating a closure requires no allocation.</li>
<li>Because <code>join</code> blocks until both of its closures are finished, we
are able to make <strong>full use of stack allocation</strong>. This is good both
for users of the API and for the implementation: for example, the
quicksort example above relied on being able to access an <code>&amp;mut [T]</code>
slice that was provided as input, which only works because <code>join</code>
blocks. Similarly, the implementation of <code>join</code> itself is able to
<strong>completely avoid heap allocation</strong> and instead rely solely on the
stack (e.g., the closure objects that we place into our local work
queue are allocated on the stack).</li>
</ul>
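<p>To make the first two points concrete, here is a minimal sketch of a <code>join</code>-shaped function. It is <em>not</em> Rayon&rsquo;s implementation: as an assumption for illustration, it uses <code>std::thread::scope</code> from the standard library, which always spawns an OS thread (and allocates) instead of pushing onto a work-stealing queue. What it does show is the generic signature, which gives each call site its own monomorphized copy, and the blocking behavior that lets the closures borrow from the caller&rsquo;s stack. The helper <code>sum_halves</code> is a hypothetical name used only for this example.</p>

```rust
use std::thread;

/// Hypothetical stand-in for `rayon::join`, built on `std::thread::scope`.
/// It is generic over the closure types `A` and `B`, so every call site gets
/// its own monomorphized copy and the closures are statically dispatched.
fn join<A, B, RA, RB>(oper_a: A, oper_b: B) -> (RA, RB)
where
    A: FnOnce() -> RA + Send,
    B: FnOnce() -> RB + Send,
    RA: Send,
    RB: Send,
{
    thread::scope(|s| {
        // Unlike Rayon, this always hands `oper_b` to a fresh OS thread.
        let handle_b = s.spawn(oper_b);
        let result_a = oper_a();
        // Blocking here is what makes borrowing from the caller's stack safe.
        let result_b = handle_b.join().expect("oper_b panicked");
        (result_a, result_b)
    })
}

/// Sums a slice by splitting it once and summing the two halves in parallel.
fn sum_halves(values: &[u64]) -> u64 {
    let (lo, hi) = values.split_at(values.len() / 2);
    let (a, b) = join(|| lo.iter().sum::<u64>(), || hi.iter().sum::<u64>());
    a + b
}
```

<p>Because <code>join</code> does not return until both closures finish, <code>lo</code> and <code>hi</code> can be plain borrowed slices; no <code>Arc</code> or <code>'static</code> data is needed.</p>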
<p>As you saw above, the overhead for pushing a task is reasonably low,
though not nearly as low as I would like. There are various ways to
reduce it further:</p>
<ul>
<li>Many work-stealing implementations use heuristics to try to decide
when to skip the work of pushing parallel tasks. For example, the
<a href="http://dl.acm.org/citation.cfm?id=2629643">Lazy Scheduling</a> work by Tzannes et al. tries to avoid pushing a
task at all unless there are idle worker threads (which they call
&ldquo;hungry&rdquo; threads) that might steal it.</li>
<li>And of course good ol&rsquo; fashioned optimization would help. I&rsquo;ve never
even <em>looked</em> at the generated LLVM bitcode or assembly for <code>join</code>,
for example, and it seems likely that there is low-hanging fruit
there.</li>
</ul>
<h3 id="data-race-freedom">Data-race freedom</h3>
<p>Earlier I mentioned that Rayon also guarantees data-race freedom.
This means that you can add parallelism to previously sequential code
without worrying about introducing weird, hard-to-reproduce bugs.</p>
<p>There are two kinds of mistakes we have to be concerned about.  First,
the two closures might share some mutable state, so that changes made
by one would affect the other. For example, if I modify the above
example so that it (incorrectly) calls <code>quick_sort</code> on <code>lo</code> in both
closures, then I would hope that this will not compile:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">quick_sort</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">PartialOrd</span><span class="o">+</span><span class="nb">Send</span><span class="o">&gt;</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">mid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">partition</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">lo</span><span class="p">,</span><span class="w"> </span><span class="n">hi</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">split_at_mut</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">rayon</span>::<span class="n">join</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quick_sort</span><span class="p">(</span><span class="n">lo</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                    </span><span class="o">||</span><span class="w"> </span><span class="n">quick_sort</span><span class="p">(</span><span class="n">lo</span><span class="p">));</span><span class="w"> </span><span class="c1">// &lt;-- oops
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And indeed I will see the following error:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">test.rs:14:10: 14:27 error: closure requires unique access to `lo` but it is already borrowed [E0500]
</span></span><span class="line"><span class="cl">test.rs:14          || quick_sort(lo));
</span></span><span class="line"><span class="cl">                    ^~~~~~~~~~~~~~~~~
</span></span></code></pre></div><p>Similar errors arise if I try to have one closure process <code>lo</code> (or
<code>hi</code>) and the other process <code>v</code>, which overlaps with both of them.</p>
<p><em>Side note:</em> This example may seem artificial, but in fact this is an
actual bug that I made (or rather, would have made) while implementing
the parallel iterator abstraction I describe later. It&rsquo;s very easy to
make these sorts of copy-and-paste errors, and it&rsquo;s very nice that
Rust makes this kind of error a non-event, rather than a crashing bug.</p>
<p>Another kind of bug one might have is to use a non-threadsafe type
from within one of the <code>join</code> closures. For example, Rust offers a
<a href="http://doc.rust-lang.org/std/rc/struct.Rc.html">non-atomic reference-counted type</a> called <code>Rc</code>. Because <code>Rc</code> uses
non-atomic instructions to update the reference counter, it is not
safe to share an <code>Rc</code> between threads. If one were to do so, as I show
in the following example, the ref count could easily become incorrect,
which would lead to double frees or worse:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">share_rc</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nb">PartialOrd</span><span class="o">+</span><span class="nb">Send</span><span class="o">&gt;</span><span class="p">(</span><span class="n">rc</span>: <span class="nc">Rc</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// In the closures below, the calls to `clone` increment the
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// reference count. These calls MIGHT execute in parallel.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Would not be good!
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">rayon</span>::<span class="n">join</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">something</span><span class="p">(</span><span class="n">rc</span><span class="p">.</span><span class="n">clone</span><span class="p">()),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="o">||</span><span class="w"> </span><span class="n">something</span><span class="p">(</span><span class="n">rc</span><span class="p">.</span><span class="n">clone</span><span class="p">()));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But of course if I try that example, I get a compilation error:</p>
<pre tabindex="0"><code>test.rs:14:5: 14:9 error: the trait `core::marker::Sync` is not implemented
                          for the type `alloc::rc::Rc&lt;i32&gt;` [E0277]
test.rs:14     rayon::join(|| something(rc.clone()),
               ^~~~~~~~~~~
test.rs:14:5: 14:9 help: run `rustc --explain E0277` to see a detailed explanation
test.rs:14:5: 14:9 note: `alloc::rc::Rc&lt;i32&gt;` cannot be shared between threads safely
</code></pre><p>As you can see in the final &ldquo;note&rdquo;, the compiler is telling us that
you cannot share <code>Rc</code> values across threads.</p>
<p>So you might wonder what kind of deep wizardry is required for the
<code>join</code> function to enforce both of these invariants? In fact, the
answer is surprisingly simple. The first error, which I got when I
shared the same <code>&amp;mut</code> slice across two closures, falls out from
Rust&rsquo;s basic type system: you cannot have two closures that are both
in scope at the same time and both access the same <code>&amp;mut</code> slice. This
is because <code>&amp;mut</code> data is supposed to be <em>uniquely</em> accessed, and
hence if you had two closures, they would both have access to the same
&ldquo;unique&rdquo; data. Which of course makes it not so unique.</p>
<p>(In fact, this was one of the <a href="http://smallcultfollowing.com/babysteps/blog/2013/06/11/on-the-connection-between-memory-management-and-data-race-freedom/">great epiphanies for me</a> in
working on Rust&rsquo;s type system. Previously I thought that &ldquo;dangling
pointers&rdquo; in sequential programs and &ldquo;data races&rdquo; were sort of
distinct bugs: but now I see them as two heads of the same Hydra.
Basically both are caused by having rampant aliasing and mutation, and
both can be solved by ownership and borrowing. Nifty, no?)</p>
<p>So what about the second error, the one I got for sending an <code>Rc</code>
across threads? This occurs because the <code>join</code> function declares that
its two closures must be <code>Send</code>. <code>Send</code> is the Rust name for a trait
that indicates whether data can be safely transferred across
threads. So when <code>join</code> declares that its two closures must be <code>Send</code>,
it is saying &ldquo;it must be safe for the data those closures can reach to
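be transferred to another thread and back again&rdquo;. (The error message names <code>Sync</code> rather than <code>Send</code> because the closures capture <code>rc</code> by reference, and a shared reference <code>&amp;T</code> is <code>Send</code> only if <code>T</code> is <code>Sync</code>.)</p>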
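<p>As a rough sketch of the fix, assuming the value really does need to be reachable from both closures, one can switch to <code>Arc</code>, the atomically reference-counted counterpart of <code>Rc</code>. The example below uses <code>std::thread::scope</code> as a stand-in for <code>rayon::join</code>, and <code>assert_send</code> and <code>share_arc</code> are hypothetical names:</p>

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: a call to this compiles only if `T` is `Send`.
fn assert_send<T: Send>(_value: &T) {}

/// Clones an `Arc` from two threads that may run in parallel.
fn share_arc(counter: Arc<i32>) -> (i32, i32) {
    // Both closures borrow `counter`, so `&Arc<i32>` has to be `Send`.
    // That holds because `Arc<i32>` is `Sync`; `Rc<i32>` is not.
    thread::scope(|s| {
        let a = s.spawn(|| *counter.clone());
        let b = s.spawn(|| *counter.clone() + 1);
        (a.join().unwrap(), b.join().unwrap())
    })
}

// For contrast, this line would be rejected by the compiler:
// assert_send(&std::rc::Rc::new(1));
```

<p>The atomic increments in <code>Arc::clone</code> cost a little more than the plain ones in <code>Rc::clone</code>, which is why Rust keeps both types and lets the compiler check which one is being used where.</p>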
<h3 id="parallel-iterators">Parallel iterators</h3>
<p>At the start of this post, I gave an example of using a parallel
iterator:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">total_price</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">stores</span><span class="p">.</span><span class="n">par_iter</span><span class="p">()</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">store</span><span class="o">|</span><span class="w"> </span><span class="n">store</span><span class="p">.</span><span class="n">compute_price</span><span class="p">(</span><span class="o">&amp;</span><span class="n">list</span><span class="p">))</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                        </span><span class="p">.</span><span class="n">sum</span><span class="p">();</span><span class="w">
</span></span></span></code></pre></div><p>But since then, I&rsquo;ve just focused on <code>join</code>. As I mentioned earlier,
the parallel iterator API is really just a
<a href="https://github.com/nikomatsakis/rayon/tree/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter">pretty simple wrapper</a> around <code>join</code>. At the moment, it&rsquo;s
more of a proof of concept than anything else. But what&rsquo;s really nifty
about it is that it does not require <em>any</em> unsafe code related to
parallelism &ndash; that is, it just builds on <code>join</code>, which encapsulates
all of the unsafety. (To be clear, there <em>is</em> a small amount of unsafe
code related to <a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter/collect.rs#L27-L47">managing uninitialized memory</a> when
collecting into a vector. But this has nothing to do with
<em>parallelism</em>; you&rsquo;ll find similar code in <code>Vec</code>. This code is also
wrong in some edge cases because I&rsquo;ve not had time to do it properly.)</p>
<p>I don&rsquo;t want to go too far into the details of the existing parallel
iterator code because I expect it to change. But the high-level idea
is that we have this trait <code>ParallelIterator</code> which
<a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter/mod.rs#L30-L35">has the following core members</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">ParallelIterator</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Shared</span>: <span class="nb">Sync</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">State</span>: <span class="nc">ParallelIteratorState</span><span class="o">&lt;</span><span class="n">Shared</span><span class="o">=</span><span class="bp">Self</span>::<span class="n">Shared</span><span class="p">,</span><span class="w"> </span><span class="n">Item</span><span class="o">=</span><span class="bp">Self</span>::<span class="n">Item</span><span class="o">&gt;</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">state</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span>::<span class="n">Shared</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span>::<span class="n">State</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// some uninteresting helper methods, like `map` etc
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The idea is that the method <code>state</code> divides up the iterator into some
shared state and some &ldquo;per-thread&rdquo; state. The shared state will
(potentially) be accessible by all worker threads, so it must be
<code>Sync</code> (sharable across threads). The per-thread state will be split
for each call to <code>join</code>, so it only has to be <code>Send</code> (transferable to
a single other thread).</p>
<p>The <a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter/mod.rs#L80-L90"><code>ParallelIteratorState</code> trait</a> represents some
chunk of the remaining work (e.g., a subslice to be processed). It has
three methods:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">trait</span><span class="w"> </span><span class="n">ParallelIteratorState</span>: <span class="nb">Sized</span> <span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Shared</span>: <span class="nb">Sync</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">len</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">ParallelLen</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">split_at</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">index</span>: <span class="kt">usize</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="bp">Self</span><span class="p">,</span><span class="w"> </span><span class="bp">Self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">for_each</span><span class="o">&lt;</span><span class="no">OP</span><span class="o">&gt;</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">shared</span>: <span class="kp">&amp;</span><span class="nc">Self</span>::<span class="n">Shared</span><span class="p">,</span><span class="w"> </span><span class="n">op</span>: <span class="nc">OP</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="no">OP</span>: <span class="nb">FnMut</span><span class="p">(</span><span class="bp">Self</span>::<span class="n">Item</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>len</code> method gives an idea of how much work remains. The
<code>split_at</code> method divides this state into two other pieces.  The
<code>for_each</code> method produces all the values in this chunk of the
iterator. So, for example, the parallel iterator for a slice <code>&amp;[T]</code>
would:</p>
<ul>
<li>implement <a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter/slice.rs#L30-L36"><code>len</code></a> by just returning the length of the slice,</li>
<li>implement <a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter/slice.rs#L38-L41"><code>split_at</code></a> by splitting the slice into two subslices,</li>
<li>and implement <a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter/slice.rs#L43-L49"><code>for_each</code></a> by iterating over the slice and
invoking <code>op</code> on each element.</li>
</ul>
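<p>A simplified sketch of what such a slice implementation might look like follows. This is not the code behind the links above: the trait <code>ChunkState</code> and the struct <code>SliceState</code> are hypothetical, the <code>Shared</code> associated type is dropped, and <code>len</code> returns a plain <code>usize</code> in place of <code>ParallelLen</code>.</p>

```rust
/// Simplified, hypothetical version of the per-thread state trait:
/// no `Shared` associated type and a plain `usize` length.
trait ChunkState: Sized {
    type Item;

    /// How much work remains in this chunk.
    fn len(&self) -> usize;

    /// Divides this chunk into two smaller chunks at `index`.
    fn split_at(self, index: usize) -> (Self, Self);

    /// Produces every value in this chunk, in order.
    fn for_each<OP: FnMut(Self::Item)>(self, op: OP);
}

/// Chunk of work backed by a borrowed slice.
struct SliceState<'a, T> {
    slice: &'a [T],
}

impl<'a, T> ChunkState for SliceState<'a, T> {
    type Item = &'a T;

    fn len(&self) -> usize {
        self.slice.len()
    }

    fn split_at(self, index: usize) -> (Self, Self) {
        let (left, right) = self.slice.split_at(index);
        (SliceState { slice: left }, SliceState { slice: right })
    }

    fn for_each<OP: FnMut(Self::Item)>(self, mut op: OP) {
        for item in self.slice {
            op(item);
        }
    }
}
```

<p>Note that <code>split_at</code> and <code>for_each</code> take <code>self</code> by value, so a chunk that has been split or consumed cannot accidentally be processed a second time.</p>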
<p>Given these two traits, we can implement a parallel operation like
collection by following the same basic template. We check how much
work there is: if it&rsquo;s too much, we split into two pieces. Otherwise,
we process sequentially (note that this automatically incorporates the
sequential fallback we saw before):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process</span><span class="p">(</span><span class="n">shared</span><span class="p">,</span><span class="w"> </span><span class="n">state</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">if</span><span class="w"> </span><span class="n">state</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="n">is</span><span class="w"> </span><span class="n">too</span><span class="w"> </span><span class="n">big</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// parallel split
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">midpoint</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">state</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="mi">2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">state1</span><span class="p">,</span><span class="w"> </span><span class="n">state2</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">state</span><span class="p">.</span><span class="n">split_at</span><span class="p">(</span><span class="n">midpoint</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">rayon</span>::<span class="n">join</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">process</span><span class="p">(</span><span class="n">shared</span><span class="p">,</span><span class="w"> </span><span class="n">state1</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="o">||</span><span class="w"> </span><span class="n">process</span><span class="p">(</span><span class="n">shared</span><span class="p">,</span><span class="w"> </span><span class="n">state2</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// sequential base case
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">state</span><span class="p">.</span><span class="n">for_each</span><span class="p">(</span><span class="n">shared</span><span class="p">,</span><span class="w"> </span><span class="o">|</span><span class="n">item</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// process item
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">})</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Click these links, for example, to see the code to
<a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter/collect.rs#L27-L47">collect into a vector</a> or to
<a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/src/par_iter/reduce.rs#L20-L42">reduce a stream of values into one</a>.</p>
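<p>For a version of this template that actually compiles, here is a small sketch under some loud assumptions: the per-thread state is just a <code>&amp;[u64]</code> slice, the shared state is an <code>AtomicU64</code> accumulator, <code>std::thread::scope</code> stands in for <code>rayon::join</code>, and the cutoff <code>SEQUENTIAL_CUTOFF</code> is an arbitrary, hypothetical value.</p>

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical cutoff: chunks of at most this many items run sequentially.
const SEQUENTIAL_CUTOFF: usize = 4;

/// Adds every element of `state` into `shared`, splitting large chunks
/// in two and handling the halves in parallel.
fn process(shared: &AtomicU64, state: &[u64]) {
    if state.len() > SEQUENTIAL_CUTOFF {
        // parallel split
        let midpoint = state.len() / 2;
        let (state1, state2) = state.split_at(midpoint);
        // Stand-in for `rayon::join`: the scope waits for the spawned
        // thread before returning, so the borrows stay valid.
        thread::scope(|s| {
            s.spawn(|| process(shared, state1));
            process(shared, state2);
        });
    } else {
        // sequential base case
        for item in state {
            shared.fetch_add(*item, Ordering::Relaxed);
        }
    }
}
```

<p>Note how the bounds from the trait show up here: the accumulator is shared by every thread and so has to be <code>Sync</code>, while each subslice is handed to exactly one thread and only has to be <code>Send</code>.</p>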
<h3 id="conclusions-and-a-historical-note">Conclusions and a historical note</h3>
<p>I&rsquo;m pretty excited about this latest iteration of Rayon. It&rsquo;s dead
simple to use, very expressive, and I think it has a lot of potential
to be very efficient.</p>
<p>It&rsquo;s also very gratifying to see how elegant data parallelism in Rust
has become. This is the result of a long evolution and a lot of
iteration. In Rust&rsquo;s early days, for example, it took a strict,
Erlang-like approach, where you just had parallel tasks communicating
over channels, with no shared memory. This is good for the high levels
of your application, but not so good for writing a parallel
quicksort. Gradually though, as we refined the type system, we got
closer and closer to a smooth version of parallel quicksort.</p>
<p>If you look at some of my <a href="http://smallcultfollowing.com/babysteps/blog/2013/06/11/data-parallelism-in-rust/">earlier</a> <a href="http://smallcultfollowing.com/babysteps/blog/2012/06/11/hotpar/">designs</a>,
it should be clear that the current iteration of <code>Rayon</code> is by far the
smoothest yet. What I particularly like is that it is simple for
users, but also simple for <em>implementors</em> &ndash; that is, it doesn&rsquo;t
require any crazy Rust type system tricks or funky traits to achieve
safety here.  I think this is largely due to two key developments:</p>
<ul>
<li><a href="http://smallcultfollowing.com/babysteps/blog/2012/11/18/imagine-never-hearing-the-phrase-aliasable/">&ldquo;INHTWAMA&rdquo;</a>, which was the decision to make <code>&amp;mut</code> references
be non-aliasable and to remove <code>const</code> (read-only, but not
immutable) references. This basically meant that Rust authors were
now writing data-race-free code <em>by default</em>.</li>
<li><a href="https://github.com/rust-lang/rfcs/blob/master/text/0458-send-improvements.md">Improved Send traits</a>, or RFC 458, which modified the <code>Send</code>
trait to permit borrowed references. Prior to this RFC, which was
authored by <a href="https://github.com/pythonesque">Joshua Yanovski</a>, we had the constraint that for
data to be <code>Send</code>, it had to be <code>'static</code> &ndash; meaning it could not
have any references into the stack. This was a holdover from the
Erlang-like days, when all threads were independent, asynchronous
workers, but none of us saw it. This led to some awful contortions
in my early designs to try to find alternate traits to express the
idea of data that was threadsafe but also contained stack
references. Thankfully Joshua had the insight that simply removing
the <code>'static</code> bound would make this all much smoother!</li>
</ul>
<h3 id="appendix-implementing-sequential-fallback-without-code-duplication">Appendix: Implementing sequential fallback without code duplication</h3>
<p>Earlier, I mentioned that for peak performance in the quicksort demo,
you want to fall back to sequential code if the array size is too
small. It would be a drag to have to maintain two copies of the quicksort
routine. Fortunately, we can use Rust traits to generate those
two copies automatically from a single source. This appendix explains
the <a href="https://github.com/nikomatsakis/rayon/blob/22f04aee0e12b31e029ec669299802d6e2f86bf6/demo/quicksort/src/main.rs#L47-L60">trick that I used in the demo code</a>.</p>
<p>First, you define a trait <code>Joiner</code> that abstracts over the <code>join</code>
function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Joiner</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="sd">/// True if this is parallel mode, false otherwise.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">is_parallel</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="kt">bool</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="sd">/// Either calls `rayon::join` or just invokes `oper_a(); oper_b();`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">join</span><span class="o">&lt;</span><span class="n">A</span><span class="p">,</span><span class="no">R_A</span><span class="p">,</span><span class="n">B</span><span class="p">,</span><span class="no">R_B</span><span class="o">&gt;</span><span class="p">(</span><span class="n">oper_a</span>: <span class="nc">A</span><span class="p">,</span><span class="w"> </span><span class="n">oper_b</span>: <span class="nc">B</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="p">(</span><span class="no">R_A</span><span class="p">,</span><span class="w"> </span><span class="no">R_B</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">where</span><span class="w"> </span><span class="n">A</span>: <span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">R_A</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="p">,</span><span class="w"> </span><span class="n">B</span>: <span class="nb">FnOnce</span><span class="p">()</span><span class="w"> </span>-&gt; <span class="nc">R_B</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nb">Send</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This <code>Joiner</code> trait has two implementations, corresponding to
sequential and parallel mode:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Parallel</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Joiner</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Parallel</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Sequential</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Joiner</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Sequential</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we can rewrite <code>quick_sort</code> to be generic over a type <code>J: Joiner</code>,
indicating whether this is the parallel or sequential implementation.
The parallel version will, for small arrays, convert over to
sequential mode:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">quick_sort</span><span class="o">&lt;</span><span class="n">J</span>:<span class="nc">Joiner</span><span class="p">,</span><span class="w"> </span><span class="n">T</span>:<span class="nb">PartialOrd</span><span class="o">+</span><span class="nb">Send</span><span class="o">&gt;</span><span class="p">(</span><span class="n">v</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">if</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Fall back to sequential mode for arrays of at most 5K elements:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">if</span><span class="w"> </span><span class="n">J</span>::<span class="n">is_parallel</span><span class="p">()</span><span class="w"> </span><span class="o">&amp;&amp;</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">&lt;=</span><span class="w"> </span><span class="mi">5</span><span class="o">*</span><span class="mi">1024</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="k">return</span><span class="w"> </span><span class="n">quick_sort</span>::<span class="o">&lt;</span><span class="n">Sequential</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                                
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">mid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">partition</span><span class="p">(</span><span class="n">v</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">lo</span><span class="p">,</span><span class="w"> </span><span class="n">hi</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">v</span><span class="p">.</span><span class="n">split_at_mut</span><span class="p">(</span><span class="n">mid</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">J</span>::<span class="n">join</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quick_sort</span>::<span class="o">&lt;</span><span class="n">J</span><span class="p">,</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">lo</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="o">||</span><span class="w"> </span><span class="n">quick_sort</span>::<span class="o">&lt;</span><span class="n">J</span><span class="p">,</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">hi</span><span class="p">));</span><span class="w">
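</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">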
</span></span></span></code></pre></div>]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Virtual Structs Part 4: Extended Enums And Thin Traits</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/10/08/virtual-structs-part-4-extended-enums-and-thin-traits/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/10/08/virtual-structs-part-4-extended-enums-and-thin-traits/</id><published>2015-10-08T00:00:00+00:00</published><updated>2015-10-08T12:46:29-04:00</updated><content type="html"><![CDATA[<p>So, aturon wrote this <a href="http://aturon.github.io/blog/2015/09/18/reuse/">interesting post</a> on an alternative
&ldquo;virtual structs&rdquo; approach, and, more-or-less since he wrote it, I&rsquo;ve
been wanting to write up my thoughts. I finally got them down.</p>
<p>Before I go any further, a note on terminology. I will refer to
Aaron&rsquo;s proposal as <a href="http://aturon.github.io/blog/2015/09/18/reuse/">the Thin Traits proposal</a>, and my own
previous proposal as <a href="http://smallcultfollowing.com/babysteps/blog/2015/08/20/virtual-structs-part-3-bringing-enums-and-structs-together/">the Extended Enums proposal</a>. Very good.</p>
<p>(OK, I lied, one more note: starting with this post, I&rsquo;ve decided to
disable comments on this blog. There are just too many forums to keep
up with! So if you want to discuss this post, I&rsquo;d recommend doing so
on <a href="https://internals.rust-lang.org/t/blog-post-extended-enums-and-thin-traits/2755">this Rust internals thread</a>.)</p>
<h3 id="conclusion">Conclusion</h3>
<p>Let me lead with my conclusion: <strong>while I still want the Extended
Enums proposal, I <em>lean</em> towards implementing the Thin Traits
proposal now, and returning to something like Extended Enums
afterwards (or at some later time)</strong>. My reasoning is that the Thin
Traits proposal can be seen as a design pattern lying latent in the
Extended Enums proposal. Basically, once we implement
<a href="https://github.com/rust-lang/rfcs/pull/1210">specialization</a>, which I want for a wide variety of reasons, we
<em>almost</em> get Thin Traits for free. And the Thin Traits pattern is
useful enough that it&rsquo;s worth taking that extra step.</p>
<p>Now, since the Thin Traits and Extended Enums proposals appear to be
alternatives, you may wonder why I would think there is value in
potentially implementing both. The way I see it, they target different
things. Thin Traits gives you a way to very precisely fashion
something that acts like a C++ or Java class. This means you get thin
pointers, inherited fields and behavior, and you even get open
extensibility (but, note, you thus do not get downcasting).</p>
<p>Extended Enums, in contrast, is targeting the &ldquo;fixed domain&rdquo; use case,
where you have a defined set of possibilities. This is what we use
enums for today, but (for the reasons I outlined before) there are
various places that we could improve, and that was what the extended
enums proposal was all about. One advantage of targeting the fixed
domain use case is that you get additional power, such as the ability
to do match statements, or to use inheritance when implementing any
trait at all (more details on this last point below).</p>
<p>To put it another way: with Thin Traits, you write virtual methods
whereas with Extended Enums, you write match statements &ndash; and I
think match statements are far more common in Rust today.</p>
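<p>To make that contrast concrete, here is a minimal sketch in today&rsquo;s Rust, not in the syntax of either proposal. The names <code>Shape</code>, <code>Square</code>, <code>Rect</code>, and <code>ShapeEnum</code> are made up for illustration. The first half uses a trait with a virtual method, so new implementors can be added anywhere; the second half uses a closed enum and an exhaustive match.</p>

```rust
// Open set of types: behavior lives in a trait and is reached through
// dynamic dispatch on a trait object. (All names here are hypothetical.)
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Each call to `area` goes through the vtable of the concrete type.
fn total_area_dyn(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// Closed set of possibilities: one enum, and the behavior is a match.
enum ShapeEnum {
    Square(f64),
    Rect(f64, f64),
}

// The compiler checks that every variant is handled.
fn area(shape: &ShapeEnum) -> f64 {
    match shape {
        ShapeEnum::Square(side) => side * side,
        ShapeEnum::Rect(w, h) => w * h,
    }
}
```

<p>Adding a new shape to the trait version means writing one more impl, possibly in another crate; adding one to the enum version means touching the enum and every match on it, which is exactly the trade-off between open extensibility and exhaustive matching.</p>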
<p>Still, Thin Traits will be a very good fit for various use cases.
They are a good fit for Servo, for example, where they can be put to
use modeling the DOM. The extensibility here is probably a plus, if
not a hard requirement, because it means Servo can spread the DOM
across multiple crates. Another place that they might (maybe?) be
useful is if we want to have a stable interface to the AST someday
(though for that I think I would favor something like <a href="https://github.com/rust-lang/rfcs/pull/757">RFC 757</a>).</p>
<p>But I think there are a bunch of use cases for Extended Enums that Thin
Traits don&rsquo;t cover at all. For example, I don&rsquo;t see us using Thin
Traits in the compiler very much, nor do I see much of a role for them
in LALRPOP, etc. In all these cases, the open-ended extensibility of
Thin Traits is not needed and being able to exhaustively match is key.
Refinement types would also be very welcome.</p>
<p>Which brings me to my final thought. The Extended Enums proposal,
while useful, was not perfect. It had some rough spots we were not
happy with (which I&rsquo;ll discuss later on). Deferring the proposal gives
us time to find new solutions to those aspects. Often, when I revisit
a troublesome feature after letting it sit for some time, I find that
either (1) the problem I thought there was no longer bothers me, (2)
the feature isn&rsquo;t that important anyway, or (3) there is now a
solution that was either previously not possible or which just never
occurred to me.</p>
<p>OK, so, with that conclusion out of the way, the post continues by
examining some of the rough spots in the Extended Enums proposal, and
then looking at how we can address those by taking an approach like
the one described in Thin Traits.</p>
<!-- more -->
<h3 id="thesis-extended-enums">Thesis: Extended Enums</h3>
<p>Let&rsquo;s start by reviewing a bit of the
<a href="http://smallcultfollowing.com/babysteps/blog/2015/08/20/virtual-structs-part-3-bringing-enums-and-structs-together/">Extended Enums proposal</a>. Extended Enums, as you may recall,
proposed making types for each of the enum variants, and allowing them
to be structured in a hierarchy.  It also proposed permitting enums to
be declared as &ldquo;unsized&rdquo;, which meant that the size of the enum type
varies depending on what variant a particular instance is.</p>
<p>In that proposal, I used a syntax where enums could have a list of
common fields declared in the body of the enum:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Common fields:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">id</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">flags</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Variants:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Int</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Uint</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Ref</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">referent_ty</span>: <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>One could also declare the variants out of line, as in this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kr">unsized</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Node</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">position</span>: <span class="nc">Rectangle</span><span class="p">,</span><span class="w"> </span><span class="c1">// &lt;-- common fields, but no variants
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">Element</span>: <span class="nc">Node</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">TextElement</span>: <span class="nc">Element</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span></code></pre></div><p>Note that in this model, the &ldquo;variants&rdquo;, or leaf nodes in the type
hierarchy, are always structs. The inner nodes of the hierarchy (those
with children) are enums.</p>
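<p>For comparison, this is roughly how the same shape of hierarchy can be approximated in today&rsquo;s Rust, where enums cannot declare common fields. It is only a sketch under that assumption: the leaf types <code>TextElement</code>, <code>ImageElement</code>, and <code>CommentNode</code>, and the fields of <code>Rectangle</code>, are invented for the example. Leaves are structs, inner nodes are enums that wrap them, and the &ldquo;common&rdquo; field has to be repeated in every leaf.</p>

```rust
// Hypothetical geometry type; the post only names `Rectangle`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rectangle {
    x: i32,
    y: i32,
    width: u32,
    height: u32,
}

// Leaves of the hierarchy are structs. Without common fields on enums,
// each leaf must repeat `position` itself.
struct TextElement {
    position: Rectangle,
    text: String,
}

struct ImageElement {
    position: Rectangle,
    url: String,
}

struct CommentNode {
    position: Rectangle,
}

// Inner nodes of the hierarchy are enums wrapping their children.
enum Element {
    Text(TextElement),
    Image(ImageElement),
}

enum Node {
    Element(Element),
    Comment(CommentNode),
}

impl Node {
    // Reading the shared field needs a match over every leaf, which is
    // the boilerplate that common fields were meant to remove.
    fn position(&self) -> Rectangle {
        match self {
            Node::Element(Element::Text(t)) => t.position,
            Node::Element(Element::Image(i)) => i.position,
            Node::Comment(c) => c.position,
        }
    }
}
```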
<p>In order to support the <a href="http://smallcultfollowing.com/babysteps/blog/2015/05/29/classes-strike-back/#problem-3-initialization-of-common-fields">abstraction of constructors</a>, the
proposal includes a special associated type that lets you pull out a
struct <a href="http://smallcultfollowing.com/babysteps/blog/2015/08/20/virtual-structs-part-3-bringing-enums-and-structs-together/#associated-structs-constructors">containing the common fields from an enum</a>. For
example, <code>Node::struct</code> would correspond to a struct like</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">NodeFields</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">position</span>: <span class="nc">Rectangle</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h4 id="complications-with-common-fields">Complications with common fields</h4>
<p>The original post glossed over certain complications that arise around
common fields. Let me outline some of those complications. To start,
the associated <code>struct</code> type has always been a bit odd. It&rsquo;s just an
unusual bit of syntax, for one thing. But also, the fact that this
struct is not declared by the user raises some thorny questions. For
example, are the fields declared as public or private?  Can we
implement traits for this associated <code>struct</code> type? And so forth.</p>
<p>There are similar questions raised about the common fields in the enum
itself. In a struct, fields are private by default, and must be
declared as public (even if the struct is public):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// the struct is public...
</span></span></span><span class="line"><span class="cl"><span class="w">   </span><span class="n">f</span>: <span class="kt">i32</span>        <span class="c1">// ...but its fields are private.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>But in an enum, variants (and their fields) are public if the enum is
public:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Foo</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// the enum is public...
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Variant1</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">f</span>: <span class="kt">i32</span> <span class="p">},</span><span class="w"> </span><span class="c1">// ...and so are its variants, and their fields.
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This default matches how enums and structs are typically used: public
structs are used to form abstraction barriers, and public enums are
exposed in order to allow the outside world to match against the
various cases. (We used to make the fields of public structs be public
as well, but we found that in practice the overwhelming majority were
just declared as private.)</p>
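<p>Here is a small sketch of how those two defaults play out across a module boundary. The module <code>shapes</code> and the types in it are hypothetical; the point is only that code outside the module can match on the enum&rsquo;s fields directly, but has to go through the struct&rsquo;s public methods.</p>

```rust
mod shapes {
    // The struct is public, but `radius` is private by default, so the
    // struct acts as an abstraction barrier.
    pub struct Circle {
        radius: f64,
    }

    impl Circle {
        pub fn new(radius: f64) -> Circle {
            Circle { radius }
        }

        pub fn radius(&self) -> f64 {
            self.radius
        }
    }

    // The enum is public, and so are its variants and their fields.
    pub enum Shape {
        Square { side: f64 },
        Unit,
    }
}

// Outside the module we can destructure the enum's fields freely.
fn side_or_unit(shape: &shapes::Shape) -> f64 {
    match shape {
        shapes::Shape::Square { side } => *side,
        shapes::Shape::Unit => 1.0,
    }
}

// Outside the module we cannot write `shapes::Circle { radius: 2.0 }`,
// because the field is private; we must use the constructor and getter.
fn circle_radius() -> f64 {
    shapes::Circle::new(2.0).radius()
}
```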
<p>However, these defaults are somewhat problematic for common fields.
For example, let&rsquo;s look at that DOM example again:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kr">unsized</span><span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Node</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">position</span>: <span class="nc">Rectangle</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This field is declared in an enum, and that enum is public. So should
the field <code>position</code> be public or private? I would argue that this
enum is more &ldquo;struct-like&rdquo; in its usage pattern, and the default
should be private. We could arrive at this by adjusting the defaults
based on whether the enum declares its variants inline or out of
line. I expect this would actually match pretty well with actual
usage, but you can see that this is a somewhat subtle rule.</p>
<h3 id="antithesis-thin-traits">Antithesis: Thin Traits</h3>
<p>Now let me pivot for a bit and discuss the Thin Traits proposal.  In
particular, let&rsquo;s revisit the DOM hierarchy that we saw before
(<code>Node</code>, <code>Element</code>, etc), and see how that gets modeled. In the thin
traits proposal, every logical &ldquo;class&rdquo; consists of two types. The
first is a struct that defines its common fields and the second is a
trait that defines any virtual methods. So, the root of a DOM might be
a <code>Node</code> type, modeled like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">NodeFields</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">id</span>: <span class="kt">u32</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[repr(thin)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Node</span>: <span class="nc">NodeFields</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">something</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">something_else</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The struct <code>NodeFields</code> here just represents the set of fields that
all nodes must have. Because it is declared as a superbound of <code>Node</code>,
that means that any type which implements <code>Node</code> must have
<code>NodeFields</code> as a prefix. As a result, if we have a <code>&amp;Node</code> object, we
can access the fields from <code>NodeFields</code> at no overhead, even without
knowing the precise type of the implementor.</p>
<p>(Furthermore, because <code>Node</code> was declared as a thin trait, a <code>&amp;Node</code>
pointer can be a thin pointer, and not a fat pointer. This does mean
that <code>Node</code> can only be implemented for local types. Note though that
you could use this same pattern without declaring <code>Node</code> as a thin
trait and it would still work; it&rsquo;s just that <code>&amp;Node</code> references would
be fat pointers.)</p>
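<p>Neither struct superbounds nor <code>#[repr(thin)]</code> exist in today&rsquo;s Rust, so the following is only an approximation of the idea. It assumes an accessor method <code>fields()</code> (my invention) standing in for the guaranteed <code>NodeFields</code> prefix, and a hypothetical implementor <code>TextNode</code>. It also measures the thin versus fat pointer difference that the thin-trait annotation is meant to remove.</p>

```rust
use std::mem::size_of;

struct NodeFields {
    id: u32,
}

// Stand-in for `trait Node: NodeFields`: today a trait cannot require a
// struct prefix, so we ask implementors for an accessor instead.
trait Node {
    fn fields(&self) -> &NodeFields;
}

// Hypothetical implementor that stores the common fields first.
struct TextNode {
    base: NodeFields,
    text: String,
}

impl Node for TextNode {
    fn fields(&self) -> &NodeFields {
        &self.base
    }
}

// With the accessor, reading a common field costs a virtual call. Under
// the proposal it would be a plain load at a known offset.
fn id_of(node: &dyn Node) -> u32 {
    node.fields().id
}

// Returns (size of a thin reference, size of a trait-object reference).
fn reference_sizes() -> (usize, usize) {
    (size_of::<&TextNode>(), size_of::<&dyn Node>())
}
```

<p>On current compilers the second number is twice the first, since a trait-object reference carries a data pointer plus a vtable pointer. A thin trait would move the vtable pointer into the object itself, which is why it can only be implemented for local types.</p>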
<p>The <code>Node</code> trait shown had two virtual methods, <code>something()</code> and
<code>something_else()</code>.  Using specialization, we can provide a default
impl that lets us give some default behavior there, but also allows
subclasses to override that behavior:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">partial</span><span class="w"> </span><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span>:<span class="nc">Node</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Node</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">something</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// Here something_else() is not defined, so it is &#34;pure virtual&#34;
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">something_else</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Finally, if we have some methods that we would like to dispatch
statically on <code>Node</code>, we can do that by using an inherent method:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Node</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">get_id</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">u32</span> <span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">id</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This impl looks similar to the partial impl above, but in fact it is
not an impl <em>of</em> the trait <code>Node</code>, but rather adding inherent methods
that apply to <code>Node</code> objects. So if we call <code>node.get_id()</code> it doesn&rsquo;t
go through any virtual dispatch at all.</p>
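<p>Both halves of this pattern can be approximated in today&rsquo;s Rust, which may help make the distinction clearer. In this sketch a provided trait method stands in for the <code>partial impl</code> (which relies on specialization and is not available on stable), and the inherent impl is written on <code>dyn Node</code>, the current spelling of the trait-object type. The <code>fields()</code> accessor and the <code>Text</code> and <code>Image</code> types are assumptions of mine, as is the choice to return <code>String</code> so the behavior is observable.</p>

```rust
struct NodeFields {
    id: u32,
}

trait Node {
    // Hypothetical accessor standing in for the `NodeFields` prefix.
    fn fields(&self) -> &NodeFields;

    // Provided method: default behavior that implementors may override.
    fn something(&self) -> String {
        self.something_else()
    }

    // No default body, so every implementor must supply it
    // ("pure virtual").
    fn something_else(&self) -> String;
}

// Inherent method on the trait-object type. The lifetime parameter lets
// it apply to any `&dyn Node`, not only `dyn Node + 'static`. Calling
// `get_id` itself is statically dispatched.
impl<'a> dyn Node + 'a {
    fn get_id(&self) -> u32 {
        self.fields().id
    }
}

struct Text {
    base: NodeFields,
}

impl Node for Text {
    fn fields(&self) -> &NodeFields {
        &self.base
    }
    fn something_else(&self) -> String {
        String::from("text")
    }
}

struct Image {
    base: NodeFields,
}

impl Node for Image {
    fn fields(&self) -> &NodeFields {
        &self.base
    }
    // Overrides the default, like a subclass overriding a virtual method.
    fn something(&self) -> String {
        String::from("image (overridden)")
    }
    fn something_else(&self) -> String {
        String::from("image")
    }
}
```

<p>One caveat: in this approximation <code>get_id</code> still makes a virtual call to <code>fields()</code> internally. In the Thin Traits design the field access would be a direct load, so the whole method would be free of dynamic dispatch.</p>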
<p>You can continue this pattern to create subclasses. So adding an
<code>Element</code> subclass might look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ElementFields</span>: <span class="nc">NodeFields</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[repr(thin)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Element</span>: <span class="nc">Node</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">ElementFields</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>and so forth.</p>
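<p>As a rough analogue in today&rsquo;s Rust, struct inheritance can be imitated by embedding the parent&rsquo;s fields, and the trait hierarchy by an ordinary supertrait. This is a sketch under those assumptions, not the proposed feature: the accessor methods, the <code>tag</code> field, and the <code>Div</code> type are all hypothetical.</p>

```rust
struct NodeFields {
    id: u32,
}

// Embedding `NodeFields` stands in for `struct ElementFields: NodeFields`.
struct ElementFields {
    node: NodeFields,
    tag: String,
}

trait Node {
    fn node_fields(&self) -> &NodeFields;
}

// Supertrait bound: every `Element` is also a `Node`.
trait Element: Node {
    fn element_fields(&self) -> &ElementFields;
}

// Hypothetical concrete "class" at the bottom of the hierarchy.
struct Div {
    fields: ElementFields,
}

impl Node for Div {
    fn node_fields(&self) -> &NodeFields {
        &self.fields.node
    }
}

impl Element for Div {
    fn element_fields(&self) -> &ElementFields {
        &self.fields
    }
}

// Through a `&dyn Element` we can reach both the element-level and the
// node-level fields, because supertrait methods are callable on it.
fn describe(e: &dyn Element) -> String {
    format!("<{}> #{}", e.element_fields().tag, e.node_fields().id)
}
```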
<h3 id="synthesis-extended-enums-as-a-superset-of-thin-traits">Synthesis: Extended Enums as a superset of Thin Traits</h3>
<p>The Thin Traits proposal addresses common fields by creating explicit
structs, like <code>NodeFields</code>, that serve as containers for the common
fields, and by adding struct inheritance. This is an alternative to
the special <code>Node::struct</code> we used in the Extended Enums
proposal. There are pros and cons to using struct inheritance over
<code>Node::struct</code>. On the pro side, struct inheritance sidesteps the
various questions about privacy, visibility, and so forth that arose
with <code>Node::struct</code>. On the con side, using structs requires a kind of
parallel hierarchy, which is something we were initially trying to
avoid. A final advantage for using struct inheritance is that it is a
&ldquo;reusable&rdquo; mechanism.  That is, whereas adding common fields to enums
only affects enums, using struct inheritance allows us to add common
fields to enums, traits, and other structs. Considering all of these
things, it seems like struct inheritance is a better choice.</p>
<p>If we were to convert the DOM example to use struct inheritance, it
would mean that an enum may inherit from a struct, in which case it
gets the fields of that struct. For out-of-line enum declarations,
then, we can simply create an enum with an empty body:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">NodeFields</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">position</span>: <span class="nc">Rectangle</span><span class="p">,</span><span class="w"> </span><span class="c1">// &lt;-- common fields now live in a plain struct
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="cp">#[repr(unsized)]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">Node</span>: <span class="nc">NodeFields</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ElementFields</span>: <span class="nc">NodeFields</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">Element</span>: <span class="nc">Node</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">ElementFields</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>(I&rsquo;ve also taken the liberty of changing from the <code>unsized</code> keyword to
an annotation, <code>#[repr(unsized)]</code>. Given that making an enum <code>unsized</code>
doesn&rsquo;t really affect its semantics, just the memory layout, using a
<code>#[repr]</code> attribute seems like a good choice. It was something we
considered before; I&rsquo;m no longer sure why we rejected it.)</p>
<h3 id="method-dispatch">Method dispatch</h3>
<p>My post did not cover how virtual method dispatch was going to work.
Aaron gave a <a href="http://aturon.github.io/blog/2015/09/18/reuse/#ending-2:-the-enum-based-approach">quick summary in the Thin Trait proposal</a>.  I
will give an even quicker one here. It was a goal of the proposal that
one should be able to use inheritance to refine the behavior over the
type hierarchy. That is, one should be able to write a set of impls
like the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method1</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method3</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span>::<span class="nb">Some</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method1</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* overrides the version above */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method3</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* must be implemented */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span>::<span class="nb">None</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* overrides the version above */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method3</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* must be implemented */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This still seems like a very nice feature to me. As the Thin Traits
proposal showed, specialization makes this kind of refinement
possible, but it requires a variety of different impls. The example
above, however, didn&rsquo;t have quite so many impls &ndash; why is that?</p>
<p>What we had envisioned to bridge the gap was that we would use a kind
of implicit sugar. That is, the impl for <code>Option&lt;T&gt;</code> would effectively
be expanded to two impls. One of them, the partial impl, provides the
defaults for the variants, and the other, a concrete impl, effectively
implements the virtual dispatch, by matching and dispatching to the
appropriate variant:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// As originally envisioned, `impl&lt;T&gt; MyTrait for Option&lt;T&gt;`
</span></span></span><span class="line"><span class="cl"><span class="c1">// would be sugar for the following two impls:
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">partial</span><span class="w"> </span><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method1</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method3</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method1</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">this</span><span class="w"> </span><span class="o">@</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">Some</span><span class="p">(</span><span class="o">..</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Option</span>::<span class="nb">Some</span>::<span class="n">method1</span><span class="p">(</span><span class="n">this</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">this</span><span class="w"> </span><span class="o">@</span><span class="w"> </span><span class="o">&amp;</span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">Option</span>::<span class="nb">None</span>::<span class="n">method1</span><span class="p">(</span><span class="n">this</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="c1">// as above, but for the other methods
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Similar expansions are needed for inherent impls. You may be wondering
<em>why</em> it is that we expand the one impl (for <code>Option&lt;T&gt;</code>) into two
impls in the first place. Each plays a distinct role:</p>
<ul>
<li>The <code>partial impl</code> handles the defaults part of the picture. That
is, it supplies default impls for the various methods that impls for
<code>Some</code> and <code>None</code> can reuse (or override).</li>
<li>The <code>impl</code> itself handles the &ldquo;virtual&rdquo; dispatch part of things.  We
want to ensure that when we call <code>method1()</code> on a variable <code>o</code> of
type <code>Option&lt;T&gt;</code>, we invoke the appropriate <code>method1</code> depending on
what variant <code>o</code> actually is at runtime. We do this by matching on
<code>o</code> and then delegating to the proper place. If you think about it,
this is roughly equivalent to loading a function pointer out of a
vtable and dispatching through that, though the performance
characteristics are interesting (in a way, it resembles a fully
expanded builtin <a href="https://en.wikipedia.org/wiki/Inline_caching#Polymorphic_inline_caching">PIC</a>).</li>
</ul>
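<p>To make the two roles concrete, here is a minimal sketch of the same split in today&rsquo;s stable Rust. It is only an approximation under stated assumptions: variants are not types, so the hypothetical structs <code>Just</code> and <code>Nothing</code> stand in for <code>Option::Some</code> and <code>Option::None</code>, the hypothetical enum <code>Maybe</code> stands in for <code>Option</code>, and the trait&rsquo;s default method bodies stand in for the <code>partial impl</code>. The <code>impl MyTrait for Maybe</code> is the match-and-delegate impl written out by hand:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Assumption: `Just`, `Nothing` and `Maybe` are made-up stand-ins for
// `Option::Some`, `Option::None` and `Option`.
trait MyTrait {
    // The default bodies play the role of the `partial impl`.
    fn method1(&amp;self) -&gt; String { &#34;default method1&#34;.to_string() }
    fn method2(&amp;self) -&gt; String { &#34;default method2&#34;.to_string() }
    fn method3(&amp;self) -&gt; String; // no default: each variant must supply it
}

struct Just;
struct Nothing;

impl MyTrait for Just {
    fn method1(&amp;self) -&gt; String { &#34;Just::method1&#34;.to_string() } // override
    fn method3(&amp;self) -&gt; String { &#34;Just::method3&#34;.to_string() }
}

impl MyTrait for Nothing {
    fn method2(&amp;self) -&gt; String { &#34;Nothing::method2&#34;.to_string() } // override
    fn method3(&amp;self) -&gt; String { &#34;Nothing::method3&#34;.to_string() }
}

enum Maybe {
    Just(Just),
    Nothing(Nothing),
}

// The &#34;virtual dispatch&#34; impl: match on the variant, then delegate.
impl MyTrait for Maybe {
    fn method1(&amp;self) -&gt; String {
        match self {
            Maybe::Just(v) =&gt; v.method1(),
            Maybe::Nothing(v) =&gt; v.method1(),
        }
    }
    fn method2(&amp;self) -&gt; String {
        match self {
            Maybe::Just(v) =&gt; v.method2(),
            Maybe::Nothing(v) =&gt; v.method2(),
        }
    }
    fn method3(&amp;self) -&gt; String {
        match self {
            Maybe::Just(v) =&gt; v.method3(),
            Maybe::Nothing(v) =&gt; v.method3(),
        }
    }
}
</code></pre></div><p>With this in place, <code>Maybe::Just(Just).method1()</code> reaches the override in <code>Just</code>, while <code>Maybe::Just(Just).method2()</code> falls back to the default body.</p>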
<p>Overall, this kind of expansion is a bit subtle. It&rsquo;d be nice to have
a model that did not require it. In fact, in an earlier design, we DID
avoid it. We did so by introducing a new shorthand, called <code>match impl</code>. This would basically create the &ldquo;downcasting&rdquo; impl that we
added implicitly above. The correct pattern would then be as
follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">partial</span><span class="w"> </span><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="c1">// &lt;-- this is now partial
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method1</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">default</span> <span class="k">fn</span><span class="w"> </span><span class="n">method3</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">match</span><span class="w"> </span><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span><span class="w"> </span><span class="c1">// &lt;-- this is new
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span>::<span class="nb">Some</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method1</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* overrides the version above */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method3</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* must be implemented */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Option</span>::<span class="nb">None</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method2</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* overrides the version above */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">method3</span><span class="p">()</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="cm">/* must be implemented */</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>At first glance, this bears a strong resemblance to how the Thin Trait
proposal handled virtual dispatch. In the Thin Trait proposal, we have
a <code>partial impl</code> as well, and then concrete impls that override the
details. However, there is no <code>match impl</code> in the Thin Trait proposal. It
is not needed because, in that proposal, we were implementing the
<code>Node</code> trait for the <code>Node</code> type &ndash; and in fact the compiler supplies
that impl automatically, as part of the <a href="http://huonw.github.io/blog/2015/01/object-safety/">object safety</a> notion.</p>
<h4 id="expression-problem-i-know-thee-wella-serviceable-villain">Expression problem, I know thee well&mdash;a serviceable villain</h4>
<p>But there is another difference between the two examples, and it&rsquo;s
important. In the code shown above, there is in fact no
connection between <code>MyTrait</code> and <code>Option</code>. That is, under the Extensible
Enums proposal, I can implement foreign traits and use inheritance to
refine the behavior depending on what variant I have.  The Thin Traits
pattern, however, only works for implementing the &ldquo;main&rdquo; traits (e.g.,
<code>Node</code>, <code>Element</code>, etc) &ndash; because you can&rsquo;t
write &ldquo;match impls&rdquo; under the Thin Traits proposal, since the set of
types is open-ended. (Instead we lean on the compiler-generated
virtual impl of <code>Node</code> for <code>Node</code>, etc.)</p>
<p>What you <em>can</em> do in the Thin Traits proposal is to add methods to the
main traits and just delegate to those. So I could do something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">my_method</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">Node</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">my_trait_my_method</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">MyTrait</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Node</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">my_method</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="c1">// delegate to the method in the `Node` trait
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">my_trait_my_method</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now you can use inheritance to refine the behavior of
<code>my_trait_my_method</code> if you like. But note that this only works if the
<code>MyTrait</code> trait is in the same crate as <code>Node</code> or some ancestor crate.</p>
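<p>A runnable approximation of this delegation pattern in current Rust might look as follows. It is a sketch, not the Thin Traits design itself: <code>dyn Node</code> plays the part of the <code>Node</code> type, and the structs <code>Element</code> and <code>Text</code>, the default method body, and the returned strings are all made up for illustration.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">trait MyTrait {
    fn my_method(&amp;self) -&gt; String;
}

trait Node {
    // Hook added to the &#34;main&#34; trait on behalf of `MyTrait`.
    // Assumption: a default body, so that only some cases need to override it.
    fn my_trait_my_method(&amp;self) -&gt; String {
        &#34;generic node&#34;.to_string()
    }
}

// Hypothetical cases of `Node`.
struct Element;
struct Text;

impl Node for Element {
    // Refines the behavior for this case.
    fn my_trait_my_method(&amp;self) -&gt; String {
        &#34;element&#34;.to_string()
    }
}

impl Node for Text {} // keeps the default

impl MyTrait for dyn Node {
    fn my_method(&amp;self) -&gt; String {
        // delegate to the method in the `Node` trait
        self.my_trait_my_method()
    }
}

fn describe_all(nodes: &amp;[Box&lt;dyn Node&gt;]) -&gt; Vec&lt;String&gt; {
    nodes.iter().map(|n| n.my_method()).collect()
}
</code></pre></div><p>Calling <code>describe_all</code> on a boxed <code>Element</code> followed by a boxed <code>Text</code> dispatches through the vtable of each value, so the first call uses the override and the second uses the default.</p>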
<p>The reason for this split is precisely the open-ended nature of the
Thin Trait pattern. Or, to give this another name, it is the famous
<a href="https://en.wikipedia.org/wiki/Expression_problem">expression problem</a>. With Extensible Enums, we enumerated all the
cases, which means that other, downstream crates can now implement
traits against those cases. We&rsquo;ve fixed the set of cases, but we can
extend the set of operations indefinitely. In contrast, with Thin
Traits, we enumerated the operations (as the contents of the master
traits), but we allow downstream crates to implement new cases for
those operations.</p>
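<p>The two halves of that tradeoff can be sketched in ordinary Rust. This is an illustration only, with invented names (<code>Expr</code>, <code>Term</code>, <code>Lit</code>, <code>Neg</code>): the enum fixes the cases and leaves the operations open, while the trait fixes the operations and leaves the cases open.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">// Closed set of cases: the enum lists every variant up front ...
enum Expr {
    Num(i64),
    Add(Box&lt;Expr&gt;, Box&lt;Expr&gt;),
}

// ... so any crate can add an operation by matching exhaustively.
fn eval(e: &amp;Expr) -&gt; i64 {
    match e {
        Expr::Num(n) =&gt; *n,
        Expr::Add(a, b) =&gt; eval(a) + eval(b),
    }
}

// A second operation, added later without touching `Expr`.
fn show(e: &amp;Expr) -&gt; String {
    match e {
        Expr::Num(n) =&gt; n.to_string(),
        Expr::Add(a, b) =&gt; format!(&#34;({} + {})&#34;, show(a), show(b)),
    }
}

// Open set of cases: the trait fixes the operations ...
trait Term {
    fn eval(&amp;self) -&gt; i64;
}

struct Lit(i64);

impl Term for Lit {
    fn eval(&amp;self) -&gt; i64 {
        self.0
    }
}

// ... and a new case can be added later, even in a downstream crate.
struct Neg(Box&lt;dyn Term&gt;);

impl Term for Neg {
    fn eval(&amp;self) -&gt; i64 {
        -self.0.eval()
    }
}
</code></pre></div><p>Adding a <code>show</code> operation to <code>Term</code> would mean editing the trait and every impl; adding a <code>Neg</code> variant to <code>Expr</code> would mean editing the enum and every match.</p>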
<p>So method dispatch proves to be pretty interesting:</p>
<ul>
<li>It gives further evidence that Extensible Enums represent a useful
entity in their own right.</li>
<li>It seems like a case where we may find that the tradeoffs change
over time. That is, maybe <code>match impl</code> is not such a bad solution
after all, particularly if the Thin Trait pattern is covering some
share of the &ldquo;object-like&rdquo; use cases. In which case one of the main
bits of &ldquo;magic&rdquo; in the Extensible Enums proposal goes away.</li>
</ul>
<h3 id="conclusion-1">Conclusion</h3>
<p>Oh, wait, I already gave it. Well, the most salient points are:</p>
<ul>
<li>Extensible Enums are about a fixed set of cases, open-ended set of
operations. Thin Traits are not. This matters.</li>
<li>Thin Traits are (almost) a &ldquo;latent pattern&rdquo; in the Extensible Enums
proposal, requiring only <code>#[repr(thin)]</code> and struct inheritance.
<ul>
<li>Struct inheritance might be nicer than associated structs anyway.</li>
</ul>
</li>
<li>We could consider doing both, and if so, it would probably make
sense to implement Specialization, then Thin Traits, and only then
consider Extensible Enums.</li>
</ul>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">LALRPOP</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/09/14/lalrpop/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/09/14/lalrpop/</id><published>2015-09-14T00:00:00+00:00</published><updated>2015-09-14T05:49:14-04:00</updated><content type="html"><![CDATA[<p>Around four years ago, when I had first decided to start at Mozilla
research, I had planned to write an LR(1) parser generator. It seemed
like a good way to get to know Rust. However, I found that newborns
actually occupy somewhat more time than anticipated (read: I was lucky
to squeeze in a shower), and hence that never came to pass.</p>
<p>Well, I&rsquo;m happy to say that, four years later, I&rsquo;ve finally rectified
that. For a few months now I&rsquo;ve been working on a side project while I
have my morning coffee: <a href="https://github.com/nikomatsakis/lalrpop">LALRPOP</a> (pronounced like some sort of
strangely accented version of &ldquo;lollypop&rdquo;). <a href="https://github.com/nikomatsakis/lalrpop">LALRPOP</a> is an LR(1)
parser generator that emits Rust code. It is designed for ease of use,
so it includes a number of features that many parser generators are
missing:</p>
<ul>
<li>Regular-expression-like notation, so you can write <code>Id*</code> for &ldquo;any
number of <code>Id</code>&rdquo; or <code>Id?</code> for &ldquo;an optional <code>Id</code>&rdquo;.</li>
<li>User-defined macros, so you can <a href="https://github.com/nikomatsakis/lalrpop/blob/master/doc/tutorial.md#calculator5">make a macro</a> like <code>Comma&lt;Id&gt;</code> that
means &ldquo;comma separated list of <code>Id</code> with optional trailing comma&rdquo;.</li>
<li>Conditional macros, so you can easily generate a subset of your
grammar for some particular context (like, all expressions that
don&rsquo;t end in <code>{</code>).</li>
<li>Support for synthesizing tokenizers (currently somewhat limited, but
sufficient for many uses) as well as external tokenizers (very
flexible). If you&rsquo;re using an external tokenizer, you don&rsquo;t even
need to be parsing input strings at all really, any iterator of
&ldquo;matchable values&rdquo; will do.</li>
<li>Easy to get access to positional information.</li>
<li>Easy to write fallible rules where the action code can generate a
parse error.</li>
</ul>
<p>If you&rsquo;d like to learn more about LALRPOP, I recently started a
<a href="https://github.com/nikomatsakis/lalrpop/blob/master/doc/tutorial.md">tutorial</a> that introduces LALRPOP in more depth and walks through
most of its features. The tutorial doesn&rsquo;t cover everything yet, but
I&rsquo;ll try to close the gaps.</p>
<p><strong>Why LR(1)?</strong> After all, aren&rsquo;t LR(1) generators kind of annoying,
what with those weird shift/reduce errors? Well, after teaching
compiler design for so many years, I think I may have developed
Stockholm syndrome &ndash; I kind of enjoy diagnosing and solving
shift/reduce failures. ;) But more seriously, I personally like that
once I get my grammar working with an LR(1) generator, I know that it
is unambiguous and will basically work. When I&rsquo;ve used PEG generators,
I usually find that they work great in the beginning, but once in a
while they will just mysteriously fail to parse something, and
figuring out why is a horrible pain. This is why with LALRPOP I&rsquo;ve
tried to take the approach of adding tools to make handling
shift/reduce errors relatively easy &ndash; basically automating the
workarounds that one typically has to do by hand.</p>
<p><em>That said,</em> eventually I would like LALRPOP to support a bunch of
algorithms. In particular, I plan to add something that can handle
universal CFGs, though other deterministic techniques, like <code>LL(k)</code>,
would be nice as well.</p>
<p><strong>Performance.</strong> Another advantage of LR(1), of course, is that it
offers linear performance. That said, I&rsquo;ve found that in
practice, parsing based on a parsing table is not particularly
speedy. If you think about it, it&rsquo;s more-or-less interpreting your
grammar &ndash; you&rsquo;ve basically got a small loop that&rsquo;s loading data from
a table and then doing an indirect jump based on the results, which
happen to be the two operations that CPUs like least. In my
experience, rewriting to use a recursive descent parser is often much
faster.</p>
<p>LALRPOP takes a different approach. The idea is that instead of a
parsing table, we generate a function for every state. This ought to
be quite speedy; it also plays very nicely with Rust&rsquo;s type system,
since types in Rust don&rsquo;t have uniform size, and using distinct
functions lets us store the stack in local variables, rather than
using a <code>Vec</code>. At first, I thought maybe I had invented something new
with this style of parsing, but of course I should have known better:
a little research revealed that this technique is called
<a href="https://en.wikipedia.org/wiki/Recursive_ascent_parser"><em>recursive ascent</em></a>.</p>
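<p>To give a flavor of the technique, here is a tiny hand-written recursive ascent parser. It is a sketch under stated assumptions, not LALRPOP&rsquo;s generated code: the grammar (<code>S = ( S ) | x</code>), the state numbering, and all function names are invented for illustration. Each LR state is a function, a reduction returns the number of frames that still have to be popped, and the value carried along is the nesting depth.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::str::Chars;

// Value of a reduced `S` (its nesting depth) plus the number of additional
// state frames that must still return before the goto on `S` is taken.
type Reduced = Result&lt;(usize, usize), String&gt;;

// Shared shift logic for states 0 and 2, which both contain the items
// `S -&gt; . ( S )` and `S -&gt; . x`.
fn shift_into_s(input: &amp;mut Chars&lt;&#39;_&gt;) -&gt; Reduced {
    match input.next() {
        Some(&#39;x&#39;) =&gt; state1(),
        Some(&#39;(&#39;) =&gt; state2(input),
        other =&gt; Err(format!(&#34;unexpected {:?}&#34;, other)),
    }
}

// State 0 (start): once `S` is reduced, accept if the input is exhausted.
fn state0(input: &amp;mut Chars&lt;&#39;_&gt;) -&gt; Result&lt;usize, String&gt; {
    let (depth, _pops) = shift_into_s(input)?;
    match input.next() {
        None =&gt; Ok(depth),
        Some(c) =&gt; Err(format!(&#34;trailing input {:?}&#34;, c)),
    }
}

// State 1: `S -&gt; x .` Reduce a rule of length 1: no further frames to pop.
fn state1() -&gt; Reduced {
    Ok((0, 0))
}

// State 2: `S -&gt; ( . S )`. After the inner `S` is reduced, go to state 3.
fn state2(input: &amp;mut Chars&lt;&#39;_&gt;) -&gt; Reduced {
    let (inner, _pops) = shift_into_s(input)?;
    let (depth, pops) = state3(input, inner)?;
    Ok((depth, pops - 1)) // this frame is one of those being popped
}

// State 3: `S -&gt; ( S . )`. The inner value lives in an argument, not a Vec.
fn state3(input: &amp;mut Chars&lt;&#39;_&gt;, inner: usize) -&gt; Reduced {
    match input.next() {
        Some(&#39;)&#39;) =&gt; {
            let (depth, pops) = state4(inner)?;
            Ok((depth, pops - 1))
        }
        other =&gt; Err(format!(&#34;expected closing paren, found {:?}&#34;, other)),
    }
}

// State 4: `S -&gt; ( S ) .` Reduce a rule of length 3: two more frames to pop.
fn state4(inner: usize) -&gt; Reduced {
    Ok((inner + 1, 2))
}

fn parse(text: &amp;str) -&gt; Result&lt;usize, String&gt; {
    state0(&amp;mut text.chars())
}
</code></pre></div><p>Here <code>parse</code> returns the nesting depth for balanced input and an error otherwise. Note how the semantic value of the inner <code>S</code> is passed as a plain function argument, so the parser stack is just the call stack.</p>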
<p>Now, as expected, recursive ascent is <em>supposed</em> to be quite fast. In
fact, I was hoping to unveil some fantastic performance numbers with
this post, but I&rsquo;ve not had time to try to create a fair benchmark, so
I can&rsquo;t &ndash; since I haven&rsquo;t done any measurements, LALRPOP&rsquo;s generated
code may in fact be quite slow. I just don&rsquo;t know. Hopefully I&rsquo;ll find
some time to rectify that in the near future.</p>
<p><strong>100% stable Rust.</strong> It&rsquo;s probably worth pointing out that LALRPOP is
100% stable Rust, and I&rsquo;m committed to keeping it that way.</p>
<p><strong>Other parser generators.</strong> Should LALRPOP or LR(1) not be to your
fancy, I just want to point out that the Rust ecosystem has grown
quite a number of parser combinator and PEG libraries: <a href="https://crates.io/crates/nom">nom</a>, <a href="https://crates.io/crates/oak">oak</a>,
<a href="https://crates.io/crates/peg">peg</a>, <a href="https://crates.io/crates/nailgun">nailgun</a>, <a href="https://crates.io/crates/peggler">peggler</a>, <a href="https://crates.io/crates/combine">combine</a>, <a href="https://crates.io/crates/parser-combinators">parser-combinators</a>, and of
course my own <a href="https://crates.io/crates/rusty-peg">rusty-peg</a> (and probably some others I&rsquo;ve missed,
sorry). <del>I&rsquo;m not aware of any other LR(1) (or GLL, GLR, etc)
generators for Rust, but there may well be some.</del> There are also two
LALR parser generators for Rust you may want to check out, <a href="https://github.com/sivadeilra/racc">racc</a> and
<a href="https://github.com/rodrigorc/lemon_rust">lemon_rust</a>.</p>
<p><strong>Future plans.</strong> I&rsquo;ve got some plans for LALRPOP. There are a host of
new features I&rsquo;d like to add, with the aim of eliminating more
boilerplate. I&rsquo;d also like to explore adding new parser algorithms,
particularly universal algorithms that can handle any CFG,
such as GLL, GLR, or LL(*). Finally, I&rsquo;m really interested in
exploring the area of error recovery, and in particular techniques to
find the minimum damaged area of a parse tree if there is an incorrect
parse. (There is of course tons of existing work here.)</p>
<p><strong>Plea for help.</strong> Of course, if I wind up doing all of this myself,
it might take quite some time. So if you&rsquo;re interested in working on
LALRPOP, I&rsquo;d love to hear from you! I&rsquo;d also love to hear some other
suggestions for things to do with LALRPOP. One of the things I plan to
do over the next few weeks is also spend some more time writing up
plans in LALRPOP&rsquo;s issue database, as well as filling out its wiki.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Virtual Structs Part 3: Bringing Enums and Structs Together</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/08/20/virtual-structs-part-3-bringing-enums-and-structs-together/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/08/20/virtual-structs-part-3-bringing-enums-and-structs-together/</id><published>2015-08-20T00:00:00+00:00</published><updated>2015-08-20T09:29:45-04:00</updated><content type="html"><![CDATA[<p>So, in <a href="http://smallcultfollowing.com/babysteps/blog/2015/05/05/where-rusts-enum-shines/">previous</a> <a href="http://smallcultfollowing.com/babysteps/blog/2015/05/29/classes-strike-back/">posts</a>, I discussed the pros and cons of two different
approaches to modeling variants: Rust-style enums and C++-style
classes. In those posts, I explained why I see Rust enums and OO-style
class hierarchies as more alike than different (I personally credit
Scala for opening my eyes to this, though I&rsquo;m sure it&rsquo;s been
understood by others for much longer). The key points were as follows:</p>
<ul>
<li>Both Rust-style enums and C++-style classes can be used to model the
idea of a value that can be one of many variants, but there are
differences in how they work at runtime. These differences mean that
Rust-style enums are more convenient for some tasks, and C++-style
classes for others. In particular:
<ul>
<li>A Rust-style enum is sized as large as the largest variant. This is
great because you can lay them out flat in another data structure
without requiring any allocation. You can also easily change from
one variant to another. One downside of Rust enums is that you cannot
&ldquo;refine&rdquo; them to narrow the set of variants that a particular value
can have.</li>
<li>A C++-style class is sized to be exactly as big as one variant. This
is great because it can be much more memory efficient. However, if
you don&rsquo;t know what variant you have, you must manipulate the value
by pointer, so it tends to require more allocation. It is also
impossible to change from one variant to another. Class hierarchies
also give you a simple, easily understood kind of refinement, and
the ability to have common fields that are shared between variants.</li>
</ul>
</li>
<li>C++-style classes offer constructors, which allow for more
abstraction and code reuse when initially creating an instance, but
raise thorny questions about the type of a value under construction;
Rust structs and enums are always built in a single shot today,
which is simpler and safer but doesn&rsquo;t compose as well.</li>
</ul>
<p>What I want to talk about in this post is a proposal (or
proto-proposal) for bridging those two worlds in Rust. I&rsquo;m going to
focus on data layout in this post. I&rsquo;ll defer virtual methods for
another post (or perhaps an RFC). <em>Spoiler alert:</em> they can be viewed
as a special case of <a href="https://github.com/rust-lang/rfcs/pull/1210">specialization</a>.</p>
<p>I had originally intended to publish this post a few days after the
others. Obviously, I got delayed. Sorry about that!  Things have been
very busy! In any case, better late than never, as
some-great-relative-or-other always (no doubt) said. Truth is, I
really miss blogging regularly, so I&rsquo;m going to make an effort to
write up more &ldquo;in progress&rdquo; and half-baked ideas (yeah yeah, promises
to blog more are a dime a dozen, I know).</p>
<p><em>Note:</em> I want to be clear that the designs in this blog post are not
&ldquo;my&rdquo; work per se. Some of the ideas originated with me, but others
have arisen in the course of conversations with others, as well as
earlier proposals from nrc, which in turn were heavily based on
community feedback. And of course it&rsquo;s not like we Rust folk invented
OO or algebraic data types or anything in the first place. :)</p>
<!-- more -->
<h3 id="unifying-structs-and-enums-into-type-hierarchies">Unifying structs and enums into type hierarchies</h3>
<p>The key idea is to generalize enums and structs into a single concept.
This is often called an <em>algebraic data type</em>, but &ldquo;algebra&rdquo; brings
back memories of balancing equations in middle school (not altogether
unpleasant ones, admittedly), so I&rsquo;m going to use the term <em>type
hierarchy</em> instead. Anyway, to see what I mean, let&rsquo;s look at my
favorite enum ever, <code>Option</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nb">Some</span><span class="p">(</span><span class="n">T</span><span class="p">),</span><span class="w"> </span><span class="nb">None</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The idea is to reinterpret this enum as three types arranged into a
tree or hierarchy. An important point is that every node in the tree
is now a type: so there is a type representing the <code>Some</code> variant, and
a type representing the <code>None</code> variant:</p>
<pre tabindex="0"><code>enum Option&lt;T&gt;
|
+- struct None&lt;T&gt;
+- struct Some&lt;T&gt;
</code></pre><p>As you can see, the leaves of the tree are called structs. They
represent a particular variant. The inner nodes are called enums, and
they represent a set of variants. Every existing <code>struct</code> definition
can also be reinterpreted as a hierarchy, but just a hierarchy of size
1.</p>
<p>These generalized type hierarchies can be any depth. This means you
can do nested enums, like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">Mode</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">enum</span> <span class="nc">ByRef</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Mutable</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Immutable</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">ByValue</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This creates a nested hierarchy:</p>
<pre tabindex="0"><code>enum Mode
|
+- enum ByRef
|  |
|  +- struct Mutable
|  +- struct Immutable
+- struct ByValue
</code></pre><p>Since all the nodes in a hierarchy are types, we get refinement types
for free. This means that I can use <code>Mode</code> as a type to mean &ldquo;any mode
at all&rdquo;, or <code>Mode::ByRef</code> for the times when I know something is one
of the <code>ByRef</code> modes, or even <code>Mode::ByRef::Mutable</code> (which is a
singleton struct).</p>
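<p>For comparison, here is a minimal sketch of how this hierarchy might be approximated in today&rsquo;s Rust, by wrapping one enum inside a variant of another. It is an illustration rather than part of the proposal, and the function names <code>describe</code> and <code>is_mutable</code> are hypothetical. The refinement only works because <code>ByRef</code> was split out by hand into its own type:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">// Today&#39;s Rust: nesting is emulated by wrapping one enum in another.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ByRef {
    Mutable,
    Immutable,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    ByRef(ByRef),
    ByValue,
}

// Accepts &#34;any mode at all&#34;.
fn describe(mode: Mode) -&gt; &amp;&#39;static str {
    match mode {
        Mode::ByRef(ByRef::Mutable) =&gt; &#34;by mutable reference&#34;,
        Mode::ByRef(ByRef::Immutable) =&gt; &#34;by shared reference&#34;,
        Mode::ByValue =&gt; &#34;by value&#34;,
    }
}

// Accepts only by-reference modes; `Mode::ByValue` cannot be passed here.
fn is_mutable(by_ref: ByRef) -&gt; bool {
    match by_ref {
        ByRef::Mutable =&gt; true,
        ByRef::Immutable =&gt; false,
    }
}
</code></pre>
<p>Here <code>Mode::ByRef(ByRef::Mutable)</code> plays the role of <code>Mode::ByRef::Mutable</code>, but there is still no type that stands for a single variant such as <code>Mutable</code> alone, which is what the generalized hierarchy would add.</p>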
<p>As part of this change, it should be possible to declare the variants
out of line. For example, we could change the <code>Option</code> enum to look as follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nb">Some</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">value</span>: <span class="nc">T</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nb">None</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is not exactly equivalent to the earlier definition, of course.
The names <code>Some</code> and <code>None</code> live alongside <code>Option</code>, rather than
within it, and I&rsquo;ve used a field (<code>value</code>) rather than a tuple struct.</p>
<h3 id="common-fields">Common fields</h3>
<p>Enum declarations are extended with the ability to have fields as well
as variants. These fields are inherited by all variants of that enum.
In the syntax, fields must appear before the variants, and it is also
not possible to combine &ldquo;tuple-like&rdquo; structs with inherited fields.</p>
<p>Let&rsquo;s revisit an example from <a href="http://smallcultfollowing.com/babysteps/blog/2015/05/29/classes-strike-back/">the previous post</a>. In the compiler,
we currently represent types with an enum. However, there are certain
fields that every type carries. These are handled via a separate struct,
so that we wind up with something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">id</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">flags</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">structure</span>: <span class="nc">TypeStructure</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">TypeStructure</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Int</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Uint</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Ref</span><span class="p">(</span><span class="n">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">),</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under this newer design, we could simply include the common fields in the
enum definition:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Common fields:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">id</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">flags</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Variants:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Int</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Uint</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Ref</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">referent_ty</span>: <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Naturally, when I create a <code>TypeData</code> I should supply all the fields,
including the inherited ones (though in a later section I&rsquo;ll present
ways to extract the initialization of common fields into a reusable
fn):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="n">ref_ty</span><span class="w"> </span><span class="o">=</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">TypeData</span>::<span class="n">Ref</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">id</span>: <span class="nc">id</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">flags</span>: <span class="nc">flags</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">referent_ty</span>: <span class="nc">some_ty</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">};</span><span class="w">
</span></span></span></code></pre></div><p>And, of course, given a reference <code>&amp;TypeData&lt;'tcx&gt;</code>, we can access these common
fields:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">print_id</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">(</span><span class="n">t</span>: <span class="kp">&amp;</span><span class="nc">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="fm">println!</span><span class="p">(</span><span class="s">&#34;The id of `</span><span class="si">{:?}</span><span class="s">` is `</span><span class="si">{:?}</span><span class="s">`&#34;</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="p">,</span><span class="w"> </span><span class="n">t</span><span class="p">.</span><span class="n">id</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Convenient!</p>
<h3 id="unsized-enums">Unsized enums</h3>
<p>As today, the size of an enum type, by default, is equal to the
largest of its variants. However, as I&rsquo;ve outlined in the last two
posts, it is often useful to have each value be sized to a particular
variant. In the previous posts I identified some criteria for when
this is the case:</p>
<p><em>One interesting question is whether we can concisely state
conditions in which one would prefer to have “precise variant sizes”
(class-like) vs “largest variant” (enum). I think the “precise
sizes” approach is better when the following apply:</em></p>
<ul>
<li><em>A recursive type (like a tree), which tends to force boxing
anyhow. Examples: the AST or types in the compiler, DOM in servo, a
GUI.</em></li>
<li><em>Instances never change what variant they are.</em></li>
<li><em>Potentially wide variance in the sizes of the variants.</em></li>
</ul>
<p>Therefore, it is possible to declare the root enum in a type hierarchy
as either sized (the default) or <em>unsized</em>; this choice is inherited
by all enums in the hierarchy. If the hierarchy is declared as
unsized, it means that <strong>each struct type will be sized just as big as
it needs to be</strong>.  This means in turn that the <strong>enum types in the
hierarchy are unsized types</strong>, since the space required will vary
depending on what variant an instance happens to be at runtime.</p>
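<p>The size trade-off can be observed in today&rsquo;s Rust with <code>std::mem::size_of</code>. The sketch below uses made-up types (<code>Big</code>, <code>Inline</code>, <code>Boxed</code>) and a hypothetical helper <code>sizes</code>; it assumes nothing about the exact layout the compiler picks beyond the fact that a sized enum must have room for its largest variant:</p>
<pre tabindex="0"><code class="language-rust" data-lang="rust">use std::mem::size_of;

// A large payload that only an uncommon variant carries.
struct Big {
    words: [u64; 8],
}

// Every value of this enum reserves room for `Big`, even when it is `Small`.
enum Inline {
    Small(u8),
    Large(Big),
}

// Moving the large payload behind a pointer shrinks the enum,
// at the cost of an allocation and an extra indirection.
enum Boxed {
    Small(u8),
    Large(Box&lt;Big&gt;),
}

// Returns the sizes in bytes of `Big`, `Inline`, and `Boxed`, in that order.
fn sizes() -&gt; (usize, usize, usize) {
    (size_of::&lt;Big&gt;(), size_of::&lt;Inline&gt;(), size_of::&lt;Boxed&gt;())
}
</code></pre>
<p>On a typical 64-bit target one would expect <code>Inline</code> to occupy a little more than the 64 bytes of <code>Big</code>, while <code>Boxed</code> needs only a pointer plus a tag; the exact numbers are up to the compiler.</p>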
<p>To continue with our example of types in rustc, we currently go
through some contortions so as to introduce indirection for uncommon
cases, which keeps the size of the enum under control:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// The data for a fn type is stored in a different struct
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// which is cached in a special arena. This is helpful
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// because (a) the size of this variant is only a single word
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// and (b) if we have a type that we know is a fn pointer,
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// we can pass the `FnPointerData` struct around instead of the
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// `TypeData`.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">FnPointer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">data</span>: <span class="kp">&amp;</span><span class="na">&#39;tcx</span> <span class="nc">FnPointerData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">},</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">FnPointerData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">unsafety</span>: <span class="nc">Unsafety</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">abi</span>: <span class="nc">Abi</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">signature</span>: <span class="nc">Signature</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As discussed in a comment in the code, the current scheme also serves
as a poor man&rsquo;s refinement type: if at some point in the code we know
we have a fn pointer, we can write a function that takes a
<code>FnPointerData</code> argument to express that:</p>
<pre tabindex="0"><code>fn process_ty&lt;&#39;tcx&gt;(ty: Ty&lt;&#39;tcx&gt;) {
    match ty {
        &amp;TypeData::FnPointer { data, .. } =&gt; {
            process_fn_ty(ty, data)
        }
        ...
    }
}

// This function expects that `ty` is a fn pointer type. The `FnPointerData`
// contains the fn pointer information for `ty`.
fn process_fn_ty&lt;&#39;tcx&gt;(ty: Ty&lt;&#39;tcx&gt;, data: &amp;FnPointerData&lt;&#39;tcx&gt;) {
}
</code></pre><p>This pattern works OK in practice, but it is not perfect. For one
thing, it&rsquo;s tedious to construct, and it&rsquo;s also a little
inefficient. It introduces unnecessary indirection and a second memory
arena. Moreover, the refinement type scheme isn&rsquo;t great, because you
often have to pass both the <code>ty</code> (for the common fields) and the
internal <code>data</code>.</p>
<p>Using a type hierarchy, we can do much better. We simply remove the
<code>FnPointerData</code> struct and inline its fields directly into <code>TypeData</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kr">unsized</span><span class="w"> </span><span class="k">enum</span> <span class="nc">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// No indirection anymore. What&#39;s more, the type `FnPointer`
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// serves as a refinement type automatically.
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">FnPointer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">unsafety</span>: <span class="nc">Unsafety</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">abi</span>: <span class="nc">Abi</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">signature</span>: <span class="nc">Signature</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we can write functions that process specific categories of types
very naturally:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process_ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ty</span>: <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">match</span><span class="w"> </span><span class="n">ty</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">fn_ty</span><span class="w"> </span><span class="o">@</span><span class="w"> </span><span class="o">&amp;</span><span class="n">TypeData</span>::<span class="n">FnPointer</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">process_fn_ty</span><span class="p">(</span><span class="n">fn_ty</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c1">// Don&#39;t even need a comment: it&#39;s obvious that `ty` should be a fn type
</span></span></span><span class="line"><span class="cl"><span class="c1">// (and enforced by the type system).
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">process_fn_ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">(</span><span class="n">ty</span>: <span class="kp">&amp;</span><span class="nc">TypeData</span>::<span class="n">FnPointer</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="matching-as-downcasting">Matching as downcasting</h3>
<p>As the previous example showed, one can continue to use match to select
the variant from an enum (sized or not). Matching also gives us an
elegant downcasting mechanism. Instead of writing <code>(Type) value</code>, as
in Java, or <code>dynamic_cast&lt;Type&gt;(value)</code>, one writes <code>match value</code> and
handles the resulting cases. Just as with enums today, <code>if let</code> can be
used if you just want to handle a single case.</p>
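<p>As a rough sketch of what this looks like in today&rsquo;s Rust (ordinary enums, not the proposed variant types), consider the following. The <code>Type</code> enum and both helper functions are made up for illustration.</p>

```rust
// Hypothetical enum standing in for a small type hierarchy.
enum Type {
    Int,
    Bool,
    FnPointer { arity: usize },
}

// `match` plays the role of a chain of checked downcasts,
// except that every case has to be handled.
fn describe(ty: &Type) -> String {
    match ty {
        Type::Int => String::from("int"),
        Type::Bool => String::from("bool"),
        Type::FnPointer { arity } => format!("fn with {} argument(s)", arity),
    }
}

// `if let` is the single-case "downcast": we only care about one variant.
fn is_unary_fn(ty: &Type) -> bool {
    if let Type::FnPointer { arity } = ty {
        *arity == 1
    } else {
        false
    }
}
```

<p>Calling <code>describe(&amp;Type::FnPointer { arity: 2 })</code> goes down the <code>FnPointer</code> arm, with <code>arity</code> already bound to the variant&rsquo;s field, so there is no separate cast step that could fail.</p>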
<h3 id="crate-locality">Crate locality</h3>
<p>An important part of the design is that the entire type hierarchy must
be declared <strong>within a single crate</strong>. This is of course trivially
true today: all variants of an enum are declared in one item, and
structs correspond to singleton hierarchies.</p>
<p>Limiting the hierarchy to a single crate has a lot of advantages.
Without it, you simply can&rsquo;t support today&rsquo;s &ldquo;sized&rdquo; enums, for one
thing. It allows us to continue doing exhaustive checks for matches
and to generate more efficient code. It is interesting to compare to
<code>dynamic_cast</code>, the C++ equivalent to a match:</p>
<ul>
<li><code>dynamic_cast</code> is often viewed as a kind of code smell, versus a
virtual method. I&rsquo;m inclined to agree, as <code>dynamic_cast</code> only checks
for a particular variant, rather than specifying handling for the
full range of variants; this makes it fragile in the face of edits
to the code. In contrast, the exhaustive nature of a Rust <code>match</code>
ensures that you handle every case (of course, you must still be
judicious in your use of <code>_</code> patterns, which, while convenient, can
be a refactoring hazard).</li>
<li><code>dynamic_cast</code> is somewhat inefficient, since it must handle the
fully general case of classes that spread across compilation units;
in fact, it is very uncommon to have a class hierarchy that is truly
extensible &ndash; and in such cases, using <code>dynamic_cast</code> is
particularly hard to justify. This leads to projects like LLVM
<a href="http://llvm.org/docs/CodingStandards.html#do-not-use-rtti-or-exceptions">reimplementing RTTI (the C++ name for matching) from scratch</a>.</li>
</ul>
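<p>To make the fragility point concrete, here is a small sketch in today&rsquo;s Rust. The <code>Shape</code> enum and both functions are hypothetical. The first function is exhaustive, so adding a variant is a compile error until it is handled; the second leans on <code>_</code> and behaves more like a lone <code>dynamic_cast</code> check.</p>

```rust
// Hypothetical closed family of cases.
enum Shape {
    Circle,
    Square,
    Triangle,
}

// Exhaustive: if a `Pentagon` variant were added to `Shape`,
// this function would stop compiling until a new arm is written.
fn corners(shape: &Shape) -> u32 {
    match shape {
        Shape::Circle => 0,
        Shape::Square => 4,
        Shape::Triangle => 3,
    }
}

// Wildcard: convenient, but a newly added variant silently lands in `_`,
// which is the refactoring hazard mentioned above.
fn is_round(shape: &Shape) -> bool {
    match shape {
        Shape::Circle => true,
        _ => false,
    }
}
```

<p>Both compile today; the difference only shows up when the enum changes, which is exactly when you want the compiler&rsquo;s help.</p>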
<p>Another advantage of confining the hierarchy to a single crate is that
it allows us to continue doing variance inference across the entire
hierarchy at once. This means, for example, that in the &ldquo;out of
line&rdquo; version of <code>Option</code> (below) we can infer a variance for the
parameter <code>T</code> declared on <code>Option</code>, in the same way we do today
(otherwise, the declaration of <code>enum Option&lt;T&gt;</code> would require some
form of phantom data, and that would be <em>binding</em> on the types
declared in other crates).</p>
<p>I also find that confining the hierarchy to a single crate helps to
clarify the role of type hierarchies versus traits and, in turn, avoid
some of the pitfalls so beloved by OO haters. Basically, it means that
if you want to define an open-ended extension point, you must use a
trait, which also offers the most flexibility; a type hierarchy, like
an enum today, can only be used to offer a choice between a fixed
number of crate-local types. An analogous situation in Java would be
deciding between an abstract base class and an interface; under this
design, you would have to use an interface (note that the problem of
code reuse can be tackled separately, via specialization).</p>
<p><em>Finally,</em> confining extension to a trait is relevant to the
construction of vtables and handling of specialization, but we&rsquo;ll dive
into that another time.</p>
<p>Even though I think that limiting type hierarchies to a single crate
is very helpful, it&rsquo;s worth pointing out that it IS possible to lift
this restriction if we so choose. This can&rsquo;t be done in all cases,
though, due to some of the inherent limitations involved.</p>
<h3 id="enum-types-as-bounds">Enum types as bounds</h3>
<p>In the previous section, I mentioned that enums and traits (today
and in this proposed design) both form a kind of interface. Whereas
traits define a list of methods, enums indicate something about the
memory layout of the value: for example, they can tell you about a
common set of fields (though not the complete set), and they clearly
narrow down the universe of types to be just the relevant variants.
Therefore, it makes sense to be able to use an enum type as a bound on
a type parameter. Let&rsquo;s dive into an example to see what I mean and
why you might want this.</p>
<p>Imagine we&rsquo;re using a type hierarchy to represent the
<a href="https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Introduction">HTML DOM</a>.  It might look something like this (browser people:
forgive my radical oversimplification):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kr">unsized</span><span class="w"> </span><span class="k">enum</span> <span class="nc">Node</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="c1">// where this node is positioned after layout
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">position</span>: <span class="nc">Rectangle</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">Element</span>: <span class="nc">Node</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">TextElement</span>: <span class="nc">Element</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">ParagraphElement</span>: <span class="nc">Element</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span></code></pre></div><p>Now imagine that I have a helper function that selects nodes based on whether
they intersect a particular box on the screen:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">intersects</span><span class="p">(</span><span class="n">rect</span>: <span class="nc">Rectangle</span><span class="p">,</span><span class="w"> </span><span class="n">elements</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">Node</span><span class="o">&gt;</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">Node</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">result</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="fm">vec!</span><span class="p">[];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">for</span><span class="w"> </span><span class="n">element</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="n">elements</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">if</span><span class="w"> </span><span class="n">element</span><span class="p">.</span><span class="n">position</span><span class="p">.</span><span class="n">intersects</span><span class="p">(</span><span class="n">rect</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">result</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">element</span><span class="p">.</span><span class="n">clone</span><span class="p">());</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">result</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>OK, great! But now imagine that I have a slice of text elements
(<code>&amp;[Rc&lt;TextElement&gt;]</code>), and I would like to use this function. I will
get back a <code>Vec&lt;Rc&lt;Node&gt;&gt;</code> &ndash; I&rsquo;ve lost track of the fact that my
input contained only text elements.</p>
<p>Using generics and bounds, I can rewrite the function:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">intersects</span><span class="o">&lt;</span><span class="n">T</span>: <span class="nc">Node</span><span class="o">&gt;</span><span class="p">(</span><span class="n">rect</span>: <span class="nc">Rectangle</span><span class="p">,</span><span class="w"> </span><span class="n">elements</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">])</span><span class="w"> </span>-&gt; <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// identical to before
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Nothing in the body had to change, only the signature.</p>
<p>Permitting enum types to appear as bounds also means that they can be
referenced by traits as supertraits. This allows you to define
interfaces that cut across the primary inheritance hierarchy. So, for
example, in the DOM both the <code>HTMLTextAreaElement</code> and the
<code>HTMLInputElement</code> can carry a block of text, which implies that they
have a certain set of text-related methods and properties in
common. And of course they are both elements. This can be modeled
using a trait like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">trait</span><span class="w"> </span><span class="n">TextAPIs</span>: <span class="nc">HTMLElement</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">maxLength</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="kt">usize</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This means that if you have an <code>&amp;TextAPIs</code> object, you can access the
fields from <code>HTMLElement</code> with no overhead, because they are stored in
the same place for both cases. But if you want to access other things,
such as <code>maxLength</code>, that implies virtual dispatch, since the address
is dynamically computed and will vary.</p>
<h4 id="enums-vs-traits">Enums vs traits</h4>
<p>The notion of enums as bounds raises questions about potential overlap
in purpose between enums and traits. I would argue that this overlap
already exists: both enums and traits today are ways to let you write
a single function that operates over values of more than one type.
However, in practice, it&rsquo;s rarely hard to know which one you want to
use. I think this is because they come at the problem from two
different angles:</p>
<ul>
<li>Traits start with the assumption that you want to work with any
type, and let you narrow that. Basically, you get code that is <em>as
general as possible</em>.</li>
<li>In contrast, enums assume you want to work with a fixed set of
types. This means you can write code that is <em>as specific as
possible</em>. Enums also work best when the types you are choosing
between are related into a kind of family, like &ldquo;all the different
variants of types in the Rust language&rdquo; or &ldquo;some and none&rdquo;.</li>
</ul>
<p>If we extend enums in the way described here, then they will become
more capable and convenient, and so you might find that they overlap a
bit more with plausible use cases for traits. However, I think that in
practice there are still clear guidelines for which to choose when:</p>
<ul>
<li>If you have a fixed set of related types, use an enum. Having an
enumerated set of cases is advantageous in a lot of ways: we can
generate faster code, you can write matches, etc.</li>
<li>If you want open-ended extension, use a trait (and/or trait object).
This will ensure that your code makes as few assumptions as possible,
which in turn means that you can handle as many clients as possible.</li>
</ul>
<p>Because enums are tied to a fixed set of cases, they allow us to
generate tighter code, particularly when you are not monomorphizing to
a particular variant.  That is, if you have a value of type
<code>&amp;TypeData</code>, where <code>TypeData</code> is the enum we mentioned before, you can
access common fields at no overhead, even though we don&rsquo;t know what
variant it is. Moreover, the pointer is thin and thus takes only a
single word.</p>
<p>In contrast, if you had made <code>TypeData</code> a trait and hence <code>&amp;TypeData</code>
was a trait object, accessing common fields would require some
overhead.  (This is true even if we were to add &ldquo;virtual fields&rdquo; to
traits, as <a href="https://github.com/rust-lang/rfcs/pull/250">eddyb and kimundi proposed in RFC #250</a>.) Also,
because traits are &ldquo;added on&rdquo; to other values, your pointer would be a
fat pointer, and hence take two words.</p>
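<p>The thin-versus-fat distinction can already be observed in today&rsquo;s Rust. The sketch below is only an approximation of the proposal: the two-variant <code>TypeData</code> and the empty <code>TypeLike</code> trait are stand-ins I made up, and the reported sizes are what current compilers produce on typical targets, not something this post specifies.</p>

```rust
use std::mem::size_of_val;

// Hypothetical stand-ins: a (sized) enum and a trait implemented for it.
enum TypeData {
    Int,
    Bool,
}

trait TypeLike {}
impl TypeLike for TypeData {}

// Returns the size, in machine words, of a reference to the enum
// and of a trait-object reference to the same value.
fn pointer_words() -> (usize, usize) {
    let word = size_of_val(&0usize);
    let value = TypeData::Int;

    let thin: &TypeData = &value; // just an address
    let fat: &dyn TypeLike = &value; // address plus vtable pointer

    (size_of_val(&thin) / word, size_of_val(&fat) / word)
}
```

<p>On the targets I am aware of this yields one word for the enum reference and two for the trait object. Today that only works because the enum is sized; the point of the proposal is to keep the pointer thin for unsized enums as well.</p>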
<p>(As an aside, I still like the idea of adding virtual fields to
traits.  The idea is that these fields could be &ldquo;remapped&rdquo; in an
implementation to varying offsets. Accessing such a field implies
dynamically loading the offset, which is slower than a regular field
but faster than a virtual call. If we additionally added the
restriction that those fields must access content that is orthogonal
from one another, we might be able to make the borrow checker more
permissive in the field case as well. But that is kind of an
orthogonal extension to what I&rsquo;m talking about here &ndash; and one that
fits well with my framing of &ldquo;traits are for open-ended extension
across heterogeneous types, enums are for a single cohesive type
hierarchy&rdquo;.)</p>
<p><a id="constructors"></a></p>
<h3 id="associated-structs-constructors">Associated structs (constructors)</h3>
<p>One of the distinctive features of OO-style classes is that they
feature constructors. Constructors allow you to layer initialization
code, so that you can build up a function that initializes (say) the
fields for <code>Node</code>, and that function is used as a building block by
one that initializes the <code>Element</code> fields, and so on down the
hierarchy. This is good for code reuse, but constructors have an
Achilles heel: while we are initializing the <code>Node</code> fields, what value
do the <code>Element</code> fields have? In C++, the answer is &ldquo;who knows&rdquo; &ndash; the
fields are simply uninitialized, and accessing them is undefined
behavior. In Java, they are null. But Rust has no such &ldquo;convenient&rdquo;
answer. And there is an even weirder question: what happens when you
downcast or match on a value while it is being constructed?</p>
<p>Rust has always sidestepped these questions by using the functional
language approach, where you construct an aggregate value (like a
struct) by supplying all its data at once. This works well for small
structs, but it doesn&rsquo;t scale up to supporting refinement types and
common fields. Consider the example of types in the compiler:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="c1">// Common fields:
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">id</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">flags</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">counter</span>: <span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="c1">// ok, I&#39;m making this field up :P
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">FnPointer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">unsafety</span>: <span class="nc">Unsafety</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">abi</span>: <span class="nc">Abi</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">signature</span>: <span class="nc">Signature</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.,</span><span class="w"> </span><span class="c1">// other variants here
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I would like to be able to write some initialization routines that
compute the <code>id</code>, <code>flags</code>, and whatever else and then reuse those across
different variants. But it&rsquo;s hard to know what such a function should
return:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">init_type_data</span><span class="p">(</span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">XXX</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="no">XXX</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">id</span>: <span class="nc">cx</span><span class="p">.</span><span class="n">next_id</span><span class="p">(),</span><span class="w"> </span><span class="n">flags</span>: <span class="nc">DEFAULT_FLAGS</span><span class="p">,</span><span class="w"> </span><span class="n">counter</span>: <span class="mi">0</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>What is this type <code>XXX</code>? What I want is basically a struct with just
the common fields (though of course I don&rsquo;t want to have to define
such a struct myself; that would be too repetitive):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">XXX</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">id</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">flags</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">counter</span>: <span class="kt">usize</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>And of course I also want to be able to use an instance of this struct
in an initializer as part of a <code>..</code> expression, like so:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">make_fn_type</span><span class="p">(</span><span class="n">cx</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="n">Context</span><span class="p">,</span><span class="w"> </span><span class="n">unsafety</span>: <span class="nc">Unsafety</span><span class="p">,</span><span class="w"> </span><span class="n">abi</span>: <span class="nc">Abi</span><span class="p">,</span><span class="w"> </span><span class="n">signature</span>: <span class="nc">Signature</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">TypeData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">TypeData</span>::<span class="n">FnPointer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">unsafety</span>: <span class="nc">unsafety</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">abi</span>: <span class="nc">abi</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">signature</span>: <span class="nc">signature</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="o">..</span><span class="n">init_type_data</span><span class="p">(</span><span class="n">cx</span><span class="p">)</span><span class="w">   </span><span class="c1">// &lt;-- initializes the common fields 
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>If we had a type like this, it would strike a reasonably nice balance
between the functional and OO styles. We can layer constructors and
build constructor abstractions, but we also don&rsquo;t have a value of type
<code>TypeData</code> until all the fields are initialized. In the interim, we
just have a value of this type <code>XXX</code>, which only has the shared fields
that are common to all variants.</p>
<p>All we need now is a reasonable name for this type <code>XXX</code>. The proposal
is that every enum has an associated struct type called <code>struct</code> (i.e.,
the keyword). So instead of <code>XXX</code>, I could write <code>TypeData::struct</code>,
and it means &ldquo;a struct with all the fields common to any <code>TypeData</code>
variant&rdquo;. Note that a <code>TypeData::struct</code> value is <em>not</em> a <code>TypeData</code>
variant; it just has the same data as a variant.</p>
<h3 id="subtyping-and-coercion">Subtyping and coercion</h3>
<p>There is one final wrinkle worth covering in the proposal. And
unfortunately, it&rsquo;s a tricky one. I&rsquo;ve been sort of tacitly assuming
that an enum and its variants have some sort of typing relationship,
but I haven&rsquo;t said explicitly what it is. This part is going to take
some experimentation to find the right mix. But let me share some
intermediate thoughts.</p>
<p><strong>Unsized enums.</strong> For unsized enums, we are always dealing with an
indirection. So, for example, we have to be able to smoothly convert from a
reference to a specific struct like <code>&amp;TextElement</code> to a reference to a
base enum like <code>&amp;Node</code>.  We&rsquo;ve traditionally viewed this as a special
case of <a href="https://github.com/rust-lang/rfcs/blob/master/text/0982-dst-coercion.md">&ldquo;DST coercions&rdquo;</a>. Basically, coercing to <code>&amp;Node</code> is
more-or-less exactly like coercion to a trait object, except that we
don&rsquo;t in fact need to attach a vtable &ndash; that is, the &ldquo;extra data&rdquo; on
the <code>&amp;Node</code> fat pointer is just <code>()</code>. But in fact we don&rsquo;t necessarily
HAVE to view upcasting like this as a coercion &ndash; after all, there is
no runtime change happening here.</p>
<p>This gets at an interesting point. Subtyping between OO classes is
normally actually subtyping between <em>references</em>. That is, in Java we
say that <code>String &lt;: Object</code>, but that is because everything in Java is
in fact a reference. In C++, not everything is a reference, so if you
aren&rsquo;t careful this in fact gives rise to creepy hazards like
<a href="http://stackoverflow.com/questions/274626/what-is-object-slicing">object slicing</a>. The problem here is that in C++ the superclass
type is really just the superclass fields; so if you do <code>superclass = subclass</code>, then you are just going to drop the extra fields from the
subclass on the floor (usually). This probably isn&rsquo;t what you meant to
do.</p>
<p>Because of unsized types, though, Rust can safely say that a struct
type is a subtype of its containing enum(s). So, in the DOM example,
we could say that <code>TextElement &lt;: Node</code>. We don&rsquo;t have to fear slicing
because the type <code>TextElement</code> is unsized, and hence the user could
only ever make use of it by ref. In other words, object slicing arises
in C++ precisely because it doesn&rsquo;t have a notion of unsized types.</p>
<p><strong>Sized enums.</strong> To be honest, unsized enums are not the scary case,
because they are basically a new feature to the language. The harder
and more interesting case is sized enums. The problem here is that we
are introducing new types into existing code, and we want to be sure
not to break things. So consider this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="kd">let</span><span class="w"> </span><span class="k">mut</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">None</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span><span class="w">
</span></span></span></code></pre></div><p>In today&rsquo;s world, the first assignment gives <code>x</code> a type of
<code>Option&lt;_&gt;</code>, where the <code>_</code> represents something to be inferred
later. This is because the expression <code>None</code> has type <code>Option&lt;_&gt;</code>. But
under this RFC, the type of <code>None</code> is <code>None&lt;_&gt;</code> &ndash; and hence we have
to be smart enough to infer that the type of <code>x</code> should not be
<code>None&lt;_&gt;</code> but rather <code>Option&lt;_&gt;</code> (because it is later assigned a
<code>Some&lt;_&gt;</code> value).</p>
<p>This kind of inference, where the type of a variable changes based on
the full set of values assigned to it, is traditionally what we have
called &ldquo;subtyping&rdquo; in the Rust compiler. (In contrast, coercion is an
instantaneous decision that the compiler makes based on the types it
knows thus far.) This is sort of technical minutiae in how the compiler
works, but of course it impacts the places in Rust that you need type
annotations.</p>
<p>Now, to some extent, we already have this problem. There are known
cases today where coercions don&rsquo;t work as well as we would like. The
proposed <code>box</code> syntax, for example, suffers from this a bit, as do
other patterns.  We&rsquo;re investigating ways to make the compiler smarter,
and it may be that we can combine all of this into a more intelligent
inference infrastructure.</p>
<p><strong>Variance and mutable references.</strong> It&rsquo;s worth pointing out that
we&rsquo;ll always need some sort of coercion support, because subtyping
alone doesn&rsquo;t allow one to convert between mutable references. In
other words, <code>&amp;mut TextElement</code> is not a subtype of <code>&amp;mut Node</code>, but
we do need to be able to coerce from the former to the latter. This
is safe because the type <code>Node</code> is unsized (basically, it is safe for
the same reason that <code>&amp;mut [i32; 3]</code> -&gt; <code>&amp;mut [i32]</code> is safe). The
fact that <code>&amp;mut None&lt;i32&gt;</code> -&gt; <code>&amp;mut Option&lt;i32&gt;</code> is <em>not</em> safe is an
example of why sized enums can in fact be more challenging here. (If it&rsquo;s
not clear why that should be unsafe, the <a href="https://doc.rust-lang.org/nightly/nomicon/subtyping.html#variance">Nomicon&rsquo;s section on variance</a>
may help clear things up.)</p>
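<p>To make the safe case concrete, here is a minimal sketch in today&rsquo;s Rust of the <code>&amp;mut [i32; 3]</code> -&gt; <code>&amp;mut [i32]</code> coercion. The function name <code>zero_first</code> is made up for illustration, and the comment about <code>None&lt;i32&gt;</code> describes the hypothetical type from this proposal, which does not exist in Rust.</p>

```rust
// The callee only sees an unsized slice. It can modify elements, but it
// cannot overwrite the whole value with something of a different shape.
fn zero_first(slice: &mut [i32]) {
    if let Some(first) = slice.first_mut() {
        *first = 0;
    }
}

fn main() {
    let mut array: [i32; 3] = [1, 2, 3];
    zero_first(&mut array); // `&mut [i32; 3]` is coerced to `&mut [i32]` here
    assert_eq!(array, [0, 2, 3]);

    // By contrast, if a hypothetical `&mut None<i32>` could be coerced to
    // `&mut Option<i32>`, the callee could store `Some(3)` into a place whose
    // owner still believes it holds a `None`.
}
```

<p>The array still has exactly three elements after the call, which is why the owner&rsquo;s view of its type remains valid.</p>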
<h4 id="an-alternative-variation">An alternative variation</h4>
<p>If, in fact, we can&rsquo;t solve the subtyping inference problems, there is
another option. Rather than unifying enums and structs, we could add
struct inheritance and leave enums as they are. Things would work
more-or-less the same as in this proposal, but base structs would play
the role of unsized enums, and sized enums would stay how they
are. This can be justified on the basis that enums are used in
different stylistic ways (like <code>Option</code> etc) where e.g. refinement
types and common fields are less important; however, I do find the
setup described in this blog post appealing.</p>
<h4 id="type-parameters-gadts-etc">Type parameters, GADTs, etc</h4>
<p>One other detail I want to note. At least to start, I anticipate a
requirement that every type in the hierarchy has the same set of type
parameters (just like an <code>enum</code> today). If you use the &ldquo;inline&rdquo;
syntax, this is implicit, but you&rsquo;ll have to write it explicitly with
the out of line syntax (we could permit reordering, but there should
be a 1-to-1 correspondence). This simplifies the type-checker and
ensures that this is more of an incremental step in complexity when
compared to today&rsquo;s enums, versus the giant leap we could have
otherwise &ndash; loosening this rule also interacts with monomorphization
and specialization, but I&rsquo;ll dig into that more another time.</p>
<h3 id="conclusion">Conclusion</h3>
<p>This post describes a proposal for unifying structs and enums to make
each of them more powerful. It builds on prior work but adds a few new
twists that close important gaps:</p>
<ul>
<li>Enum bounds for type parameters, allowing for smoother interaction with generic code.</li>
<li>The &ldquo;associated struct&rdquo; for enums, allowing for constructors.</li>
</ul>
<p>One of the big goals of this design is to find something that fits
well within Rust&rsquo;s orthogonal design. Today, data types like enums and
structs are focused on describing data layout and letting you declare
natural relationships that mesh well with the semantics of your
program. Traits, in contrast, are used to write generic code that
works across a heterogeneous range of types. This proposal retains
that character, while alleviating some of the pain points in Rust
today:</p>
<ul>
<li>Support for refinement types and nested enum hierarchies;</li>
<li>Support for common fields shared across variants;</li>
<li>Unsized enums that allow for more efficient memory layout.</li>
</ul>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Virtual Structs Part 2: Classes strike back</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/05/29/classes-strike-back/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/05/29/classes-strike-back/</id><published>2015-05-29T00:00:00+00:00</published><updated>2015-05-29T11:52:26-04:00</updated><content type="html"><![CDATA[<p>This is the second post summarizing my current thoughts about ideas
related to &ldquo;virtual structs&rdquo;. In the <a href="http://smallcultfollowing.com/babysteps/blog/2015/05/05/where-rusts-enum-shines/">last post</a>, I described how,
when coding C++, I find myself missing Rust&rsquo;s enum type. In this post,
I want to turn it around. I&rsquo;m going to describe why the class model
can be great, and something that&rsquo;s actually kind of missing from
Rust. In the next post, I&rsquo;ll talk about how I think we can get the
best of both worlds for Rust. As in the first post, I&rsquo;m focusing here
primarily on the data layout side of the equation; I&rsquo;ll discuss
virtual dispatch afterwards.</p>
<!-- more -->
<h3 id="very-brief-recap">(Very) brief recap</h3>
<p>In the previous post, I described how one can set up a class hierarchy
in C++ (or Java, Scala, etc) with a base class and one subclass for
every variant:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Error</span> <span class="p">{</span> <span class="p">...</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">FileNotFound</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Error</span> <span class="p">{</span> <span class="p">...</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">UnexpectedChar</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Error</span> <span class="p">{</span> <span class="p">...</span> <span class="p">};</span>
</span></span></code></pre></div><p>This winds up being very similar to a Rust enum:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">ErrorCode</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">FileNotFound</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">UnexpectedChar</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, there are some important differences. Chief among them is
that the Rust enum has a size equal to the size of its largest
variant, which means that Rust enums can be passed &ldquo;by value&rdquo; rather
than using a box. This winds up being absolutely crucial to Rust: it&rsquo;s
what allows us to use <code>Option&lt;&amp;T&gt;</code>, for example, as a zero-cost
nullable pointer. It&rsquo;s what allows us to make arrays of enums (rather
than arrays of boxed enums). It&rsquo;s what allows us to overwrite one enum
value with another, e.g. to change from <code>None</code> to <code>Some(_)</code>. And so
forth.</p>
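<p>These three properties can be checked directly. The following is a small sketch; the derives on <code>ErrorCode</code> are added only so the example can copy and compare values.</p>

```rust
use std::mem::size_of;

#[derive(Clone, Copy, Debug, PartialEq)]
enum ErrorCode {
    FileNotFound,
    UnexpectedChar,
}

fn main() {
    // `Option<&T>` represents `None` with the null bit pattern, which a
    // reference can never hold, so it is exactly pointer-sized.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());

    // Enums are stored inline, so an array of them needs no boxing.
    let mut codes = [ErrorCode::FileNotFound; 4];

    // A value can be overwritten in place with a different variant.
    codes[0] = ErrorCode::UnexpectedChar;
    assert_eq!(codes[0], ErrorCode::UnexpectedChar);

    let mut slot: Option<u32> = None;
    assert!(slot.is_none());
    slot = Some(3);
    assert_eq!(slot, Some(3));
}
```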
<h3 id="problem-1-memory-bloat">Problem #1: Memory bloat</h3>
<p>There are a lot of use cases, however, where having a size equal to
the largest variant is actually a handicap. Consider, for example, the
way the rustc compiler represents Rust types (this is actually a
cleaned up and simplified version of the <a href="https://github.com/rust-lang/rust/blob/9854143cba679834bc4ef932858cd5303f015a0e/src/librustc/middle/ty.rs#L1359-L1397">real thing</a>).</p>
<p>The type <code>Ty</code> represents a Rust type:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// &#39;tcx is the lifetime of the arena in which we allocate type information
</span></span></span><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">TypeStructure</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>As you can see, it is in fact a reference to a <code>TypeStructure</code> (this
is called <code>sty</code> in the Rust compiler, which isn&rsquo;t completely up to
date with modern Rust conventions). The lifetime <code>'tcx</code> here
represents the lifetime of the arena in which we allocate all of our
type information. So when you see a type like <code>&amp;'tcx</code>, it represents
interned information allocated in an arena. (As an aside, we
<a href="https://github.com/rust-lang/rust/pull/1759">added the arena</a> back before we even had lifetimes at all, and
used to use unsafe pointers here. The fact that we use proper
lifetimes here is thanks to the awesome <a href="https://github.com/eddyb/">eddyb</a> and his super duper
<a href="https://github.com/rust-lang/rust/pull/18483">safe-ty</a> branch. What a guy.)</p>
<p>So, here is the first observation: in practice, we are already boxing
all the instances of <code>TypeStructure</code> (you may recall that the fact
that classes forced us to box was a downside before). We have to,
because types are recursively structured. In this case, the &lsquo;box&rsquo; is
an arena allocation, but still the point remains that we always pass
types by reference. And, moreover, once we create a <code>Ty</code>, it is
immutable &ndash; we never switch a type from one variant to another.</p>
<p>The actual <code>TypeStructure</code> enum is defined something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">TypeStructure</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Bool</span><span class="p">,</span><span class="w">                                      </span><span class="c1">// bool
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Reference</span><span class="p">(</span><span class="n">Region</span><span class="p">,</span><span class="w"> </span><span class="n">Mutability</span><span class="p">,</span><span class="w"> </span><span class="n">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">),</span><span class="w">   </span><span class="c1">// &amp;&#39;x T, &amp;&#39;x mut T
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Struct</span><span class="p">(</span><span class="n">DefId</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">Substs</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">),</span><span class="w">         </span><span class="c1">// Foo&lt;..&gt;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">Enum</span><span class="p">(</span><span class="n">DefId</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">Substs</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">),</span><span class="w">           </span><span class="c1">// Foo&lt;..&gt;
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">BareFn</span><span class="p">(</span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">BareFnData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">),</span><span class="w">            </span><span class="c1">// fn(..)
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>You can see that, in addition to the types themselves, we also intern
a lot of the data in the variants themselves. For example, the
<code>BareFn</code> variant takes a <code>&amp;'tcx BareFnData&lt;'tcx&gt;</code>. The reason we do
this is because otherwise the size of the <code>TypeStructure</code> type
balloons very quickly. This is because some variants, like <code>BareFn</code>,
have a lot of associated data (e.g., the ABI, the types of all the
arguments, etc). In contrast, types like structs or references have
relatively little associated data. Nonetheless, the size of the
<code>TypeStructure</code> type is determined by the largest variant, so it
doesn&rsquo;t matter if all the variants are small but one: the enum is
still large. To fix this, <a href="https://github.com/huonw">Huon</a>
<a href="https://github.com/rust-lang/rust/pull/19549">spent quite a bit of time</a> analyzing the size of each variant
and introducing indirection and interning to bring it down.</p>
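<p>The effect is easy to reproduce in miniature. This sketch uses a made-up <code>FnData</code> payload and a <code>Box</code> in place of rustc&rsquo;s arena references, so it only illustrates the principle, not the actual compiler types.</p>

```rust
use std::mem::size_of;

// Hypothetical stand-in for a variant with a lot of associated data.
#[allow(dead_code)]
struct FnData {
    abi: u32,
    inputs: [u64; 8],
}

// Everything stored inline: every value pays for the largest variant.
#[allow(dead_code)]
enum Inline {
    Bool,
    Fn(FnData),
}

// The large payload lives behind a pointer, so the enum stays small.
#[allow(dead_code)]
enum Indirect {
    Bool,
    Fn(Box<FnData>),
}

fn main() {
    // Even an `Inline::Bool` occupies at least as much space as `FnData`.
    assert!(size_of::<Inline>() >= size_of::<FnData>());
    // Adding indirection shrinks every value of the enum.
    assert!(size_of::<Indirect>() < size_of::<Inline>());
    println!(
        "inline: {} bytes, indirect: {} bytes",
        size_of::<Inline>(),
        size_of::<Indirect>()
    );
}
```

<p>The price of the smaller enum is an extra pointer hop whenever the large variant is inspected.</p>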
<p>Consider what would have happened if we had used classes instead.  In
that case, the type structure might look like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">typedef</span> <span class="n">TypeStructure</span> <span class="o">*</span><span class="n">Ty</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Bool</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Reference</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Struct</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Enum</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">BareFn</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span></code></pre></div><p>In this case, whenever we allocated a <code>Reference</code> from the arena, we
would allocate precisely the amount of memory that a <code>Reference</code>
needs. Similarly, if we allocated a <code>BareFn</code> type, we&rsquo;d use more
memory for that particular instance, but it wouldn&rsquo;t affect the other
kinds of types. Nice.</p>
<h3 id="problem-2-common-fields">Problem #2: Common fields</h3>
<p>The definition for <code>Ty</code> that I gave in the previous section was
actually somewhat simplified compared to what we really do in rustc.
The actual definition looks more like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="c1">// &#39;tcx is the lifetime of the arena in which we allocate type information
</span></span></span><span class="line"><span class="cl"><span class="k">type</span> <span class="nc">Ty</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="na">&#39;tcx</span><span class="w"> </span><span class="n">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">TypeData</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">id</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">flags</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">structure</span>: <span class="nc">TypeStructure</span><span class="o">&lt;</span><span class="na">&#39;tcx</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As you can see, <code>Ty</code> is in fact a reference not to a <code>TypeStructure</code>
directly but to a struct wrapper, <code>TypeData</code>. This wrapper defines a
few fields that are common to all types, such as a unique integer id
and a set of flags. We could put those fields into the variants of
<code>TypeStructure</code>, but it&rsquo;d be repetitive, annoying, and inefficient.</p>
<p>Nonetheless, introducing this wrapper struct feels a bit indirect. If
we are using classes, it would be natural for these fields to live on
the base class:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">typedef</span> <span class="n">TypeStructure</span> <span class="o">*</span><span class="n">Ty</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">TypeStructure</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">id</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">flags</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">...</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Bool</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Reference</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Struct</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Enum</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">BareFn</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span></code></pre></div><p>In fact, we could go further. There are many variants that share
common bits of data. For example, structs and enums are both just a
kind of nominal type (&ldquo;named&rdquo; type). Almost always, in fact, we wish
to treat them the same. So we could refine the hierarchy a bit to
reflect this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Nominal</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">DefId</span> <span class="n">def_id</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="n">Substs</span> <span class="n">substs</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Struct</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Nominal</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Enum</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Nominal</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></div><p>Now code that wants to work uniformly on either a struct or enum could
just take a <code>Nominal*</code>.</p>
<p>Note that while it&rsquo;s relatively easy in Rust to handle the case where
<em>all</em> variants have common fields, it&rsquo;s a lot more awkward to handle a
case like <code>Struct</code> or <code>Enum</code>, where only <em>some</em> of the variants have
common fields.</p>
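<p>Here is a rough sketch of what that awkwardness looks like in today&rsquo;s Rust. The types are simplified stand-ins (no lifetimes or arenas, and <code>substs</code> is just a <code>Vec&lt;u32&gt;</code>), and the <code>as_nominal</code> accessor is a name I made up for illustration.</p>

```rust
// Hypothetical, simplified stand-ins for rustc's definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DefId(u32);

// Fields shared by only *some* variants are factored into their own struct.
#[allow(dead_code)]
#[derive(Debug)]
struct Nominal {
    def_id: DefId,
    substs: Vec<u32>,
}

#[allow(dead_code)]
#[derive(Debug)]
enum TypeStructure {
    Bool,
    Struct(Nominal),
    Enum(Nominal),
}

// Fields shared by *all* variants live on a wrapper struct.
#[allow(dead_code)]
#[derive(Debug)]
struct TypeData {
    id: u32,
    flags: u32,
    structure: TypeStructure,
}

impl TypeData {
    // The awkward part: reaching the partially shared fields requires a
    // hand-written accessor that matches on the relevant variants.
    fn as_nominal(&self) -> Option<&Nominal> {
        match &self.structure {
            TypeStructure::Struct(n) | TypeStructure::Enum(n) => Some(n),
            TypeStructure::Bool => None,
        }
    }
}

fn main() {
    let ty = TypeData {
        id: 1,
        flags: 0,
        structure: TypeStructure::Enum(Nominal { def_id: DefId(22), substs: vec![] }),
    };
    assert_eq!(ty.id, 1); // common to all variants: direct field access
    assert_eq!(ty.as_nominal().map(|n| n.def_id), Some(DefId(22)));
}
```

<p>Note the asymmetry: <code>id</code> is a plain field access, while <code>def_id</code> is only reachable through a match that can fail at runtime.</p>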
<p><a id="initialization"></a></p>
<h3 id="problem-3-initialization-of-common-fields">Problem #3: Initialization of common fields</h3>
<p>Rust differs from purely OO languages in that it does not have special
constructors. An instance of a struct in Rust is constructed by
supplying values for all of its fields. One great thing about this
approach is that &ldquo;partially initialized&rdquo; struct instances are never
exposed. However, the Rust approach has a downside, particularly when
we consider code where you have lots of variants with common fields:
there is no way to write a fn that initializes <em>only</em> the common
fields.</p>
<p>C++ and Java take a different approach to initialization based on
<em>constructors</em>. The idea of a constructor is that you first allocate
the complete structure you are going to create, and then execute a
routine which fills in the fields. This approach to constructors has a
lot of problems &ndash; some of which I&rsquo;ll detail below &ndash; and I would not
advocate for adding it to Rust. However, it does make it convenient to
separately abstract over the initialization of base class fields from
subclass fields:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">typedef</span> <span class="n">TypeStructure</span> <span class="o">*</span><span class="n">Ty</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">TypeStructure</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">id</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">flags</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="n">TypeStructure</span><span class="p">(</span><span class="kt">unsigned</span> <span class="n">id</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="n">flags</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="o">:</span> <span class="n">id</span><span class="p">(</span><span class="n">id</span><span class="p">),</span> <span class="n">flags</span><span class="p">(</span><span class="n">flags</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Bool</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">Bool</span><span class="p">(</span><span class="kt">unsigned</span> <span class="n">id</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="o">:</span> <span class="n">TypeStructure</span><span class="p">(</span><span class="n">id</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1">// bools have no flags
</span></span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></div><p>Here, the constructor for <code>TypeStructure</code> initializes the
<code>TypeStructure</code> fields, and the <code>Bool</code> constructor initializes the
<code>Bool</code> fields. Imagine we were to add a field to <code>TypeStructure</code> that
is always 0, such as some sort of counter. We could do this without
changing any of the subclasses:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">TypeStructure</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">id</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">flags</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kt">unsigned</span> <span class="n">counter</span><span class="p">;</span> <span class="c1">// new
</span></span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="n">TypeStructure</span><span class="p">(</span><span class="kt">unsigned</span> <span class="n">id</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="n">flags</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="o">:</span> <span class="n">id</span><span class="p">(</span><span class="n">id</span><span class="p">),</span> <span class="n">flags</span><span class="p">(</span><span class="n">flags</span><span class="p">),</span> <span class="n">counter</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></div><p>If you have a lot of variants, being able to extract the common
initialization code into a function of some kind is pretty important.</p>
<p>Now, I promised a critique of constructors, so here we go. The biggest
reason we do not have them in Rust is that constructors rely on
exposing a partially initialized <code>this</code> pointer. This raises the
question of what value the fields of that <code>this</code> pointer have before
the constructor finishes: in C++, the answer is just undefined
behavior. Java at least guarantees that everything is zeroed. But
since Rust lacks the idea of a &ldquo;universal null&rdquo; &ndash; which is an
important safety guarantee! &ndash; we don&rsquo;t have such a convenient option.
And there are other weird things to consider: what happens if you call
a virtual function during the base type constructor, for example? (The
answer here again varies by language.)</p>
<p>So, I don&rsquo;t want to add OO-style constructors to Rust, but I do want
some way to pull out the initialization code for common fields into a
subroutine that can be shared and reused. This is tricky.</p>
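<p>For comparison, the closest thing available in Rust today is to group the common fields into their own struct and give that struct a helper function. This is only a sketch of that workaround, not the design I have in mind; <code>Common</code> and its <code>new</code> function are names invented for the example.</p>

```rust
#[derive(Debug, PartialEq)]
struct Common {
    id: u32,
    flags: u32,
    counter: u32, // always starts at 0; only `Common::new` knows that
}

impl Common {
    // All initialization logic for the shared fields lives here.
    fn new(id: u32, flags: u32) -> Common {
        Common { id, flags, counter: 0 }
    }
}

#[derive(Debug, PartialEq)]
struct Bool {
    common: Common,
}

impl Bool {
    fn new(id: u32) -> Bool {
        Bool { common: Common::new(id, 0) } // bools have no flags
    }
}

fn main() {
    let b = Bool::new(7);
    assert_eq!(b.common, Common { id: 7, flags: 0, counter: 0 });
}
```

<p>Adding the <code>counter</code> field touches only <code>Common</code>, much like the C++ version, but every access now goes through an extra <code>.common</code> hop, which is the same indirection the wrapper struct introduced earlier.</p>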
<h3 id="problem-4-refinement-types">Problem #4: Refinement types</h3>
<p>Related to the last point, Rust currently lacks a way to &ldquo;refine&rdquo; the
type of an enum to indicate the set of variants that it might be. It
would be great to be able to say not just &ldquo;this is a <code>TypeStructure</code>&rdquo;,
but also things like &ldquo;this is a <code>TypeStructure</code> that corresponds to
some nominal type (i.e., a struct or an enum), though I don&rsquo;t know
precisely which kind&rdquo;. As you&rsquo;ve probably surmised, making each
variant its own type &ndash; as you would in the classes approach &ndash; gives
you a simple form of refinement types for free.</p>
<p>To see what I mean, consider the class hierarchy we built for <code>TypeStructure</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">typedef</span> <span class="n">TypeStructure</span> <span class="o">*</span><span class="n">Ty</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Bool</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Reference</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Nominal</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Struct</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Nominal</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Enum</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Nominal</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">BareFn</span> <span class="o">:</span> <span class="k">public</span> <span class="n">TypeStructure</span> <span class="p">{</span> <span class="p">..</span> <span class="p">};</span>
</span></span></code></pre></div><p>Now, I can pass around a <code>TypeStructure*</code> to indicate &ldquo;any sort of
type&rdquo;, or a <code>Nominal*</code> to indicate &ldquo;a struct or an enum&rdquo;, or a
<code>BareFn*</code> to mean &ldquo;a bare fn type&rdquo;, and so forth.</p>
<p>If we limit ourselves to single inheritance, that means one can
construct an arbitrary tree of refinements. Certainly one can imagine
wanting arbitrary refinements, though in my own investigations I have
always found a tree to be sufficient. In C++ and Scala, of course, one
can use multiple inheritance to create arbitrary refinements, and I
think one can imagine doing something similar in Rust with traits.</p>
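<p>For comparison, here is a minimal sketch of how a tree of refinements can be approximated in today's Rust by nesting one enum inside another. This is only an illustration under my own assumptions: the payload fields and the helpers <code>nominal_name</code> and <code>as_nominal</code> are hypothetical, not part of the compiler.</p>

```rust
// Hypothetical refinement: "a struct or an enum", expressed as its own type.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Nominal {
    Struct { name: String },
    Enum { name: String },
}

// "Any sort of type". The `Nominal` variant wraps the narrower enum.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum TypeStructure {
    Bool,
    Reference(Box<TypeStructure>),
    Nominal(Nominal),
    BareFn,
}

// Accepts only the refined type, so callers cannot pass a `Bool` or `BareFn`.
fn nominal_name(ty: &Nominal) -> &str {
    match ty {
        Nominal::Struct { name } | Nominal::Enum { name } => name.as_str(),
    }
}

// Narrowing from the general type to the refinement is an explicit match.
fn as_nominal(ty: &TypeStructure) -> Option<&Nominal> {
    match ty {
        TypeStructure::Nominal(n) => Some(n),
        _ => None,
    }
}

fn main() {
    let point = TypeStructure::Nominal(Nominal::Struct { name: String::from("Point") });
    if let Some(n) = as_nominal(&point) {
        println!("nominal type named {}", nominal_name(n));
    }
}
```

<p>The refinement works, but note the cost: every level of the tree needs its own wrapper variant, which is part of why the subclass formulation feels lighter.</p>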
<p>As an aside, the right way to handle &lsquo;datasort refinements&rsquo; has been a
topic of discussion in Rust for some time; I&rsquo;ve posted a
<a href="http://smallcultfollowing.com/babysteps/blog/2012/08/24/datasort-refinements/">different proposal</a> in the past, and, somewhat amusingly, my
<a href="http://smallcultfollowing.com/babysteps/blog/2011/12/02/why-case-classes-are-better-than-variant-types/">very first post</a> on this blog was on this topic as well. I
personally find that building on a variant hierarchy, as above, is a
very appealing solution to this problem, because it avoids introducing
a &ldquo;new concept&rdquo; for refinements: it just leverages the same structure
that is giving you common fields and letting you control layout.</p>
<h3 id="conclusion">Conclusion</h3>
<p>So we&rsquo;ve seen that there are also advantages to the approach of using
subclasses to model variants. I showed this using the <code>TypeStructure</code>
example, but there are lots of cases where this arises. In the
compiler alone, I would say that the abstract syntax tree, the borrow
checker&rsquo;s <code>LoanPath</code>, the memory categorization <code>cmt</code> types, and
probably a bunch of other cases would benefit from a more class-like
approach. Servo developers have long been requesting something more
class-like for use in the DOM. I feel quite confident that there are
many other crates at large that could similarly benefit.</p>
<p>Interestingly, Rust can gain a lot of the benefits of the subclass
approach&mdash;namely, common fields and refinement types&mdash;just by making
enum variants into types. There have <a href="https://github.com/rust-lang/rfcs/issues/349">been proposals</a> along these
lines before, and I think that&rsquo;s an important ingredient for the final
plan.</p>
<p>Perhaps the biggest difference between the two approaches is the size
of the &ldquo;base type&rdquo;. That is, in Rust&rsquo;s current enum model, the base
type (<code>TypeStructure</code>) is the size of the maximal variant. In the
subclass model, the base class has an indeterminate size, and so must
be referenced by pointer. Neither of these is an &ldquo;expressiveness&rdquo;
distinction&mdash;we&rsquo;ve seen that you can model anything in either
approach. But it has a big effect on how easy it is to write code.</p>
<p>One interesting question is whether we can concisely state conditions
in which one would prefer to have &ldquo;precise variant sizes&rdquo; (class-like)
vs &ldquo;largest variant&rdquo; (enum). I think the &ldquo;precise sizes&rdquo; approach is
better when the following apply:</p>
<ol>
<li>A recursive type (like a tree), which tends to force boxing anyhow.
Examples: the AST or types in the compiler, the DOM in Servo, a GUI.</li>
<li>Instances never change what variant they are.</li>
<li>Potentially wide variance in the sizes of the variants.</li>
</ol>
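<p>To make the size tradeoff concrete, here is a small sketch that measures both layouts with <code>std::mem::size_of</code>. The types <code>Flat</code>, <code>Boxed</code>, <code>SmallNode</code>, and <code>LargeNode</code> are hypothetical stand-ins I made up for a type whose variants differ widely in size; boxing each payload is just one way to approximate the &ldquo;precise sizes&rdquo; model in current Rust.</p>

```rust
use std::mem::size_of;

// "Largest variant" model: every value reserves room for the biggest payload.
#[allow(dead_code)]
enum Flat {
    Small(u8),
    Large([u64; 16]),
}

// "Precise sizes" model, approximated by boxing: each payload is allocated
// separately with exactly the space it needs.
#[allow(dead_code)]
struct SmallNode(u8);
#[allow(dead_code)]
struct LargeNode([u64; 16]);

#[allow(dead_code)]
enum Boxed {
    Small(Box<SmallNode>),
    Large(Box<LargeNode>),
}

fn flat_size() -> usize {
    size_of::<Flat>()
}

fn boxed_size() -> usize {
    size_of::<Boxed>()
}

fn main() {
    // Exact numbers depend on the target, so they are printed rather than assumed.
    println!("Flat:      {} bytes per value", flat_size());
    println!("Boxed:     {} bytes per value (plus a heap allocation)", boxed_size());
    println!("SmallNode: {} bytes on the heap", size_of::<SmallNode>());
    println!("LargeNode: {} bytes on the heap", size_of::<LargeNode>());
}
```

<p>The flat enum pays for the large payload in every value, while the boxed form keeps values small at the price of an allocation and a pointer indirection.</p>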
<p>The fact that this is really a kind of efficiency tuning is an
important insight. Hopefully our final design can make it relatively
easy to change between the &lsquo;maximal size&rsquo; and the &lsquo;unknown size&rsquo;
variants, since it may not be obvious from the get go which is better.</p>
<h3 id="preview-of-the-next-post">Preview of the next post</h3>
<p>The next post will describe a scheme in which we could wed together
enums and structs, gaining the advantages of both. I don&rsquo;t plan to
touch virtual dispatch yet, but instead just keep focusing on concrete
types.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Virtual Structs Part 1: Where Rust's enum shines</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/05/05/where-rusts-enum-shines/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/05/05/where-rusts-enum-shines/</id><published>2015-05-05T00:00:00+00:00</published><updated>2015-05-05T06:15:26-04:00</updated><content type="html"><![CDATA[<p>One priority for Rust after 1.0 is going to be incorporating some
kind of support for
<a href="https://github.com/rust-lang/rfcs/issues/349">&ldquo;efficient inheritance&rdquo; or &ldquo;virtual structs&rdquo;</a>. In order to
motivate and explain this design, I am writing a series of blog posts
examining how Rust&rsquo;s current abstractions compare with those found in
other languages.</p>
<p>The way I see it, the topic of &ldquo;virtual structs&rdquo; has always had two
somewhat orthogonal components to it. The first component is a
question of how we can generalize and extend Rust enums to cover more
scenarios. The second component is integrating virtual dispatch into
this picture.</p>
<p>I am going to start the series by focusing on the question of
extending enums. This first post will cover some of the strengths of
the current Rust <code>enum</code> design; the next post, which I&rsquo;ll publish
later this week, will describe some of the advantages of a more
&ldquo;class-based&rdquo; approach. Then I&rsquo;ll discuss how we can bring those two
worlds together. After that, I will turn to virtual dispatch, impls,
and matching, and show how they interact.</p>
<!-- more -->
<h3 id="the-rust-enum">The Rust enum</h3>
<p>I don&rsquo;t know about you, but when I work with C++, I find that the
first thing that I miss is the Rust <code>enum</code>. Usually what happens is
that I start out with some innocent-looking C++ enum, like
<code>ErrorCode</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">ErrorCode</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">FileNotFound</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">UnexpectedChar</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">ErrorCode</span> <span class="nf">parse_file</span><span class="p">(</span><span class="n">String</span> <span class="n">file_name</span><span class="p">);</span>
</span></span></code></pre></div><p>As I evolve the code, I find that, in some error cases, I want to
return some additional information. For example, when I return
<code>UnexpectedChar</code>, maybe I want to indicate what character I saw, and
what characters I expected. Because this data isn&rsquo;t the same for all
errors, now I&rsquo;m kind of stuck. I can make a struct, but it has these
extra fields that are only sometimes relevant, which is awkward:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Error</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">ErrorCode</span> <span class="n">code</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="c1">// only relevant if UnexpectedChar:
</span></span></span><span class="line"><span class="cl">    <span class="n">Vector</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span> <span class="n">expected</span><span class="p">;</span> <span class="c1">// possible expected characters
</span></span></span><span class="line"><span class="cl">    <span class="kt">char</span> <span class="n">found</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></div><p>This solution is annoying since I have to come up with values for all
these fields, even when they&rsquo;re not relevant. In this case, for
example, I have to create an empty vector and so forth.  And of course
I have to make sure not to read those fields without checking what
kind of error I have first. And it&rsquo;s wasteful of memory to boot. (I
could use a <code>union</code>, but that is kind of a mess of its own.) All in
all, not very good.</p>
<p>One more structured solution is to go to a full-blown class hierarchy:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">ErrorCode</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="n">FileNotFound</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">UnexpectedChar</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">Error</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">Error</span><span class="p">(</span><span class="n">ErrorCode</span> <span class="n">ec</span><span class="p">)</span> <span class="o">:</span> <span class="n">errorCode</span><span class="p">(</span><span class="n">ec</span><span class="p">)</span> <span class="p">{</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">const</span> <span class="n">ErrorCode</span> <span class="n">errorCode</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">FileNotFoundError</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Error</span> <span class="p">{</span>    
</span></span><span class="line"><span class="cl">  <span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">FileNotFoundError</span><span class="p">()</span> <span class="o">:</span> <span class="n">Error</span><span class="p">(</span><span class="n">FileNotFound</span><span class="p">)</span> <span class="p">{</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">class</span> <span class="nc">UnexpectedCharError</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Error</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="k">public</span><span class="o">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">UnexpectedCharError</span><span class="p">(</span><span class="kt">char</span> <span class="n">expected</span><span class="p">,</span> <span class="kt">char</span> <span class="n">found</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="o">:</span> <span class="n">Error</span><span class="p">(</span><span class="n">UnexpectedChar</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">expected</span><span class="p">(</span><span class="n">expected</span><span class="p">),</span>
</span></span><span class="line"><span class="cl">        <span class="n">found</span><span class="p">(</span><span class="n">found</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">{</span> <span class="p">}</span>
</span></span><span class="line"><span class="cl">    
</span></span><span class="line"><span class="cl">    <span class="k">const</span> <span class="kt">char</span> <span class="n">expected</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="k">const</span> <span class="kt">char</span> <span class="n">found</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">};</span>
</span></span></code></pre></div><p>In many ways, this is pretty nice, but there is a problem (besides the
verbosity, I mean). I can&rsquo;t just pass around <code>Error</code> instances by
value, because the size of the <code>Error</code> will vary depending on what
kind of error it is. So I need dynamic allocation. So I can change my
<code>parse_file</code> routine to something like:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-cpp" data-lang="cpp"><span class="line"><span class="cl"><span class="n">unique_ptr</span><span class="o">&lt;</span><span class="n">Error</span><span class="o">&gt;</span> <span class="n">parse_file</span><span class="p">(...);</span>
</span></span></code></pre></div><p>Of course, now I&rsquo;ve wound up with a lot more code, and mandatory
memory allocation, for something that doesn&rsquo;t really seem all that
complicated.</p>
<h3 id="rust-to-the-rescue">Rust to the rescue</h3>
<p>Of course, Rust enums make this sort of thing easy. I can start out
with a simple enum as before:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">ErrorCode</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">FileNotFound</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">UnexpectedChar</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">parse_file</span><span class="p">(</span><span class="n">file_name</span>: <span class="nb">String</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">ErrorCode</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>Then I can simply modify it so that the variants carry data:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">enum</span> <span class="nc">ErrorCode</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">FileNotFound</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">UnexpectedChar</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">expected</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="n">found</span>: <span class="kt">char</span> <span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">parse_file</span><span class="p">(</span><span class="n">file_name</span>: <span class="nb">String</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">ErrorCode</span><span class="p">;</span><span class="w">
</span></span></span></code></pre></div><p>And nothing really has to change. I only have to supply values for
those fields when I construct an instance of <code>UnexpectedChar</code>, and I
only read the values when I match a given error. But most importantly,
I don&rsquo;t have to do dummy allocations: the size of <code>ErrorCode</code> is
automatically the size of the largest variant, so I get the benefits
of a <code>union</code> in C but without the mess and risk.</p>
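<p>As a sketch of what that looks like in use, the following constructs both variants and reads the extra fields only inside the matching arm. The <code>describe</code> helper and the sample values are hypothetical; the enum itself is the one defined above.</p>

```rust
#[derive(Debug)]
enum ErrorCode {
    FileNotFound,
    UnexpectedChar { expected: Vec<String>, found: char },
}

// Hypothetical helper: the fields of `UnexpectedChar` are only reachable
// inside the arm that has already checked which variant we have.
fn describe(err: &ErrorCode) -> String {
    match err {
        ErrorCode::FileNotFound => String::from("file not found"),
        ErrorCode::UnexpectedChar { expected, found } => {
            format!("found {:?}, expected one of {:?}", found, expected)
        }
    }
}

fn main() {
    let errors = vec![
        // No dummy values needed for the variant without data.
        ErrorCode::FileNotFound,
        ErrorCode::UnexpectedChar {
            expected: vec![String::from("a"), String::from("b")],
            found: 'x',
        },
    ];
    // Both variants are stored by value in the same vector.
    for err in &errors {
        println!("{}", describe(err));
    }
}
```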
<h3 id="what-makes-rust-and-c-behave-differently">What makes Rust and C++ behave differently?</h3>
<p>So why does this example work so much more smoothly with a Rust enum
than a C++ class hierarchy? The most obvious difference is that Rust&rsquo;s
enum syntax allows us to compactly declare all the variants in one
place, and of course we enjoy the benefits of match syntax. Such
&ldquo;creature comforts&rdquo; are very nice, but that is not what I&rsquo;m really
talking about in this post.  (For example, Scala is an example of a
language that offers <a href="http://docs.scala-lang.org/tutorials/tour/case-classes.html">great syntactic support</a> for using
&ldquo;classes as variants&rdquo;; but that doesn&rsquo;t change the fundamental
tradeoffs involved.)</p>
<p>To me, the key difference between Rust and C++ is the size of the
<code>ErrorCode</code> types. In Rust, the size of an <code>ErrorCode</code> instance is
equal to <strong>the maximum of its variants</strong>, which means that we can pass
errors around by value and know that we have enough space to store any
kind of error. In contrast, when using classes in C++, the size of an
<code>Error</code> instance will vary, <strong>depending on what specific variant
it is</strong>. This is why I must pass around errors using a pointer, since
I don&rsquo;t know how much space I need up front. (Well, actually, C++
doesn&rsquo;t <em>require</em> you to pass around values by pointer: but if you
don&rsquo;t, you wind up with <a href="http://stackoverflow.com/questions/274626/what-is-object-slicing">object slicing</a>, which can be a particularly
surprising sort of error. In Rust, we have the notion of <a href="http://smallcultfollowing.com/babysteps/blog/2014/01/05/dst-take-5/">DST</a> to
address this problem.)</p>
<p><strong>Rust really relies deeply on the flat, uniform layout for
enums</strong>. For example, every time you make a nullable pointer like
<code>Option&lt;&amp;T&gt;</code>, you are taking advantage of the fact that options are
laid out flat in memory, whether they are <code>None</code> or <code>Some</code>. (In Scala,
for example, creating a <code>Some</code> variant requires allocating an object.)</p>
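<p>A quick way to see the flat layout at work is to compare sizes. The sketch below relies on the documented guarantee that <code>Option&lt;&amp;T&gt;</code> and <code>Option&lt;Box&lt;T&gt;&gt;</code> have the same size as the pointer they wrap; the function names and the <code>lookup</code> helper are my own, purely for illustration.</p>

```rust
use std::mem::size_of;

// `None` is represented by the null pointer, so no extra tag is stored.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u32>>() == size_of::<&u32>()
}

fn option_box_is_pointer_sized() -> bool {
    size_of::<Option<Box<u32>>>() == size_of::<Box<u32>>()
}

// Hypothetical helper returning a "nullable pointer" without any allocation.
fn lookup(items: &[u32], wanted: u32) -> Option<&u32> {
    items.iter().find(|&&x| x == wanted)
}

fn main() {
    println!("Option<&u32> is pointer-sized: {}", option_ref_is_pointer_sized());
    println!("Option<Box<u32>> is pointer-sized: {}", option_box_is_pointer_sized());

    let items = [1, 2, 3];
    match lookup(&items, 2) {
        Some(found) => println!("found {}", found),
        None => println!("not found"),
    }
}
```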
<h3 id="preview-of-the-next-few-posts">Preview of the next few posts</h3>
<p>OK, now that I spent a lot of time telling you why enums are great and
subclassing is terrible, my next post is going to tell you why I think
subclassing is sometimes fantastic and enums kind of annoying.</p>
<h3 id="caveat">Caveat</h3>
<p>I&rsquo;m well aware I&rsquo;m picking on C++ a bit unfairly. For example, perhaps
instead of writing up my own little class hierarchy, I should be using
<code>boost::any</code> or something like that. Because C++ is such an extensible
language, you can definitely construct a class hierarchy that gives
you similar advantages to what Rust enums offer. Heck, you could just
write a carefully constructed wrapper around a C <code>union</code> to get what
you want. But I&rsquo;m really focused here on contrasting the kind of &ldquo;core
abstractions&rdquo; that the language offers for handling variants with
data, which in Rust&rsquo;s case is (currently) enums, and in C++&rsquo;s case is
subtyping and classes.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">A few more remarks on reference-counting and leaks</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/04/30/a-few-more-remarks-on-reference-counting-and-leaks/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/04/30/a-few-more-remarks-on-reference-counting-and-leaks/</id><published>2015-04-30T00:00:00+00:00</published><updated>2015-04-30T18:00:05-07:00</updated><content type="html"><![CDATA[<p>So there has been a lot of really interesting discussion in response
to my blog post. I wanted to highlight some of the comments I&rsquo;ve seen,
because I think they raise good points that I failed to address in the
blog post itself. My comments here are lightly edited versions of what
I wrote elsewhere.</p>
<h3 id="isnt-the-problem-with-objects-and-leak-safe-types-more-general">Isn&rsquo;t the problem with objects and leak-safe types more general?</h3>
<p><a href="http://www.reddit.com/r/rust/comments/34bj7z/on_referencecounting_and_leaks_from_nmatsakiss/cqtksn3">Reem writes</a>:</p>
<blockquote>
<p>I posit that this is in fact a problem with trait objects, not a
problem with Leak; the exact same flaw pointed about in the blog
post already applies to the existing OIBITs, Send, Sync, and
Reflect. The decision of which OIBITs to include on any trait object
is already a difficult one, and is a large reason why std strives to
avoid trait objects as part of public types.</p>
</blockquote>
<p>I agree with him that the problems I described around <code>Leak</code> and
objects apply equally to <code>Send</code> (and, in fact, I said so in my post),
but I don&rsquo;t think this is something we&rsquo;ll be able to solve later on,
as he suggests. I think we are working with something of a fundamental
tension. Specifically, objects are all about encapsulation. That is,
<strong>they completely hide the type you are working with</strong>, even from the
compiler. This is what makes them useful: without them, Rust just
plain wouldn&rsquo;t work, since you couldn&rsquo;t (e.g.) have a vector of
closures. <strong>But, in order to gain that flexibility, you have to state
your requirements up front</strong>. The compiler can&rsquo;t figure them out
automatically, because it doesn&rsquo;t (and shouldn&rsquo;t) know the types
involved.</p>
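<p>Here is a minimal sketch of what &ldquo;stating your requirements up front&rdquo; looks like, using current Rust syntax (<code>dyn</code>) rather than the syntax of the time. The function <code>run_all_on_thread</code> is hypothetical. The vector hides each closure's concrete type, so the <code>Send</code> bound has to be written into the object type itself.</p>

```rust
use std::thread;

// The element type names only the call signature and the `Send` bound.
// The compiler cannot inspect the hidden closure types to work out
// thread safety, so the bound must be declared here.
fn run_all_on_thread(tasks: Vec<Box<dyn FnOnce() -> i32 + Send>>) -> Vec<i32> {
    thread::spawn(move || tasks.into_iter().map(|t| t()).collect::<Vec<i32>>())
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let offset = 10;
    // Two different closure types stored behind the same object type.
    let tasks: Vec<Box<dyn FnOnce() -> i32 + Send>> = vec![
        Box::new(|| 1),
        Box::new(move || offset + 1),
    ];
    println!("{:?}", run_all_on_thread(tasks));
}
```

<p>Dropping <code>+ Send</code> from the element type would make the call to <code>thread::spawn</code> fail to compile, even if every closure in the vector happened to be thread-safe. Each additional marker trait multiplies the object types people have to choose between in this way.</p>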
<p>So, given that objects are here to stay, the question is whether
adding a marker trait like <code>Leak</code> is a problem, given that we already
have <code>Send</code>. I think the answer is yes; basically, because we can&rsquo;t
expect object types to be analyzed statically, we should do our best
to minimize the number of fundamental splits people have to work
with. <strong>Thread safety is pretty fundamental. I don&rsquo;t think <code>Leak</code>
makes the cut.</strong> (I gave some of the reasons in the conclusion of my
previous blog post, but I have a few more in the questions below.)</p>
<h3 id="could-we-just-remove-rc-and-only-have-rcscoped-would-that-solve-the-problem">Could we just remove <code>Rc</code> and only have <code>RcScoped</code>? Would that solve the problem?</h3>
<p><a href="http://smallcultfollowing.com/babysteps/blog/2015/04/29/on-reference-counting-and-leaks/#comment-1994859272">Original question.</a></p>
<p>Certainly you could remove <code>Rc</code> in favor of <code>RcScoped</code>. Similarly, you
could have only <code>Arc</code> and not <code>Rc</code>. But you don&rsquo;t want to because you
are basically failing to take advantage of extra constraints. If we
only had <code>RcScoped</code>, for example, then creating an <code>Rc</code> always
requires taking some scope as an argument &ndash; you can have a global
constant for <code>'static</code> data, but it&rsquo;s still the case that generic
abstractions have to take in this scope as argument. Moreover, there
is a runtime cost to maintaining the extra linked list that will
thread through all <code>Rc</code> abstractions (and the <code>Rc</code> structs get bigger,
as well). So, <strong>yes, this avoids the &ldquo;split&rdquo; I talked about, but it
does it by pushing the worst case on all users.</strong></p>
<p>Still, I admit to feeling torn on this point. <strong>What pushes me over
the edge, I think, is that simple reference counting of the kind we
are doing now is a pretty fundamental thing.</strong> You find it in all
kinds of systems
(<a href="http://clang.llvm.org/docs/AutomaticReferenceCounting.html">Objective C</a>,
<a href="https://msdn.microsoft.com/en-us/library/windows/desktop/ms687260%28v=vs.85%29.aspx">COM</a>,
etc). This means that if we require that safe Rust cannot leak, then
you cannot safely integrate borrowed data with those systems. I think
it&rsquo;s better to just use closures in Rust code &ndash; particularly since,
as annodomini points out on Reddit,
<a href="http://www.reddit.com/r/rust/comments/34bj7z/on_referencecounting_and_leaks_from_nmatsakiss/cqt983d">there are other kinds of cases where RAII is a poor fit for cleanup</a>.</p>
<h3 id="could-a-proper-gc-solve-this-is-reference-counting-really-worth-it">Could a proper GC solve this? Is reference counting really worth it?</h3>
<p><a href="http://www.reddit.com/r/rust/comments/34bj7z/on_referencecounting_and_leaks_from_nmatsakiss/cqtpxga">Original question.</a></p>
<p>It&rsquo;ll depend on the precise design, but <strong>tracing GC most definitely
is not a magic bullet</strong>. If anything, the problem around leaks is
somewhat worse: GC&rsquo;s don&rsquo;t give any kind of guarantee about when the
destructor runs. So we either have to ban GC&rsquo;d data from having
destructors or ban it from having borrowed pointers; either of those
implies a bound very similar to <code>Leak</code> or <code>'static</code>. Hence I think
that <strong>GC will never be a &ldquo;fundamental building block&rdquo; for
abstractions in the way that <code>Rc</code>/<code>Arc</code> can be</strong>. This is sad, but
perhaps inevitable: GC inherently requires a runtime as well, which
already limits its reusability.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">On reference-counting and leaks</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/04/29/on-reference-counting-and-leaks/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/04/29/on-reference-counting-and-leaks/</id><published>2015-04-29T00:00:00+00:00</published><updated>2015-04-29T12:39:10-07:00</updated><content type="html"><![CDATA[<p>What&rsquo;s a 1.0 release without a little drama? Recently, we discovered
that there was an oversight in one of the standard library APIs that we
had intended to stabilize. In particular, we recently added an API for
scoped threads &ndash; that is, child threads which have access to the
stack frame of their parent thread.</p>
<p>The flaw came about because, when designing the scoped threads API, we
failed to consider the impact of resource leaks. Rust&rsquo;s ownership
model makes it somewhat hard to leak data, but not impossible. In
particular, using reference-counted data, you can construct a cycle in
the heap, in which case the components of that cycle may never be
freed.</p>
<p>Some commenters online have taken this problem with the scoped threads
API to mean that Rust&rsquo;s type system was fundamentally flawed. This is
not the case: Rust&rsquo;s guarantee that safe code is memory safe is as
true as it ever was. The problem was really specific to the scoped
threads API, which was making flawed assumptions; this API has been
marked unstable, and there is an <a href="https://github.com/rust-lang/rfcs/pull/1084">RFC</a> proposing a safe
alternative.</p>
<p>That said, there is an interesting, more fundamental question at play
here. We long ago decided that, to make reference-counting practical,
we had to accept resource leaks as a possibility. But some
<a href="https://github.com/rust-lang/rfcs/pull/1085">recent</a> <a href="https://github.com/rust-lang/rfcs/pull/1094">proposals</a> have suggested that we should place
limits on the <code>Rc</code> type to avoid some kinds of reference leaks. These
limits would make the original scoped threads API safe. <strong>However,
these changes come at a pretty steep price in composability: they
effectively force a deep distinction between &ldquo;leakable&rdquo; and
&ldquo;non-leakable&rdquo; data,</strong> which winds up affecting all levels of the
system.</p>
<p>This post is my attempt to digest the situation and lay out my current
thinking. For those of you who don&rsquo;t want to read this entire post (and I
can&rsquo;t blame you, it&rsquo;s long), let me just copy the most salient
paragraph from my conclusion:</p>
<blockquote>
<p>This is certainly a subtle issue, and one where reasonable folk can
disagree. In the process of drafting (and redrafting&hellip;) this post,
my own opinion has shifted back and forth as well. But ultimately I
have landed where I started: <strong>the danger and pain of bifurcating
the space of types far outweighs the loss of this particular RAII
idiom.</strong></p>
</blockquote>
<p>All right, for those of you who want to continue, this post is divided
into three sections:</p>
<ol>
<li>Section 1 explains the problem and gives some historical background.</li>
<li>Section 2 explains the &ldquo;status quo&rdquo;.</li>
<li>Section 3 covers the proposed changes to the reference-counted type
and discusses the tradeoffs involved there.</li>
</ol>
<!-- more -->
<h3 id="section-1-the-problem-in-a-nutshell">Section 1. The problem in a nutshell</h3>
<p>Let me start by summarizing the problem that was uncovered in more
detail. The root of the problem is an interaction between the
reference-counting and threading APIs in the standard library. So
let&rsquo;s look at each in turn. If you&rsquo;re familiar with the problem, you
can skip ahead to section 2.</p>
<h4 id="reference-counting-as-the-poor-mans-gc">Reference-counting as the poor man&rsquo;s GC</h4>
<p>Rust&rsquo;s standard library includes the <code>Rc</code> and <code>Arc</code> types which are
used for reference-counted data. These are widely used, because they
are the most convenient way to create data whose ownership is shared
amongst many references rather than being tied to a particular stack
frame.</p>
<p>Like all reference-counting systems, <code>Rc</code> and <code>Arc</code> are vulnerable to
reference-count cycles. That is, if you create a reference-counted box
that contains a reference to itself, then it will never be
collected. <strong>To put it another way, Rust gives you a lot of safety
guarantees, but it doesn&rsquo;t protect you from memory leaks (or
deadlocks, which turns out to be a very similar problem).</strong></p>
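<p>To make the cycle concrete, here is a small sketch in entirely safe code. The <code>Node</code> type and the <code>destructor_ran</code> helper are hypothetical; the flag simply records whether <code>drop</code> was ever called.</p>

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

struct Node {
    // Interior mutability lets us point the node at itself after creation.
    next: RefCell<Option<Rc<Node>>>,
    // Shared flag recording whether the destructor ran.
    dropped: Rc<Cell<bool>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Builds one node, optionally makes it refer to itself, lets the local
// handle go out of scope, and reports whether the destructor ran.
fn destructor_ran(make_cycle: bool) -> bool {
    let dropped = Rc::new(Cell::new(false));
    {
        let node = Rc::new(Node {
            next: RefCell::new(None),
            dropped: Rc::clone(&dropped),
        });
        if make_cycle {
            // The node now holds a strong reference to itself.
            *node.next.borrow_mut() = Some(Rc::clone(&node));
        }
    } // `node` goes out of scope here.
    dropped.get()
}

fn main() {
    println!("without cycle, destructor ran: {}", destructor_ran(false));
    println!("with cycle, destructor ran:    {}", destructor_ran(true));
}
```

<p>With the cycle in place the strong count never reaches zero, so the destructor is skipped and the memory is never reclaimed, all without a single line of <code>unsafe</code>.</p>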
<p>The fact that we don&rsquo;t protect against leaks is not an accident. This
was a deliberate design decision that we made while transitioning from
garbage-collected types (<code>@T</code> and <code>@mut T</code>) to user-defined reference
counting. The reason is that preventing leaks requires either a
runtime with a cycle collector or complex type-system tricks. The
option of a mandatory runtime was out, and the type-system tricks we
explored were either too restrictive or too complex. So we decided to
make a pragmatic compromise: to document the possibility of leaks
(see, for example, <a href="http://doc.rust-lang.org/reference.html#behavior-not-considered-unsafe">this section of the Rust reference manual</a>)
and move on.</p>
<p>The possibility of leaks is mostly an interesting technical caveat:
I&rsquo;ve not found it to be a big issue in practice.
Perhaps because problems arose so rarely, some
things&mdash;like leaks&mdash;that should not have been forgotten
were&hellip; partially forgotten. History became legend. Legend became
myth. And for a few years, the question of leaks seemed to be a
distant, settled issue, without much relevance to daily life.</p>
<h4 id="thread-and-shared-scopes">Thread and shared scopes</h4>
<p>With that background on <code>Rc</code> in place, let&rsquo;s turn to threads.
Traditionally, Rust threads were founded on a &ldquo;zero-sharing&rdquo;
principle, much like Erlang. However, as Rust&rsquo;s type system evolved,
we realized we could <a href="http://smallcultfollowing.com/babysteps/blog/2013/06/11/data-parallelism-in-rust/">do much better</a> &ndash; <strong>the
<a href="http://smallcultfollowing.com/babysteps/blog/2013/06/11/on-the-connection-between-memory-management-and-data-race-freedom/">same type system rules</a> that ensured memory safety in sequential
code could be used to permit sharing in parallel code as well</strong>,
particularly once we adopted <a href="https://github.com/rust-lang/rfcs/blob/master/text/0458-send-improvements.md">RFC 458</a> (a brilliant insight by
<a href="https://github.com/pythonesque">pythonesque</a>).</p>
<p>The basic idea is to start a child thread that is tied to a particular
scope in the code. We want to guarantee that before we exit that
scope, the thread will be joined. If we can do this, then we can
safely permit that child thread access to stack-allocated data, so
long as that data outlives the scope; this is safe because Rust&rsquo;s
type-system rules already ensure that any data shared between multiple
threads must be immutable (more or less, anyway).</p>
<p>So the question then is how can we designate the scope of the child
threads, and how can we ensure that the children will be joined when
that scope exits. The original proposal was based on closures, but in
the time since it was written, the language has shifted to using more
RAII, and hence the <code>scoped</code> API is based on RAII. The idea is pretty
simple.  You write a call like the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">data</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="kt">i32</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">guard</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">thread</span>::<span class="n">scoped</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="cm">/* body of the child thread */</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>The <code>scoped</code> function takes a closure which will be the body of the
child thread. It returns to you a guard value: running the destructor
of this guard will cause the thread to be joined. This guard is always
tied to a particular scope in the code. Let&rsquo;s call the scope <code>'a</code>. The
closure is then permitted access to all data that outlives <code>'a</code>.  For
example, in the code snippet above, <code>'a</code> might be the body of the
function <code>foo</code>. This means that the closure could safely access the
input <code>data</code>, because that must outlive the fn body. The type system
ensures that no reference to the guard exists outside of <code>'a</code>, and
hence we can be sure that guard will go out of scope sometime before
the end of <code>'a</code> and thus trigger the thread to be joined. At least
that was the idea.</p>
<h4 id="the-conflict">The conflict</h4>
<p>By now perhaps you have seen the problem. The scoped API is only
safe if we can guarantee that the guard&rsquo;s destructor runs, so that the
thread will be joined; but, using <code>Rc</code>, we can leak values, which
means that their destructors never run. So, by combining <code>Rc</code> and
<code>scoped</code>, we can cause a thread to be launched that will never be
joined.  This means that this thread could run at any time and try to
access data from its parent&rsquo;s stack frame &ndash; even if that parent has
already completed, and thus the stack frame is garbage. Not good!</p>
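<p>The leak itself takes only safe code. The sketch below is a simplified illustration under some assumptions: <code>Guard</code> is a hypothetical stand-in for the join guard (its destructor merely sets a flag where the real one would join the thread), and <code>leak_in_cycle</code> is a made-up helper that stores any value in an <code>Rc</code> cycle:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Hypothetical stand-in for the guard returned by `thread::scoped`.
// The real destructor would join the child thread; this one sets a flag.
struct Guard&lt;'a&gt; {
    joined: &amp;'a Cell&lt;bool&gt;,
}

impl&lt;'a&gt; Drop for Guard&lt;'a&gt; {
    fn drop(&amp;mut self) {
        self.joined.set(true);
    }
}

// Made-up helper: leaks any value by parking it in an `Rc` cycle.
// Note that nothing here requires `T` to be `'static`.
fn leak_in_cycle&lt;T&gt;(value: T) {
    struct Cycle&lt;T&gt; {
        _value: T,
        me: RefCell&lt;Option&lt;Rc&lt;Cycle&lt;T&gt;&gt;&gt;&gt;,
    }
    let cycle = Rc::new(Cycle { _value: value, me: RefCell::new(None) });
    *cycle.me.borrow_mut() = Some(cycle.clone());
}

// Reports whether the guard's destructor ran by the end of its scope.
fn guard_was_dropped(leak: bool) -&gt; bool {
    let joined = Cell::new(false);
    {
        let guard = Guard { joined: &amp;joined };
        if leak {
            leak_in_cycle(guard);
        }
    } // a guard that was not leaked is dropped here
    joined.get()
}

fn main() {
    println!("dropped normally: {}", guard_was_dropped(false));
    println!("dropped after leaking: {}", guard_was_dropped(true));
}
</code></pre></div>
<p>In the leaked case the flag is never set, even though the scope has ended. With a real join guard, that is the point at which the parent would carry on while the child thread was still running.</p>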
<p>So where does the fault lie? From the point of view of <em>history</em>, it
is pretty clear: the <code>scoped</code> API was ill designed, given that <code>Rc</code>
already existed. As I wrote, we had long ago decided that the most
practical option was to accept that leaks could occur. This implies
that if the memory safety of an API depends on a destructor running,
you can&rsquo;t relinquish ownership of the value that carries that
destructor (because the end-user might leak it).</p>
<p>It is totally possible to fix the <code>scoped</code> API, and in fact there is
already <a href="https://github.com/rust-lang/rfcs/pull/1084">an RFC showing how this can be done</a> (I&rsquo;ll summarize it
in section 2, below). However, some people feel that the decision we
made to permit leaks was the wrong one, and that we ought to have some
limits on the RC API to prevent leaks, or at least prevent <em>some</em>
leaks. I&rsquo;ll dig into those proposals in section 3.</p>
<h3 id="section-2-what-is-the-impact-of-leaks-on-the-status-quo">Section 2. What is the impact of leaks on the status quo?</h3>
<p>So, if we continue with the status quo, and accept that resource leaks
can occur with <code>Rc</code> and <code>Arc</code>, what is the impact of that?  At first
glance, it might seem that the possibility of resource leaks is a huge
blow to RAII. After all, if you can&rsquo;t be sure that the destructor will
run, how can you rely on the destructor to do cleanup?  But when you
look closer, it turns out that the problem is a lot more narrow.</p>
<h4 id="average-rust-user">&ldquo;Average Rust User&rdquo;</h4>
<p>I think it&rsquo;s helpful to come at this problem from two different
perspectives. The first is: what do resource leaks mean for the
average Rust user? I think the right way to look at this is that the
user of the <code>Rc</code> API has an obligation to avoid cycle leaks or break
cycles. Failing to do so will lead to bugs &ndash; these could be resource
leaks, deadlocks, or other things. <strong>But leaks cannot lead to memory
unsafety.</strong> (Barring invalid unsafe code, of course.)</p>
<p>It&rsquo;s worth pointing out that even if you are using <code>Rc</code>, you don&rsquo;t
have to worry about memory leaks due to forgetting to decrement a
reference or anything like that. The problem really boils down to
ensuring that you have a clear strategy for avoiding cycles, which
usually boils down to an &ldquo;ownership DAG&rdquo; of strong references (though in
some cases, breaking cycles explicitly may also be an option).</p>
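<p>As a rough sketch of what such an ownership DAG can look like, the example below gives a parent strong references to its children, while each child only holds a <code>Weak</code> back-pointer. The <code>Parent</code> and <code>Child</code> types and the drop counter are invented for illustration:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::cell::{Cell, RefCell};
use std::rc::{Rc, Weak};

// Hypothetical tree: strong references point down, weak references point up.
struct Parent {
    children: RefCell&lt;Vec&lt;Rc&lt;Child&gt;&gt;&gt;,
    drops: Rc&lt;Cell&lt;u32&gt;&gt;,
}

struct Child {
    parent: Weak&lt;Parent&gt;, // does not keep the parent alive
    drops: Rc&lt;Cell&lt;u32&gt;&gt;,
}

impl Drop for Parent {
    fn drop(&amp;mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

impl Drop for Child {
    fn drop(&amp;mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Builds a parent with two children, lets the tree go out of scope, and
// returns the number of destructors that ran.
fn build_and_drop_tree() -&gt; u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let parent = Rc::new(Parent {
            children: RefCell::new(Vec::new()),
            drops: drops.clone(),
        });
        for _ in 0..2 {
            let child = Rc::new(Child {
                parent: Rc::downgrade(&amp;parent),
                drops: drops.clone(),
            });
            parent.children.borrow_mut().push(child);
        }
        // The back-pointer is usable for as long as the parent is alive.
        assert!(parent.children.borrow()[0].parent.upgrade().is_some());
    }
    drops.get()
}

fn main() {
    println!("destructors run: {}", build_and_drop_tree());
}
</code></pre></div>
<p>Because the only strong references point from parent to child, dropping the parent frees the whole tree: all three destructors run.</p>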
<h4 id="author-of-unsafe-code">&ldquo;Author of unsafe code&rdquo;</h4>
<p>The other perspective to consider is the person who is writing unsafe
code. Unsafe code frequently relies on destructors to do cleanup.  I
think the right perspective here is to view a destructor as akin to
any other user-facing function: in particular, it is the user&rsquo;s
responsibility to call it, and they may accidentally fail to do
so. Just as you have to write your API to be defensive about users
invoking functions in the wrong order, you must be defensive about
them failing to invoke destructors due to a resource leak.</p>
<p>It turns out that the majority of RAII idioms are actually perfectly
memory safe even if the destructors don&rsquo;t run. For example, if we
examine the Rust standard library, it turns out that <em>all</em> of the
destructors therein are either optional or can be made optional:</p>
<ol>
<li>Straightforward destructors like <code>Box</code> or <code>Vec</code> leak memory if
they are not freed; clearly no worse than the original leak.</li>
<li>Leaking a <a href="http://doc.rust-lang.org/std/sync/struct.MutexGuard.html">mutex guard</a> means that the mutex will never be released.
This is likely to cause deadlock, but not memory unsafety.</li>
<li>Leaking a <a href="http://doc.rust-lang.org/std/cell/struct.Ref.html"><code>RefCell</code> guard</a> means that the <code>RefCell</code> will remain
in a borrowed state. This is likely to cause a thread panic, but not memory
unsafety.</li>
<li>Even fancy iterator APIs like <code>drain</code>, which was initially thought
to be problematic, can be implemented in
<a href="http://cglab.ca/~abeinges/blah/everyone-poops/">such a way that they cause leaks to occur if they are leaked</a>,
but not memory unsafety.</li>
</ol>
<p>In all of these cases, there is a guard value that mediates access to
some underlying value. The type system already guarantees that the
original value cannot be accessed while the guard is in scope. But how
can we ensure safety outside of that scope in the case where the guard
is leaked? If you look at the cases above, I think they can be
grouped into two patterns:</p>
<ol>
<li><em>Ownership:</em> Things like <code>Box</code> and <code>Vec</code> simply own the values they are
protecting. This means that if they are leaked, those values are also
leaked, and hence there is no way for the user to access them.</li>
<li><em>Pre-poisoning:</em> Other guards, like <code>MutexGuard</code>, put the value
they are protecting into a poisoned state that will lead to dynamic
errors (but not memory unsafety) if the value is accessed without
having run the destructor.  In the case of <code>MutexGuard</code>, the
&ldquo;poisoned&rdquo; state is that the mutex is locked, which means a later
attempt to lock it will simply deadlock unless the <code>MutexGuard</code> has
been dropped.</li>
</ol>
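<p>The pre-poisoning pattern is easy to observe directly. The sketch below relies on the fact that, in today&rsquo;s Rust, <code>std::mem::forget</code> is a safe function that skips a value&rsquo;s destructor; the two helper functions are named purely for this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::cell::RefCell;
use std::mem;
use std::sync::Mutex;

// Leaks a `MutexGuard` and reports whether a later lock attempt is refused.
// `try_lock` is used so the example reports an error instead of deadlocking.
fn leaked_mutex_guard_blocks_later_locks() -&gt; bool {
    let mutex = Mutex::new(0);
    mem::forget(mutex.lock().unwrap()); // the unlock in the destructor never runs
    let blocked = mutex.try_lock().is_err();
    blocked
}

// Leaks a `RefCell` write guard and reports whether later borrows are refused.
// A plain `borrow()` would panic here; `try_borrow` surfaces the same state.
fn leaked_refcell_guard_blocks_later_borrows() -&gt; bool {
    let cell = RefCell::new(0);
    mem::forget(cell.borrow_mut()); // the cell stays mutably borrowed
    let blocked = cell.try_borrow().is_err();
    blocked
}

fn main() {
    println!("mutex still locked: {}", leaked_mutex_guard_blocks_later_locks());
    println!("cell still borrowed: {}", leaked_refcell_guard_blocks_later_borrows());
}
</code></pre></div>
<p>In both cases the protected value is stuck in its &ldquo;in use&rdquo; state, so later accesses fail dynamically instead of touching the data unsafely.</p>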
<h4 id="what-makes-scoped-threads-different">What makes scoped threads different?</h4>
<p>So if most RAII patterns continue to work fine, what makes scoped
different? I think there is a fundamental difference between scoped
and these other APIs; this difference was <a href="https://github.com/rust-lang/rfcs/pull/1084#issuecomment-96875651">well articulated</a> by
Kevin Ballard:</p>
<blockquote>
<p><code>thread::scoped</code> is special because it&rsquo;s using the RAII guard as a
proxy to represent values on the stack, but this proxy is not
actually used to access those values.</p>
</blockquote>
<p>If you recall, I mentioned above that all the guards serve to mediate
access to some value. In the case of <code>scoped</code>, the guard is mediating
access to the result of a computation &ndash; the data that is being
protected is &ldquo;everything that the closure may touch&rdquo;. The guard, in
other words, doesn&rsquo;t really know the specific set of affected data,
and it thus cannot hope to either own or pre-poison the data.</p>
<p>In fact, I would take this a step farther, and say that I think that
in this kind of scenario, where the guard doesn&rsquo;t have a connection to
the data being protected, RAII tends to be a poor fit. This is
because, generally, the guard doesn&rsquo;t have to be used, so it&rsquo;s easy
for the user to accidentally drop the guard on the floor, causing the
side-effects of the guard (in this case, joining the thread) to occur
too early. I&rsquo;ll spell this out a bit more in the section below.</p>
<p><strong>Put more generally, accepting resource leaks does mean that there is
a Rust idiom that does not work.</strong> In particular, it is not possible
to create a borrowed reference that can be guaranteed to execute
arbitrary code just before it goes out of scope. What we&rsquo;ve seen
though is that, frequently, it is not necessary to <em>guarantee</em> that
the code will execute &ndash; but in the case of scoped, because there is
no direct connection to the data being protected, joining the thread
is the only solution.</p>
<h4 id="using-closures-to-guarantee-code-execution-when-exiting-a-scope">Using closures to guarantee code execution when exiting a scope</h4>
<p>If we can&rsquo;t use an RAII-based API to ensure that a thread is joined,
what can we do? It turns out that there is a good alternative, laid
out in <a href="https://github.com/rust-lang/rfcs/pull/1084">RFC 1084</a>. The basic idea is to restructure the API so
that you create a &ldquo;thread scope&rdquo; and spawn threads into that scope (in
fact, the RFC lays out a more general version that can be used not
only for threads but for any bit of code that needs guaranteed
execution on exit from a scope). This thread scope is delineated using
a closure. In practical terms, this means that starting a scoped thread
looks something like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">data</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="kt">i32</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">thread</span>::<span class="n">scope</span><span class="p">(</span><span class="o">|</span><span class="n">scope</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="n">future</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">scope</span><span class="p">.</span><span class="n">spawn</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="cm">/* body of the child thread */</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As you can see, whereas before calling <code>thread::scoped</code> started a new
thread immediately, calling <code>thread::scope</code> just creates a thread scope &ndash; it doesn&rsquo;t
itself start any threads. A borrowed reference to the thread scope is
passed to a closure (here it is the argument <code>scope</code>). The thread
scope offers a method <code>spawn</code> that can be used to start a new thread
tied to a specific scope. This thread will be joined when the closure
returns; as such, it has access to any data that outlives the body of
the closure. Note that the <code>spawn</code> method still returns a future to
the result of the spawned thread; this future is similar to the old
join guard, because it can be used to join the thread early. But this
future doesn&rsquo;t have a destructor. If the thread is not joined through
the future, it will still be automatically joined when the closure
returns.</p>
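<p>For a runnable point of comparison, the standard library today provides a closure-based <code>std::thread::scope</code>. It may differ in its details from the API proposed in the RFC, but it has the same overall shape. In the sketch below, the <code>count_and_sum</code> function is a made-up example: one thread is joined explicitly through its handle, and the other is left for the scope to join:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Counts the even numbers and sums the slice on two scoped threads.
// Both threads borrow `data` and `evens` from this stack frame.
fn count_and_sum(data: &amp;[i32]) -&gt; (usize, i32) {
    let evens = AtomicUsize::new(0);
    let total = thread::scope(|scope| {
        // Never joined explicitly: the scope joins it before returning.
        scope.spawn(|| {
            let n = data.iter().filter(|x| **x % 2 == 0).count();
            evens.store(n, Ordering::SeqCst);
        });
        // Joined early through the handle to retrieve its result.
        let handle = scope.spawn(|| data.iter().sum::&lt;i32&gt;());
        handle.join().unwrap()
    });
    // Safe to read: every thread spawned in the scope has finished.
    (evens.load(Ordering::SeqCst), total)
}

fn main() {
    println!("{:?}", count_and_sum(&amp;[1, 2, 3, 4]));
}
</code></pre></div>
<p>Dropping or ignoring the handle has no effect on safety here, because the join is performed by the scope rather than by a destructor.</p>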
<p>In the case of this particular API, I think closures are a better fit
than RAII. In particular, the closure serves to make the scope where
the threads are active clear and explicit; this in turn avoids certain
footguns that were possible with the older, RAII-based API. To see an
example of what I mean, consider this code that uses the old API to do
a parallel <a href="http://en.wikipedia.org/wiki/Quicksort">quicksort</a>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">fn</span> <span class="nf">quicksort</span><span class="p">(</span><span class="n">data</span>: <span class="kp">&amp;</span><span class="nc">mut</span><span class="w"> </span><span class="p">[</span><span class="kt">i32</span><span class="p">])</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="k">if</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">&lt;=</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">return</span><span class="p">;</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">pivot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">len</span><span class="p">()</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="mi">2</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">partition</span><span class="p">(</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="n">pivot</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">split_at_mut</span><span class="p">(</span><span class="n">index</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">_guard1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">thread</span>::<span class="n">scoped</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quicksort</span><span class="p">(</span><span class="n">left</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">_guard2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">thread</span>::<span class="n">scoped</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quicksort</span><span class="p">(</span><span class="n">right</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>I want to draw attention to one snippet of code at the end:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">split_at_mut</span><span class="p">(</span><span class="n">index</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">_guard1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">thread</span>::<span class="n">scoped</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quicksort</span><span class="p">(</span><span class="n">left</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="kd">let</span><span class="w"> </span><span class="n">_guard2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">thread</span>::<span class="n">scoped</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quicksort</span><span class="p">(</span><span class="n">right</span><span class="p">));</span><span class="w">
</span></span></span></code></pre></div><p>Notice that we have to make dummy variables like <code>_guard1</code> and
<code>_guard2</code>. If we left those variables off, then each thread would be
immediately joined, which means we wouldn&rsquo;t get any actual
parallelism. What&rsquo;s worse, the code would still work, it would just
run sequentially. The need for these dummy variables, and the
resulting lack of clarity about just when parallel threads will be
joined, is a direct result of using RAII here.</p>
<p>Compare that code above to using a closure-based API:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="w">  </span><span class="n">thread</span>::<span class="n">scope</span><span class="p">(</span><span class="o">|</span><span class="n">scope</span><span class="o">|</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="kd">let</span><span class="w"> </span><span class="p">(</span><span class="n">left</span><span class="p">,</span><span class="w"> </span><span class="n">right</span><span class="p">)</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">data</span><span class="p">.</span><span class="n">split_at_mut</span><span class="p">(</span><span class="n">index</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">scope</span><span class="p">.</span><span class="n">spawn</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quicksort</span><span class="p">(</span><span class="n">left</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">scope</span><span class="p">.</span><span class="n">spawn</span><span class="p">(</span><span class="o">||</span><span class="w"> </span><span class="n">quicksort</span><span class="p">(</span><span class="n">right</span><span class="p">));</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="p">});</span><span class="w">
</span></span></span></code></pre></div><p>I think it&rsquo;s much clearer. Moreover, the closure-based API opens the
door to other methods that could be used with <code>scope</code>, like
convenience methods to do parallel maps and so forth.</p>
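<p>For readers who want to run the closure-based version, here is a complete sketch written against today&rsquo;s <code>std::thread::scope</code>. Two things are assumptions on my part: the <code>partition</code> helper (a simple last-element scheme, with a different signature from the one used in the snippets above) and the choice to split the slice before opening the scope so that each half can be moved into its thread:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::thread;

// Illustrative helper: uses the last element as the pivot, moves smaller
// elements to the front, and returns the pivot's final index.
fn partition(data: &amp;mut [i32]) -&gt; usize {
    let last = data.len() - 1;
    let pivot = data[last];
    let mut store = 0;
    for i in 0..last {
        if data[i] &lt; pivot {
            data.swap(i, store);
            store += 1;
        }
    }
    data.swap(store, last);
    store
}

fn quicksort(data: &amp;mut [i32]) {
    if data.len() &lt;= 1 {
        return;
    }
    let index = partition(data);
    let (left, right) = data.split_at_mut(index);
    // `right[0]` is the pivot, which is already in its final position.
    let right = &amp;mut right[1..];
    thread::scope(|scope| {
        // No dummy guard variables: both threads are joined when the
        // closure returns, and not before.
        scope.spawn(move || quicksort(left));
        scope.spawn(move || quicksort(right));
    });
}

fn main() {
    let mut data = [5, 3, 8, 1, 9, 2, 7];
    quicksort(&amp;mut data);
    println!("{:?}", data);
}
</code></pre></div>
<p>Spawning a thread per partition is far too eager for real use; the sketch is only meant to show where the threads start and where they are joined.</p>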
<h3 id="section-3-can-we-prevent-some-resource-leaks">Section 3. Can we prevent (some) resource leaks?</h3>
<p>Ok, so in the previous two sections, I summarized the problem and
discussed the impact of resource leaks on Rust. But what if we could
avoid resource leaks in the first place? There have been two RFCs on
this topic: <a href="https://github.com/rust-lang/rfcs/pull/1085">RFC 1085</a> and <a href="https://github.com/rust-lang/rfcs/pull/1094">RFC 1094</a>.</p>
<p>The two RFCs are quite different in the details, but share a common
theme. The idea is not to avoid all resource leaks altogether; I think
everyone recognizes that this is not practical. Instead, the goal is
to try and divide types into two groups: those that can be safely
leaked, and those that cannot. You then limit the <code>Rc</code> and <code>Arc</code> types
so that they can only be used with types that can safely be leaked.</p>
<p>This approach seems simple but it has deep ramifications. It means
that <code>Rc</code> and <code>Arc</code> are no longer fully general container
types. Generic code that wishes to operate on data of all types
(meaning both types that can and cannot leak) can&rsquo;t use <code>Rc</code> or <code>Arc</code>
internally, at least not without some hard choices.</p>
<p>Rust already has a lot of precedent for categorizing types. For
example, we use a trait <code>Send</code> to designate &ldquo;types that can safely be
transferred to other threads&rdquo;. In some sense, dividing types into
leak-safe and not-leak-safe is analogous. But my experience has been
that every time we draw a fundamental distinction like that, it
carries a high price. This distinction &ldquo;bubbles up&rdquo; through APIs and
affects decisions at all levels. In fact, we&rsquo;ve been talking about one
case of this rippling effect throughout this post &ndash; the fact that we
have two reference-counting types, one atomic (<code>Arc</code>) and one not
(<code>Rc</code>), is precisely because we want to distinguish thread-safe and
non-thread-safe operations, so that we can get better performance when
thread safety is not needed.</p>
<p><strong>What this says to me is that we should be very careful when
introducing blanket type distinctions.</strong> The places where we use this
mechanism today &ndash; thread-safety, copyability &ndash; are fundamental to
the language, and very important concepts, and I think they carry
their weight. Ultimately, I don&rsquo;t think resource leaks quite fit the
bill. But let me dive into the RFCs in question and try to explain
why.</p>
<h4 id="rfc-1085--the-leak-trait">RFC 1085 &ndash; the Leak trait</h4>
<p>The first of the two RFCs is <a href="https://github.com/rust-lang/rfcs/pull/1085">RFC 1085</a>. This RFC introduces a
trait called <code>Leak</code>, which operates exactly like the existing <code>Send</code>
trait. It indicates &ldquo;leak-safe&rdquo; data. Like <code>Send</code>, it is implemented
by default.  If you wish to make leaks impossible for a type, you can
explicitly opt out with a negative impl like <code>impl !Leak for MyType</code>.
When you create a <code>Rc&lt;T&gt;</code> or <code>Arc&lt;T&gt;</code>, either <code>T: Leak</code> must hold, or
else you must use an unsafe constructor to certify that you will not
create a reference cycle.</p>
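<p>Auto traits and negative impls are not available in stable Rust, so the rule can only be approximated there. The sketch below is such an approximation, not the RFC&rsquo;s design: <code>Leak</code> is an ordinary marker trait implemented by hand, and <code>LeakCheckedRc</code>, <code>new_unchecked</code>, and <code>JoinGuard</code> are hypothetical names chosen for this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust">use std::rc::Rc;

// Hypothetical stand-in for the RFC's `Leak` trait. In the RFC it would be
// implemented automatically; here it must be implemented by hand.
trait Leak {}
impl Leak for u32 {}
impl Leak for String {}

// A type that "opts out" simply by not implementing `Leak`.
struct JoinGuard;

// Hypothetical wrapper that mimics the proposed restriction on `Rc`.
struct LeakCheckedRc&lt;T&gt;(Rc&lt;T&gt;);

impl&lt;T&gt; LeakCheckedRc&lt;T&gt; {
    // Safe constructor: only available for leak-safe types.
    fn new(value: T) -&gt; Self
    where
        T: Leak,
    {
        LeakCheckedRc(Rc::new(value))
    }

    // Unsafe escape hatch: the caller promises never to create a cycle.
    unsafe fn new_unchecked(value: T) -&gt; Self {
        LeakCheckedRc(Rc::new(value))
    }

    fn get(&amp;self) -&gt; &amp;T {
        &amp;self.0
    }
}

fn main() {
    let number = LeakCheckedRc::new(7u32);
    println!("{}", number.get());

    // `LeakCheckedRc::new(JoinGuard)` would not compile, because
    // `JoinGuard` does not implement `Leak`.
    let _guard = unsafe { LeakCheckedRc::new_unchecked(JoinGuard) };
}
</code></pre></div>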
<p>The fact that <code>Leak</code> is automatically implemented promises to make it
mostly invisible. Indeed, in the prototype that <a href="https://github.com/reem/">Jonathan Reem</a>
implemented, he found relatively little fallout in the standard
library and compiler. While encouraging, I still think we&rsquo;re going to
encounter problems of composability over time.</p>
<p>There are a couple of scenarios where the <code>Leak</code> trait will, well,
leak into APIs where it doesn&rsquo;t seem to belong. One of the most
obvious is trait objects. Imagine I am writing a serialization
library, and I have a <code>Serializer</code> type that combines an output stream
(a <code>Box&lt;Writer&gt;</code>) along with some serialization state:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Serializer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">output_stream</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Writer</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">serialization_state</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>So far so good. Now someone else comes along and would like to use my
library. They want to put this <code>Serializer</code> into a reference counted
box that is shared amongst many users, so they try to make a
<code>Rc&lt;Serializer&gt;</code>. Unfortunately, this won&rsquo;t work. This seems somewhat
surprising, since weren&rsquo;t all types supposed to be <code>Leak</code> by
default?</p>
<p>The problem lies in the <code>Box&lt;Writer&gt;</code> object &ndash; a trait object is designed
to hide the precise type of <code>Writer</code> that we are working with. That
means that we don&rsquo;t know whether this particular <code>Writer</code> implements
<code>Leak</code> or not. For this client to be able to place <code>Serializer</code> into
an <code>Rc</code>, there are two choices. The client can use <code>unsafe</code> code, or
I, the library author, can modify my <code>Serializer</code> definition as
follows:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Serializer</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">output_stream</span>: <span class="nb">Box</span><span class="o">&lt;</span><span class="n">Writer</span><span class="o">+</span><span class="n">Leak</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="n">serialization_state</span>: <span class="kt">u32</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is what I mean by <code>Leak</code> &ldquo;bubbling up&rdquo;. It&rsquo;s already the case
that I, as a library author, want to think about whether my types can
be used across threads and try to enable that. Under this proposal, I
also have to think about whether my types should be usable in <code>Rc</code>,
and so forth.</p>
<p>Now, if you avoid trait objects, the problem is smaller. One advantage
of generics is that they don&rsquo;t encapsulate what type of writer you are
using and so forth, which means that the compiler can analyze the type
to see whether it is thread-safe or leak-safe or whatever. Until now
we&rsquo;ve found that many libraries avoid trait objects partly for this
reason, and I think that&rsquo;s good practice in simple cases. But as things scale up,
encapsulation is a really useful mechanism for simplifying type annotations and
making programs concise and easy to work with.</p>
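<p>The same contrast can be observed in today&rsquo;s Rust with <code>Send</code>, which plays the role that <code>Leak</code> would play under the proposal. What follows is only a sketch: <code>GenericSerializer</code>, <code>BoxedSerializer</code>, and <code>assert_send</code> are names invented for illustration, and <code>std::io::Write</code> stands in for the <code>Writer</code> trait used above.</p>

```rust
use std::io::Write;

// Generic version: the compiler sees the concrete writer type, so auto
// traits such as `Send` are derived from `W` without any annotation.
pub struct GenericSerializer<W: Write> {
    pub output_stream: W,
    pub serialization_state: u32,
}

// Trait-object version: the concrete writer type is hidden, so the bound
// must be written out, or the struct is not `Send`.
pub struct BoxedSerializer {
    pub output_stream: Box<dyn Write + Send>,
    pub serialization_state: u32,
}

// Hypothetical helper: instantiating it compiles only if `T: Send`.
fn assert_send<T: Send>() {}

pub fn assert_both_are_send() {
    assert_send::<GenericSerializer<Vec<u8>>>();
    assert_send::<BoxedSerializer>();
}
```

<p>Removing <code>+ Send</code> from the boxed field makes the second assertion fail to compile, while the generic struct needs no change. That is the &ldquo;bubbling up&rdquo; in miniature.</p>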
<p>There is one other point. <a href="https://github.com/rust-lang/rfcs/pull/1085">RFC 1085</a> also includes an unsafe
constructor for <code>Rc</code>, which in principle allows you to continue using
<code>Rc</code> with any type, so long as you are in a position to assert that no
cycles exist. But I feel like this puts the burden of unsafety into
the wrong place. I think you should be able to construct
reference-counted boxes, and truly generic abstractions built on
reference-counted boxes, without writing unsafe code.</p>
<p>My allergic reaction to requiring <code>unsafe</code> to create <code>Rc</code> boxes stems
from a very practical concern: if we push the boundaries of unsafety
too far out, such that it is common to use an unsafe keyword here and
there, we vastly weaken the safety guarantees of Rust <em>in
practice</em>. I&rsquo;d rather that we increase the power of safe APIs at the
cost of more restrictions on unsafe code. Obviously, there is a
tradeoff in the other direction, because if the requirements on unsafe
code become too subtle, people are bound to make mistakes there too,
but my feeling is that requiring people to consider leaks doesn&rsquo;t
cross that line yet.</p>
<h4 id="rfc-1094--avoiding-reference-leaks">RFC 1094 &ndash; avoiding reference leaks</h4>
<p><a href="https://github.com/rust-lang/rfcs/pull/1094">RFC 1094</a> takes a different tack. Rather than dividing types
arbitrarily into leak-safe and not-leak-safe, it uses an existing
distinction, and says that any type which is associated with a scope
cannot leak.</p>
<p>The goal of <a href="https://github.com/rust-lang/rfcs/pull/1094">RFC 1094</a> is to enable a particular &ldquo;mental model&rdquo;
about what lifetimes mean. Specifically, the RFC aims to ensure that
if a value is limited to a particular scope <code>'a</code>, then the value will
be destroyed before the program exits the scope <code>'a</code>. This is very
similar to what Rust currently guarantees, but stronger: in current
Rust, there is no guarantee that your value will be destroyed, there
is only a guarantee that it will not be accessed outside that
scope. Concretely, if you leak an <code>Rc</code> into the heap today, that <code>Rc</code>
may contain borrowed references, and those references could be invalid
&ndash; but it doesn&rsquo;t matter, because Rust guarantees that you could never
use them.</p>
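<p>To make the status quo concrete, here is a small sketch of how a cycle of <code>Rc</code> values leaks in safe code, so that destructors never run. The <code>Node</code> type and the <code>drops_after_release</code> helper are invented for this illustration; the example holds no borrowed data and only counts destructor calls.</p>

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// A node that records each run of its destructor in a shared counter.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    drops: Rc<Cell<u32>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Builds two nodes, optionally links them into a cycle, lets the local
// handles go out of scope, and reports how many destructors ran.
fn drops_after_release(make_cycle: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Rc::new(Node { next: RefCell::new(None), drops: drops.clone() });
        let b = Rc::new(Node { next: RefCell::new(Some(a.clone())), drops: drops.clone() });
        if make_cycle {
            // a -> b -> a: each node keeps the other's count above zero.
            *a.next.borrow_mut() = Some(b.clone());
        }
    } // the local handles `a` and `b` are released here
    drops.get()
}
```

<p>Without the cycle both destructors run once the handles are released. With the cycle neither runs, and no <code>unsafe</code> code is involved.</p>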
<p>In order to guarantee that borrowed data is never leaked,
<a href="https://github.com/rust-lang/rfcs/pull/1094">RFC 1094</a> requires that to construct a <code>Rc&lt;T&gt;</code> (or <code>Arc&lt;T&gt;</code>),
the condition <code>T: 'static</code> must hold. In other words, the payload of a
reference-counted box cannot contain borrowed data. This by itself is
very limiting: lots of code, including the rust compiler, puts
borrowed pointers into reference-counted structures. To help with
this, the RFC includes a second type of reference-counted box,
<code>ScopedRc</code>. To use a <code>ScopedRc</code>, you must first create a
reference-counting scope <code>s</code>. You can then create new <code>ScopedRc</code>
instances associated with <code>s</code>. These <code>ScopedRc</code> instances carry their
own reference count, and so they will be freed normally as soon as
that count drops to zero. But if they should get placed into a cycle,
then when the scope <code>s</code> is dropped, it will go along and &ldquo;cycle
collect&rdquo;, meaning that it runs the destructor for any <code>ScopedRc</code>
instances that haven&rsquo;t already been freed. (Interestingly, this is
very similar to the closure-based scoped thread API, but instead of
joining threads, exiting the scope reaps cycles.)</p>
<p>I originally found this RFC appealing. It felt to me that it avoided
adding a new distinction (<code>Leak</code>) to the type system and instead
piggybacked on an existing one (borrowed vs non-borrowed). This seems
to help with some of my concerns about &ldquo;ripple effects&rdquo; on users.</p>
<p><strong>However, even though it piggybacks on an existing distinction
(borrowed vs static), the RFC now gives that distinction additional
semantics it didn&rsquo;t have before.</strong> Today, those two categories can be
considered on a single continuum: for all types, there is some
bounding scope (which may be <code>'static</code>), and the compiler ensures that
all accesses to that data occur within that scope. Under RFC 1094,
there is a discontinuity. Data which is bounded by <code>'static</code> is
different, because it may leak.</p>
<p>This discontinuity is precisely why we have to split the type <code>Rc</code>
into two types, <code>Rc</code> and <code>ScopedRc</code>. In fact, the RFC doesn&rsquo;t really
mention <code>Arc</code> much, but presumably there will have to be both
<code>ScopedRc</code> and <code>ScopedArc</code> types. So now where we had only two
types, we have four, to account for this new axis:</p>
<pre tabindex="0"><code>|-----------------++--------+-----------|
|                 || Static | Borrowed  |
|-----------------++--------+-----------|
| Thread-safe     || Arc    | ScopedArc |
| Not-thread-safe || Rc     | ScopedRc  |
|-----------------++--------+-----------|
</code></pre><p>And, in fact, the distinction doesn&rsquo;t end here. There are
abstractions, such as channels, that are built on <code>Arc</code>. So this means
that this same categorization will bubble up through those
abstractions, and we will (presumably) wind up with <code>Channel</code> and
<code>ChannelScoped</code> (otherwise, channels cannot be used to send borrowed
data to scoped threads, which is a severe limitation).</p>
<h3 id="section-4-conclusion">Section 4. Conclusion.</h3>
<p>This concludes my deep dive into the question of resource leaks. It
seems to me that the tradeoffs here are not simple. The status quo,
where resource leaks are permitted, helps to ensure composability by
allowing <code>Rc</code> and <code>Arc</code> to be used uniformly on all types. I think
this is very important as these types are vital building blocks.</p>
<p>On a historical note, I am particularly sensitive to concerns of
composability. Early versions of Rust, and in particular the borrow
checker before we adopted the current semantics, were rife with
composability problems. This made writing code very annoying &ndash; you
were frequently refactoring APIs in small ways to account for this.</p>
<p>However, this composability does come at the cost of a useful RAII
pattern. Without leaks, you&rsquo;d be able to use RAII to build references
that reliably execute code when they are dropped, which in turn allows
RAII-like techniques to be used more uniformly across all safe APIs.</p>
<p>This is certainly a subtle issue, and one where reasonable folk can
disagree. In the process of drafting (and redrafting&hellip;) this post, my
own opinion has shifted back and forth as well. But ultimately I have
landed where I started: <strong>the danger and pain of bifurcating the space
of types far outweighs the loss of this particular RAII idiom.</strong></p>
<p>Here are the two most salient points to me:</p>
<ol>
<li>The vast majority of RAII-based APIs are either safe or can be made
safe with small changes. The remainder can be expressed with
closures.
<ul>
<li>With regard to RAII, the scoped threads API represents something
of a &ldquo;worst case&rdquo; scenario, since the guard object is completely
divorced from the data that the thread will access.</li>
<li>In cases like this, where there is often no <em>need</em> to retain the
guard, but dropping it has important side-effects, RAII can be a
footgun and hence is arguably a poor fit anyhow.</li>
</ul>
</li>
<li>The cost of introducing a new fundamental distinction (&ldquo;leak-safe&rdquo;
vs &ldquo;non-leak-safe&rdquo;) into our type system is very high and will be
felt up and down the stack.  This cannot be completely hidden or
abstracted away.
<ul>
<li>This is similar to thread safety, but leak-safety is far less fundamental.</li>
</ul>
</li>
</ol>
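<p>As an illustration of the closure-based alternative mentioned in the first point, here is a sketch using <code>std::thread::scope</code>, a later addition to the standard library. The function name <code>parallel_sum</code> is made up for this example.</p>

```rust
use std::thread;

// Sums the two halves of `data` on two threads that borrow it directly.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    // `thread::scope` does not return until every thread spawned inside the
    // closure has finished, so there is no guard object that could be leaked.
    thread::scope(|s| {
        let l = s.spawn(|| left.iter().sum::<u64>());
        let r = s.spawn(|| right.iter().sum::<u64>());
        l.join().unwrap() + r.join().unwrap()
    })
}
```

<p>The borrowed slices stay valid because the join happens when the closure returns, not when some guard value is dropped.</p>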
<p>Bottom line: the cure is worse than the disease.</p>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Modeling graphs in Rust using vector indices</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/04/06/modeling-graphs-in-rust-using-vector-indices/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/04/06/modeling-graphs-in-rust-using-vector-indices/</id><published>2015-04-06T00:00:00+00:00</published><updated>2015-04-06T14:58:37-04:00</updated><content type="html"><![CDATA[<p>After reading <a href="http://featherweightmusings.blogspot.com/2015/04/graphs-in-rust.html">nrc&rsquo;s blog post about graphs</a>, I felt inspired to
write up an alternative way to code graphs in Rust, based on vectors
and indices. This encoding has certain advantages over using <code>Rc</code>
and <code>RefCell</code>; in particular, I think it&rsquo;s a closer fit to Rust&rsquo;s
ownership model. (Of course, it has disadvantages too.)</p>
<p>I&rsquo;m going to describe a simplified version of the strategy that rustc
uses internally. The <a href="https://github.com/rust-lang/rust/blob/master/src/librustc/middle/graph.rs">actual code in Rustc</a> is written in a
somewhat dated &ldquo;Rust dialect&rdquo;. I&rsquo;ve also put the sources to this blog
post in their <a href="https://github.com/nikomatsakis/simple-graph">own GitHub repository</a>. At some point, presumably
when I come up with a snazzy name, I&rsquo;ll probably put an extended
version of this library up on crates.io. Anyway, the code I cover in
this blog post is pared down to the bare essentials, and so it doesn&rsquo;t
support (e.g.) enumerating incoming edges to a node, or attaching
arbitrary data to nodes/edges, etc. It would be easy to extend it to
support that sort of thing, however.</p>
<!-- more -->
<h3 id="the-high-level-idea">The high-level idea</h3>
<p>The high-level idea is that we will represent a &ldquo;pointer&rdquo; to a node or
edge using an index. A graph consists of a vector of nodes and a
vector of edges (much like the mathematical description <code>G=(V,E)</code> that
you often see):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Graph</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">nodes</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">NodeData</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">edges</span>: <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">EdgeData</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Each node is identified by an index. In this version, indices are just
plain <code>usize</code> values. In the real code, I prefer a struct wrapper just
to give a bit more type safety.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">type</span> <span class="nc">NodeIndex</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">usize</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">NodeData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">first_outgoing_edge</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">EdgeIndex</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Each node just contains an optional edge index, which is the start of
a linked list of outgoing edges. Each edge is described by the
following structure:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">type</span> <span class="nc">EdgeIndex</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kt">usize</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">EdgeData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">target</span>: <span class="nc">NodeIndex</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">next_outgoing_edge</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">EdgeIndex</span><span class="o">&gt;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>As you can see, an edge contains a target node index and an optional
index for the next outgoing edge. All edges in a particular linked
list share the same source, which is implicit. Thus there is a linked
list of outgoing edges for each node that begins in the node data for
the source and is threaded through each of the edge datas.</p>
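<p>For comparison, here is roughly what the struct-wrapper variant of the indices might look like. This is a sketch rather than the code rustc uses; the derives and the <code>first_edge</code> method are assumptions added for illustration.</p>

```rust
// Newtype wrappers instead of plain `usize` aliases (illustrative only).
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct NodeIndex(usize);

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct EdgeIndex(usize);

pub struct NodeData {
    first_outgoing_edge: Option<EdgeIndex>,
}

pub struct EdgeData {
    target: NodeIndex,
    next_outgoing_edge: Option<EdgeIndex>,
}

pub struct Graph {
    nodes: Vec<NodeData>,
    edges: Vec<EdgeData>,
}

impl Graph {
    // Only a `NodeIndex` is accepted here; passing an `EdgeIndex` is a
    // compile-time type error rather than a silent mix-up.
    pub fn first_edge(&self, node: NodeIndex) -> Option<EdgeIndex> {
        self.nodes[node.0].first_outgoing_edge
    }
}
```

<p>With plain <code>usize</code> aliases, nothing stops an edge index from being used to index the node vector. The wrappers turn that mistake into a type error at the cost of writing <code>.0</code> when indexing.</p>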
<p>The entire structure is shown in this diagram, which depicts a simple
example graph and the various data structures. Node indices are
indicated by a number like <code>N3</code> and edge indices by a number like
<code>E2</code>. The fields of each <code>NodeData</code> and <code>EdgeData</code> are shown.</p>
<pre tabindex="0"><code>Graph:
    N0 ---E0---&gt; N1 ---E1---&gt; N2
    |                         ^
    E2                        |
    |                         |
    v                         |
    N3 ----------E3-----------+
    
Nodes (NodeData):
  N0 { Some(E0) }     
  N1 { Some(E1) }
  N2 { None     } 
  N3 { Some(E3) } 
  
Edges:
  E0 { N1, Some(E2) }
  E1 { N2, None     }
  E2 { N3, None     }
  E3 { N2, None     }
</code></pre><h3 id="growing-the-graph">Growing the graph</h3>
<p>Writing methods to grow the graph is pretty straightforward. For
example, here is the routine to add a new node:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Graph</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">add_node</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">NodeIndex</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">nodes</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">nodes</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">NodeData</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">first_outgoing_edge</span>: <span class="nb">None</span> <span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">index</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This routine will add an edge between two nodes (for simplicity, we
don&rsquo;t bother to check for duplicates):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Graph</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">add_edge</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">source</span>: <span class="nc">NodeIndex</span><span class="p">,</span><span class="w"> </span><span class="n">target</span>: <span class="nc">NodeIndex</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">edge_index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">edges</span><span class="p">.</span><span class="n">len</span><span class="p">();</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">node_data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">nodes</span><span class="p">[</span><span class="n">source</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="bp">self</span><span class="p">.</span><span class="n">edges</span><span class="p">.</span><span class="n">push</span><span class="p">(</span><span class="n">EdgeData</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">target</span>: <span class="nc">target</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="n">next_outgoing_edge</span>: <span class="nc">node_data</span><span class="p">.</span><span class="n">first_outgoing_edge</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">});</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">node_data</span><span class="p">.</span><span class="n">first_outgoing_edge</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">Some</span><span class="p">(</span><span class="n">edge_index</span><span class="p">);</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Finally, we can write an iterator to enumerate the successors of a
given node, which just walks down the linked list:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Graph</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">successors</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">source</span>: <span class="nc">NodeIndex</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nc">Successors</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="kd">let</span><span class="w"> </span><span class="n">first_outgoing_edge</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">nodes</span><span class="p">[</span><span class="n">source</span><span class="p">].</span><span class="n">first_outgoing_edge</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="n">Successors</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">graph</span>: <span class="nc">self</span><span class="p">,</span><span class="w"> </span><span class="n">current_edge_index</span>: <span class="nc">first_outgoing_edge</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">pub</span><span class="w"> </span><span class="k">struct</span> <span class="nc">Successors</span><span class="o">&lt;</span><span class="na">&#39;graph</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">graph</span>: <span class="kp">&amp;</span><span class="na">&#39;graph</span> <span class="nc">Graph</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="n">current_edge_index</span>: <span class="nb">Option</span><span class="o">&lt;</span><span class="n">EdgeIndex</span><span class="o">&gt;</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;graph</span><span class="o">&gt;</span><span class="w"> </span><span class="nb">Iterator</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Successors</span><span class="o">&lt;</span><span class="na">&#39;graph</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Item</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">NodeIndex</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">fn</span> <span class="nf">next</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span>-&gt; <span class="nb">Option</span><span class="o">&lt;</span><span class="n">NodeIndex</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="k">match</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">current_edge_index</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">None</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="nb">None</span><span class="p">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="nb">Some</span><span class="p">(</span><span class="n">edge_num</span><span class="p">)</span><span class="w"> </span><span class="o">=&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="kd">let</span><span class="w"> </span><span class="n">edge</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">&amp;</span><span class="bp">self</span><span class="p">.</span><span class="n">graph</span><span class="p">.</span><span class="n">edges</span><span class="p">[</span><span class="n">edge_num</span><span class="p">];</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="bp">self</span><span class="p">.</span><span class="n">current_edge_index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">edge</span><span class="p">.</span><span class="n">next_outgoing_edge</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">                </span><span class="nb">Some</span><span class="p">(</span><span class="n">edge</span><span class="p">.</span><span class="n">target</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><h3 id="advantages">Advantages</h3>
<p>This approach plays very well to Rust&rsquo;s strengths. This is because,
unlike an <code>Rc</code> pointer, an index alone is not enough to mutate the
graph: you must use one of the <code>&amp;mut self</code> methods in the graph. This
means that Rust can track the mutability of the graph as a whole in the
same way that it tracks the mutability of any other data structure.</p>
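<p>To see this in action, the sketch below gathers the definitions from this post into one compilable unit and builds the graph from the diagram. Two things are assumptions of the sketch and not part of the code above: <code>successors</code> returns a <code>Vec</code> instead of an iterator to keep it short, and the <code>example</code> function is a made-up helper.</p>

```rust
pub type NodeIndex = usize;
pub type EdgeIndex = usize;

pub struct NodeData {
    first_outgoing_edge: Option<EdgeIndex>,
}

pub struct EdgeData {
    target: NodeIndex,
    next_outgoing_edge: Option<EdgeIndex>,
}

#[derive(Default)]
pub struct Graph {
    nodes: Vec<NodeData>,
    edges: Vec<EdgeData>,
}

impl Graph {
    pub fn add_node(&mut self) -> NodeIndex {
        let index = self.nodes.len();
        self.nodes.push(NodeData { first_outgoing_edge: None });
        index
    }

    pub fn add_edge(&mut self, source: NodeIndex, target: NodeIndex) {
        let edge_index = self.edges.len();
        let node_data = &mut self.nodes[source];
        // The new edge becomes the head of the source node's list.
        self.edges.push(EdgeData {
            target,
            next_outgoing_edge: node_data.first_outgoing_edge,
        });
        node_data.first_outgoing_edge = Some(edge_index);
    }

    // Simplification: collect the successors instead of returning an iterator.
    pub fn successors(&self, source: NodeIndex) -> Vec<NodeIndex> {
        let mut out = Vec::new();
        let mut current = self.nodes[source].first_outgoing_edge;
        while let Some(edge) = current {
            out.push(self.edges[edge].target);
            current = self.edges[edge].next_outgoing_edge;
        }
        out
    }
}

// Builds the four-node graph from the diagram. The indices are plain values,
// so they can be held across later `&mut self` calls without borrowing `g`.
pub fn example() -> (Graph, [NodeIndex; 4]) {
    let mut g = Graph::default();
    let n0 = g.add_node();
    let n1 = g.add_node();
    let n2 = g.add_node();
    let n3 = g.add_node();
    g.add_edge(n0, n1); // E0
    g.add_edge(n1, n2); // E1
    g.add_edge(n0, n3); // E2
    g.add_edge(n3, n2); // E3
    (g, [n0, n1, n2, n3])
}
```

<p>Because <code>add_edge</code> pushes each new edge onto the front of the list, the successors of <code>N0</code> come back as <code>N3</code> and then <code>N1</code>, the reverse of insertion order. The linked lists built this way therefore differ in link order from the hand-written tables above, though the set of edges is the same.</p>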
<p>As a consequence, graphs implemented this way can easily be sent
between threads and used in data-parallel code (any graph shared
across multiple threads will be temporarily frozen while the threads
are active). Similarly, you are statically prevented from modifying
the graph while iterating over it, which is often desirable. If we
were to use <code>Rc</code> nodes with <code>RefCell</code>, this would not be possible &ndash;
we&rsquo;d need locks, which feels like overkill.</p>
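<p>Here is a rough sketch of the data-parallel case, again using <code>std::thread::scope</code> from the current standard library. The helper names <code>out_degree</code> and <code>total_out_degree</code> are invented for this example, and the split into two threads is arbitrary.</p>

```rust
use std::thread;

pub struct NodeData {
    first_outgoing_edge: Option<usize>,
}

pub struct EdgeData {
    target: usize,
    next_outgoing_edge: Option<usize>,
}

pub struct Graph {
    nodes: Vec<NodeData>,
    edges: Vec<EdgeData>,
}

impl Graph {
    // Walks one node's linked list of outgoing edges and counts them.
    fn out_degree(&self, node: usize) -> usize {
        let mut count = 0;
        let mut current = self.nodes[node].first_outgoing_edge;
        while let Some(edge) = current {
            count += 1;
            current = self.edges[edge].next_outgoing_edge;
        }
        count
    }
}

// Two threads read the same graph through a shared reference. While they
// run, the graph is immutably borrowed and so cannot be modified.
pub fn total_out_degree(graph: &Graph) -> usize {
    let mid = graph.nodes.len() / 2;
    thread::scope(|s| {
        let lo = s.spawn(|| (0..mid).map(|n| graph.out_degree(n)).sum::<usize>());
        let hi = s.spawn(|| (mid..graph.nodes.len()).map(|n| graph.out_degree(n)).sum::<usize>());
        lo.join().unwrap() + hi.join().unwrap()
    })
}
```

<p>No locks or reference counts are needed, since the graph contains only vectors of plain indices. Once <code>total_out_degree</code> returns, the caller is free to take <code>&amp;mut</code> access again.</p>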
<p>Another advantage of this approach over the <code>Rc</code> approach is
efficiency: the overall data structure is very compact. There is no
need for a separate allocation for every node, for example (since they
are just pushes into a vector, additions to the graph are O(1),
amortized). In fact, many C libraries that manipulate graphs also use
indices, for this very reason.</p>
<h3 id="disadvantages">Disadvantages</h3>
<p>The primary disadvantage comes about if you try to remove things from
the graph. The problem then is that you must make a choice: either you
reuse the node/edge indices, perhaps by keeping a free list, or else
you leave a placeholder. The former approach leaves you vulnerable to
&ldquo;dangling indices&rdquo;, and the latter is a kind of leak. This is
basically exactly analogous to malloc/free. Another similar problem
arises if you use the index from one graph with another graph (you can
mitigate that with fancy type tricks, but in my experience it&rsquo;s not
really worth the trouble).</p>
<p>However, there are some important qualifiers here:</p>
<ul>
<li>It frequently happens that you don&rsquo;t have to remove nodes or edges
from the graph.  Often you just want to build up a graph and use it
for some analysis and then throw it away. In this case the danger is
much, much less.</li>
<li>The danger of a &ldquo;dangling index&rdquo; is much less than a traditional
dangling pointer. For example, it can&rsquo;t cause memory unsafety.</li>
</ul>
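<p>To illustrate the free-list hazard, here is a sketch of a slot store that reuses indices. <code>Slots</code> and its methods are hypothetical and stand in for the node vector of a graph that supports removal.</p>

```rust
// A vector of optional items plus a free list of reusable indices.
pub struct Slots {
    items: Vec<Option<&'static str>>,
    free: Vec<usize>,
}

impl Slots {
    pub fn new() -> Self {
        Slots { items: Vec::new(), free: Vec::new() }
    }

    // Reuses a freed index when one is available, else grows the vector.
    pub fn insert(&mut self, value: &'static str) -> usize {
        match self.free.pop() {
            Some(index) => {
                self.items[index] = Some(value);
                index
            }
            None => {
                self.items.push(Some(value));
                self.items.len() - 1
            }
        }
    }

    // Empties the slot and remembers its index for reuse.
    pub fn remove(&mut self, index: usize) {
        if self.items[index].take().is_some() {
            self.free.push(index);
        }
    }

    // Out-of-range and emptied slots both yield `None`.
    pub fn get(&self, index: usize) -> Option<&'static str> {
        self.items.get(index).copied().flatten()
    }
}
```

<p>If a caller keeps an index after calling <code>remove</code>, a later <code>insert</code> may hand the same index out again, and the stale index then silently refers to the new item. That is a logic bug, but every access is still bounds-checked, so it cannot corrupt memory.</p>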
<p>Basically I find that this is a <em>theoretical problem</em> but for many use
cases, it&rsquo;s not a <em>practical</em> one.</p>
<p>The big exception would be if a long-lived graph is the heart of your
application. In that case, I&rsquo;d probably go with a <code>Rc</code> (or maybe
<code>Arc</code>) based approach, or perhaps even a hybrid &ndash; that is, use
indices as I&rsquo;ve shown here, but reference count the indices too. This
would preserve the data-parallel advantages.</p>
<h3 id="conclusion">Conclusion</h3>
<p>The key insights in this approach are:</p>
<ul>
<li>indices are often a compact and convenient way to represent complex
data structures;</li>
<li>they play well with multithreaded code and with ownership;</li>
<li>but they also carry some risks, particularly for long-lived data
structures, where there is an increased chance of indices being
misused between data structures or leaked.</li>
</ul>
]]></content><category scheme="https://smallcultfollowing.com/babysteps/categories/rust" term="rust" label="Rust"/></entry><entry><title type="html">Little Orphan Impls</title><link href="https://smallcultfollowing.com/babysteps/blog/2015/01/14/little-orphan-impls/?utm_source=atom_feed" rel="alternate" type="text/html"/><id>https://smallcultfollowing.com/babysteps/blog/2015/01/14/little-orphan-impls/</id><published>2015-01-14T00:00:00+00:00</published><updated>2015-01-14T14:03:45-05:00</updated><content type="html"><![CDATA[<p>We&rsquo;ve recently been doing a lot of work on Rust&rsquo;s <em>orphan rules</em>,
which are an important part of our system for guaranteeing <em>trait
coherence</em>. The idea of trait coherence is that, given a trait and
some set of types for its type parameters, there should be exactly one
impl that applies. So if we think of the trait <code>Show</code>, we want to
guarantee that if we have a trait reference like <code>MyType : Show</code>, we
can uniquely identify a particular impl. (The alternative to coherence
is to have some way for users to identify which impls are in scope at
any time.  It has <a href="https://mail.mozilla.org/pipermail/rust-dev/2011-December/001036.html">its own complications</a>; if you&rsquo;re curious for
more background on why we use coherence, you might find this
<a href="https://mail.mozilla.org/pipermail/rust-dev/2011-December/thread.html#1036">rust-dev thread</a> from a while back to be interesting
reading.)</p>
<p>The role of the <em>orphan rules</em> in particular is basically to prevent
you from implementing <em>external traits for external types</em>. So
continuing our simple example of <code>Show</code>, if you are defining your own
library, you could not implement <code>Show</code> for <code>Vec&lt;T&gt;</code>, because both
<code>Show</code> and <code>Vec</code> are defined in the standard library. But you <em>can</em>
implement <code>Show</code> for <code>MyType</code>, because you defined <code>MyType</code>. However,
if you define your own trait <code>MyTrait</code>, then you can implement
<code>MyTrait</code> for any type you like, including external types like
<code>Vec&lt;T&gt;</code>. To this end, the orphan rule intuitively says &ldquo;either the
trait must be local or the self-type must be local&rdquo;.</p>
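<p>As a rough illustration of that intuition in present-day Rust, the sketch below uses <code>std::fmt::Display</code> as a stand-in for <code>Show</code>. The names <code>MyType</code> and <code>MyTrait</code> are the hypothetical ones from the paragraph above, and the <code>describe</code> method is made up for the example.</p>

```rust
use std::fmt;

// A local type and a local trait (hypothetical names from the text).
struct MyType(u32);

trait MyTrait {
    fn describe(&self) -> String;
}

// Allowed: external trait, local self type.
impl fmt::Display for MyType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "MyType({})", self.0)
    }
}

// Allowed: local trait, external self type.
impl<T> MyTrait for Vec<T> {
    fn describe(&self) -> String {
        format!("a Vec with {} elements", self.len())
    }
}

// Rejected by the orphan rules if uncommented, because both the trait
// and the self type are defined outside this crate:
//
// impl<T> fmt::Display for Vec<T> { /* ... */ }
```

<p>With these impls in place, <code>MyType(7).to_string()</code> and <code>vec![1, 2, 3].describe()</code> both compile, while the commented-out impl would be an error.</p>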
<p>More precisely, the orphan rules are targeting the case of two
&ldquo;cousin&rdquo; crates. By cousins I mean that the crates share a common
ancestor (i.e., they link to a common library crate). This would be
libstd, if nothing else. That ancestor defines some trait. Both of the
crates are implementing this common trait using their own local types
(and possibly types from ancestor crates, which may or may not be in
common). But neither crate is an ancestor of the other: if one were,
the problem would be much easier, because the descendant crate can see the
impls from the ancestor crate.</p>
<p>When we extended the trait system to <a href="http://smallcultfollowing.com/babysteps/blog/2014/09/30/multi-and-conditional-dispatch-in-traits/">support</a>
<a href="https://github.com/rust-lang/rfcs/blob/master/text/0195-associated-items.md">multidispatch</a>, I confess that I originally didn&rsquo;t give the
orphan rules much thought. It seemed like it would be straightforward
to adapt them. Boy was I wrong! (And, I think, our original rules were
kind of unsound to begin with.)</p>
<p>The purpose of this post is to lay out the current state of my
thinking on these rules. It sketches out a number of variations and
possible rules and tries to elaborate on the limitations of each
one. It is intended to serve as the seed for a discussion in the
<a href="http://discuss.rust-lang.org/t/orphan-rules/1322">Rust discussion forums</a>.</p>
<!-- more -->
<h3 id="the-first-totally-wrong-attempt">The first, totally wrong, attempt</h3>
<p>The first attempt at the orphan rules was just to say that an impl is
legal if a local type appears somewhere. So, for example, suppose that I
define a type <code>MyBigInt</code> and I want to make it addable to integers:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyBigInt</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="n">MyBigInt</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Under these rules, these two impls are perfectly legal, because
<code>MyBigInt</code> is local to the current crate. However, the rules also
permit an impl like this one:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyBigInt</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now the problems arise because those same rules <em>also</em> permit an impl
like this one (in another crate):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="n">YourBigInt</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Now we have a problem because both impls are applicable to
<code>Add&lt;YourBigInt&gt; for MyBigInt</code>.</p>
<p>In fact, we don&rsquo;t need multidispatch to have this problem. The same
situation can arise with <code>Show</code> and tuples:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Show</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">MyBigInt</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// Crate A
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Show</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">YourBigInt</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// Crate B
</span></span></span></code></pre></div><p>(In fact, multidispatch is really nothing more than a compiler-supported
version of implementing a trait for a tuple.)</p>
<p>The root of the problem here lies in our definition of &ldquo;local&rdquo;, which
completely ignored type parameters. Because type parameters can be
instantiated to arbitrary types, they are obviously special, and must
be considered carefully.</p>
<h3 id="the-ordered-rule">The ordered rule</h3>
<p>This problem was first brought to our attention by <a href="https://github.com/arielb1">arielb1</a>, who
filed <a href="https://github.com/rust-lang/rust/issues/19470">Issue 19470</a>. To resolve it, he proposed a rule that I
will call the <em>ordered rule</em>. The ordered rule goes like this:</p>
<ol>
<li>Write out all the type parameters to the trait, starting with <code>Self</code>.</li>
<li>The name of some local struct or enum must appear on that line before the first
type parameter.
<ul>
<li><em>More formally:</em> When visiting the types in pre-order, a local type must be visited
before any type parameter.</li>
</ul>
</li>
</ol>
<p>In terms of the examples I gave above, this rule permits the following impls:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="kt">i32</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyBigInt</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="n">MyBigInt</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="kt">i32</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyBigInt</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>However, it avoids the quandary we saw before because it rejects this impl:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Add</span><span class="o">&lt;</span><span class="n">YourBigInt</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="p">.</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is because, if we wrote out the type parameters in a list, we would get:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="n">T</span><span class="p">,</span><span class="w"> </span><span class="n">YourBigInt</span><span class="w">
</span></span></span></code></pre></div><p>and, as you can see, <code>T</code> comes first.</p>
<p>This rule is actually pretty good. It meets most of the requirements
I&rsquo;m going to unearth.  But it has some problems. The first is that it
feels strange; it feels like you should be able to reorder the type
parameters on a trait without breaking everything (we will see that
this is not, in fact, obviously true, but it was certainly my first
reaction).</p>
<p>Another problem is that the rule is kind of fragile. It can easily
reject impls that don&rsquo;t seem particularly different from impls that it
accepts. For example, consider the case of the <a href="https://github.com/reem/rust-modifier"><code>Modifier</code> trait</a>
that is used in hyper and iron. As you can see in <a href="https://github.com/rust-lang/rust/issues/20974">this issue</a>,
iron wants to be able to define a <code>Modifier</code> impl like the following:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Response</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="w"> </span><span class="n">Modifier</span><span class="o">&lt;</span><span class="n">Response</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="kt">u8</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This impl is accepted by the ordered rule (there are no type parameters at all,
in fact). However, the following impl, which seems very similar and equally
likely (in the abstract), would <em>not</em> be accepted:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">Response</span><span class="p">;</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Modifier</span><span class="o">&lt;</span><span class="n">Response</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>This is because the type parameter <code>T</code> appears before the local type
(<code>Response</code>). Hmm. It doesn&rsquo;t really matter if <code>T</code> appears in the local type,
either; the following would also be rejected:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">struct</span> <span class="nc">MyHeader</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="o">..</span><span class="p">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Modifier</span><span class="o">&lt;</span><span class="n">MyHeader</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="o">..</span><span class="w"> </span><span class="p">}</span><span class="w">
</span></span></span></code></pre></div><p>Another trait that couldn&rsquo;t be handled properly is the <code>BorrowFrom</code> trait
in the standard library. There are a number of impls like this one:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">BorrowFrom</span><span class="o">&lt;</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w">
</span></span></span></code></pre></div><p>This impl fails the ordered check because <code>T</code> comes first. We can make
it pass by switching the order of the parameters, so that the
<code>BorrowFrom</code> trait becomes <code>Borrow</code>.</p>
<p>A final &ldquo;near-miss&rdquo; occurred in the standard library with the <code>Cow</code>
type.  Here is an impl from <code>libcollections</code> of <code>FromIterator</code> for a
copy-on-write vector:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">FromIterator</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">Cow</span><span class="o">&lt;</span><span class="na">&#39;a</span><span class="p">,</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">,</span><span class="w"> </span><span class="p">[</span><span class="n">T</span><span class="p">]</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>Note that <code>Vec</code> is a local type here. This impl obeys the ordered
rule, but somewhat by accident. If the type parameters of the <code>Cow</code>
type were in a different order, it would not, because then <code>[T]</code>
would precede <code>Vec&lt;T&gt;</code>.</p>
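<p>To make the ordered rule easier to play with, here is a small sketch of it as a checker over a toy model of types. Everything in it (the <code>Ty</code> enum, the <code>First</code> enum, and both function names) is invented for this illustration and has nothing to do with how the compiler actually represents impls. It also assumes that an impl mentioning no local type at all is rejected.</p>

```rust
// Toy model of a type: hypothetical, for illustration only.
#[allow(dead_code)]
#[derive(Debug)]
enum Ty {
    /// A struct or enum defined in the current crate, with its type arguments.
    Local(&'static str, Vec<Ty>),
    /// A type defined elsewhere (including builtins), with its type arguments.
    External(&'static str, Vec<Ty>),
    /// A type parameter of the impl, such as `T`.
    Param(&'static str),
}

#[derive(Debug, PartialEq)]
enum First {
    Local,
    Param,
}

/// Pre-order walk: reports whether a local type or a type parameter
/// is reached first, or `None` if the type contains neither.
fn first_local_or_param(ty: &Ty) -> Option<First> {
    match ty {
        Ty::Local(..) => Some(First::Local),
        Ty::Param(_) => Some(First::Param),
        Ty::External(_, args) => args.iter().find_map(first_local_or_param),
    }
}

/// `trait_inputs` lists the trait's input types, starting with `Self`.
/// The impl passes if a local type is visited before any type parameter.
fn passes_ordered_rule(trait_inputs: &[Ty]) -> bool {
    trait_inputs.iter().find_map(first_local_or_param) == Some(First::Local)
}
```

<p>Under this model, <code>impl&lt;T&gt; Add&lt;T&gt; for MyBigInt</code> is the list <code>MyBigInt, T</code> and passes, while <code>impl&lt;T&gt; Modifier&lt;Response&gt; for Vec&lt;T&gt;</code> is the list <code>Vec&lt;T&gt;, Response</code> and fails, because the walk reaches <code>T</code> inside <code>Vec</code> before it reaches <code>Response</code>.</p>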
<h3 id="the-covered-rule">The covered rule</h3>
<p>In response to these shortcomings, I proposed an alternative rule that
I&rsquo;ll call the <em>covered</em> rule. The idea of the covered rule was to say
that (1) the impl must have a local type somewhere and (2) a type
parameter can only appear in the impl if the type parameter is
<em>covered</em> by a local type. Covered means that it appears &ldquo;inside&rdquo; the
type: so <code>T</code> is covered by <code>MyVec</code> in the type <code>MyVec&lt;T&gt;</code> or
<code>MyBox&lt;Box&lt;T&gt;&gt;</code>, but not in <code>(T, MyVec&lt;int&gt;)</code>. This rule has the
advantage of having nothing to do with ordering and it has a certain
intuition to it; any type parameters that appear in your impls have to
be tied to something local.</p>
<p>This rule
<a href="https://github.com/rust-lang/rust/issues/19470#issuecomment-66846120">turns out to give us the required orphan rule guarantees</a>. To
see why, consider this example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">A</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="c1">// Crate A
</span></span></span><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Foo</span><span class="o">&lt;</span><span class="n">B</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">U</span><span class="w"> </span><span class="c1">// Crate B
</span></span></span></code></pre></div><p>If you tried to make these two impls apply to the same type, you wind
up with infinite types. After all, <code>T = B&lt;U&gt;</code>, but <code>U=A&lt;T&gt;</code>, and hence
you get <code>T = B&lt;A&lt;T&gt;&gt;</code>.</p>
<p>Unlike the previous rule, this rule happily accepts the <code>BorrowFrom</code>
trait impls:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">BorrowFrom</span><span class="o">&lt;</span><span class="n">Rc</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">T</span><span class="w">
</span></span></span></code></pre></div><p>The reason is that the type parameter <code>T</code> here is covered by the
(local) type <code>Rc</code>.</p>
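<p>Here is a sketch of the covered rule as a checker over a toy model of types. The <code>Ty</code> enum and the function names are invented for this illustration. It encodes one reading of the rule, namely that each type parameter must appear inside a local type <em>somewhere</em> in the impl; that reading is an assumption, chosen because it is the one under which the <code>BorrowFrom</code> impl above is accepted even though <code>T</code> also appears bare as <code>Self</code>.</p>

```rust
use std::collections::BTreeSet;

// Toy model of a type: hypothetical, for illustration only.
#[allow(dead_code)]
#[derive(Debug)]
enum Ty {
    /// A struct or enum defined in the current crate, with its type arguments.
    Local(&'static str, Vec<Ty>),
    /// A type defined elsewhere (including builtins), with its type arguments.
    External(&'static str, Vec<Ty>),
    /// A type parameter of the impl, such as `T`.
    Param(&'static str),
}

/// Records every type parameter in `all`, and every parameter found
/// beneath a local type in `covered`. Returns true if `ty` mentions
/// any local type.
fn walk(
    ty: &Ty,
    under_local: bool,
    all: &mut BTreeSet<&'static str>,
    covered: &mut BTreeSet<&'static str>,
) -> bool {
    match ty {
        Ty::Param(name) => {
            all.insert(*name);
            if under_local {
                covered.insert(*name);
            }
            false
        }
        Ty::Local(_, args) => {
            for arg in args {
                walk(arg, true, all, covered);
            }
            true
        }
        Ty::External(_, args) => {
            let mut saw_local = false;
            for arg in args {
                saw_local |= walk(arg, under_local, all, covered);
            }
            saw_local
        }
    }
}

/// (1) some local type must appear, and (2) every type parameter must
/// be covered by a local type at least once.
fn passes_covered_rule(trait_inputs: &[Ty]) -> bool {
    let mut all = BTreeSet::new();
    let mut covered = BTreeSet::new();
    let mut saw_local = false;
    for ty in trait_inputs {
        saw_local |= walk(ty, false, &mut all, &mut covered);
    }
    saw_local && all.is_subset(&covered)
}
```

<p>In this model the <code>BorrowFrom</code> impl, viewed from the crate that defines <code>Rc</code>, is the list <code>T, Rc&lt;T&gt;</code> and passes. A tuple like <code>(T, MyVec&lt;int&gt;)</code> fails, since <code>T</code> never appears inside a local type.</p>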
<p>However, after implementing this rule, we found out that it actually
prohibits a lot of other useful patterns. The most important of them is
the so-called <em>auxiliary</em> pattern, in which a trait takes a type parameter
that is a kind of &ldquo;configuration&rdquo; and is basically orthogonal to the types
that the trait is implemented for. An example is the <code>Hash</code> trait:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Hash</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyStruct</span><span class="w">
</span></span></span></code></pre></div><p>The type <code>H</code> here represents the hashing function that is being used. As you can imagine,
most types will work with <em>any</em> hashing function. Sadly, this impl is rejected,
because <code>H</code> is not covered by any local type. You could make it work by adding a parameter
<code>H</code> to <code>MyStruct</code>:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Hash</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">MyStruct</span><span class="o">&lt;</span><span class="n">H</span><span class="o">&gt;</span><span class="w">
</span></span></span></code></pre></div><p>But that is very weird, because now when we create our struct we are
also deciding which hash functions can be used with it. You can also
make it work by moving the hash function parameter <code>H</code> to the <code>hash</code>
method itself, but then <em>that</em> is limiting.  It makes the <code>Hash</code> trait
not object safe, for one thing, and it also prohibits us from writing
types that <em>are</em> specialized to particular hash functions.</p>
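<p>For comparison, here is a sketch of the &ldquo;move <code>H</code> to the method&rdquo; option, written against the <code>std::hash::Hash</code> trait as it exists in present-day Rust, which happens to take this shape. The struct fields and the <code>hash_of</code> helper are made up for the example.</p>

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical local type; the fields are invented for this sketch.
struct MyStruct {
    id: u32,
    name: &'static str,
}

// The hasher type `H` is a parameter of the method, not of the trait,
// so the impl header mentions no type parameters at all.
impl Hash for MyStruct {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.hash(state);
        self.name.hash(state);
    }
}

/// Helper (an assumption of this sketch): hash a value with std's default hasher.
fn hash_of(value: &MyStruct) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

<p>The impl header contains only the local type <code>MyStruct</code>, so neither the ordered rule nor the covered rule has anything to object to. The cost is the one described above: the generic method keeps the trait from being object safe, and an impl can no longer single out one particular hash function.</p>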
<p>Another similar example is indexing. Many people want to make types indexable
by any integer-like thing, for example:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rust" data-lang="rust"><span class="line"><span class="cl"><span class="k">impl</span><span class="o">&lt;</span><span class="n">I</span>:<span class="nc">Int</span><span class="p">,</span><span class="w"> </span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="n">Index</span><span class="o">&lt;</span><span class="n">I</span><span class="o">&gt;</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="k">type</span> <span class="nc">Output</span><span class="w"> </span><span class="o">=</span><span class